Consider the following data set, with the least squares regression line:
x = seq(0, 1, by = 0.1)
y = c(2.06, 2.12, 2.32, 2.02, 2.76, 3.04, 2.83, 3.15, 3.36, 3.68, 3.96)
In a typical least squares regression we define the distance between a data point and the line as the vertical distance (red), and then we minimize the sum of the squared distances. In geometry, however, one defines the distance between a point and a line via the perpendicular (blue). For the data set above, find the best “least squares perpendicular line”.
Say we have a point \((x_1,y_1)\) and a line \(y=a+bx\). The line through \((x_1,y_1)\) with equation \(y-y_1=-\frac1{b}(x-x_1)\) is perpendicular to it. Any point on the original line has the form \((u, a+bu)\); at the foot of the perpendicular both equations hold, so
\[ \begin{aligned} a+bu-y_1 &= -\frac1{b}(u-x_1) \\ -ab-b^2u+by_1 &= u-x_1 \\ -ab+by_1+x_1 &= (1+b^2)u \\ u &= \frac{x_1+by_1-ab}{1+b^2} \end{aligned} \]
Writing \(u_i\) for the foot of the perpendicular from the data point \((x_i,y_i)\), the criterion to minimize is
\[ \text{RSS} = \sum_{i=1}^n \left(x_i-u_i\right)^2+\left(y_i-a-bu_i\right)^2 \]
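As a quick sanity check of the formula for \(u\): the segment from \((x_1,y_1)\) to the foot of the perpendicular \((u, a+bu)\) should be orthogonal to the direction vector \((1,b)\) of the line. A minimal check with made-up values:
a = 1; b = 2; x1 = 0.5; y1 = 3
u = (x1 + b*y1 - a*b)/(1 + b^2)
# dot product of the segment with the line's direction (1, b): should be 0
sum(c(u - x1, a + b*u - y1) * c(1, b))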
Setting the partial derivatives with respect to \(a\) and \(b\) equal to zero leads to a non-linear system of equations, so we use a numerical method:
perp.lm = function(x, y) {
  RSS = function(beta) {
    # feet of the perpendiculars onto the line y = beta[1] + beta[2]*x
    u = (x + beta[2]*y - beta[1]*beta[2])/(1 + beta[2]^2)
    # sum of squared perpendicular distances
    sum((x - u)^2 + (y - beta[1] - beta[2]*u)^2)
  }
  lsq = coef(lm(y ~ x))  # least squares fit as the starting value
  optim(lsq, RSS)$par
}
plot(x,y)
abline(coef(lm(y~x)), col="red")
abline(perp.lm(x,y), col="blue")
round(coef(lm(y~x)), 3)
## (Intercept) x
## 1.882 1.926
round(perp.lm(x,y), 3)
## (Intercept) x
## 1.814 2.062
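As an aside, this “perpendicular least squares” problem is the classical orthogonal regression (total least squares) problem, and its solution can also be read off from the first principal component of the centered data. A minimal cross-check in base R:
# orthogonal regression via the leading eigenvector of the covariance matrix
v = eigen(cov(cbind(x, y)))$vectors[, 1]
b = v[2]/v[1]            # slope from the first principal component direction
a = mean(y) - b*mean(x)  # the fitted line passes through the centroid
round(c(a, b), 3)        # should agree with perp.lm(x, y) above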
An alternative to least squares is least absolute deviations regression. Here one uses the sum of absolute deviations as the criterion
\[\text{SAD}=\sum_{i=1}^n \vert y_i-a-bx_i\vert\]
Again find the corresponding line for the data set above. Make an argument in favor of this solution.
As before, this needs to be done numerically. Note that the absolute value is not differentiable at zero, so a derivative-free method such as Nelder-Mead, the default in optim, is a natural choice:
lav.lm = function(x, y) {
  SAD = function(beta) {
    # sum of absolute deviations from the line y = beta[1] + beta[2]*x
    sum(abs(y - beta[1] - beta[2]*x))
  }
  lsq = coef(lm(y ~ x))  # least squares fit as the starting value
  optim(lsq, SAD)$par
}
plot(x,y)
abline(coef(lm(y~x)), col="red")
abline(lav.lm(x,y), col="blue")
Because of the absolute value, these lines are less affected by outliers: an outlier contributes its absolute residual to the criterion rather than the much larger squared residual. To illustrate, add an outlier at \((0, 4)\):
x = c(x, 0)  # add an outlier at (0, 4)
y = c(y, 4)
plot(x,y)
abline(coef(lm(y~x)), col="red")
abline(lav.lm(x,y), col="blue")
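To quantify the effect we can compare the fitted coefficients on the augmented data directly:
round(coef(lm(y ~ x)), 3)  # least squares: pulled strongly toward the outlier
round(lav.lm(x, y), 3)     # least absolute deviations: changes only slightly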
There is, however, a better approach: one can show that this minimization problem is equivalent to a linear programming problem, which can be solved exactly with the simplex algorithm.
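One standard formulation (sketched here with auxiliary variables \(e_i^+, e_i^- \ge 0\) for the positive and negative parts of the residuals) is
\[ \begin{aligned} \text{minimize } \; & \sum_{i=1}^n \left(e_i^+ + e_i^-\right) \\ \text{subject to } \; & y_i = a+bx_i+e_i^+-e_i^-, \qquad e_i^+, e_i^- \ge 0 \end{aligned} \]
At the optimum at most one of \(e_i^+, e_i^-\) is non-zero, so \(e_i^+ + e_i^- = \vert y_i-a-bx_i\vert\). This is implemented in the L1pack package: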
library(L1pack)
coef(lad(y~x))
## (Intercept) x
## 1.931429 1.942857
lav.lm(x,y)
## (Intercept) x
## 1.931429 1.942857