Alternatives to Least Squares Regression

Consider the following data set, with the least squares regression line:

x = seq(0, 1, by = 0.1)
y = c(2.06, 2.12, 2.32, 2.02, 2.76, 3.04, 2.83, 3.15, 3.36, 3.68, 3.96)

Problem 1: Orthogonal Regression

In a typical least squares regression we define the distance between a data point and the line as the vertical distance (red), and then minimize the sum of the squared distances. In geometry, however, the distance between a point and a line is measured along the perpendicular (blue). For the data set above, find the best “least squares perpendicular line”.

Say we have a point \((x_1,y_1)\) and a line \(y=a+bx\). The line through \((x_1,y_1)\) perpendicular to it has slope \(-\frac1{b}\), so its equation is \(y-y_1=-\frac1{b}(x-x_1)\). Any point on the original line has the form \((u, a+bu)\), so at the intersection we have

\[ \begin{aligned} a+bu-y_1&=-\frac1{b}(u-x_1) \\ -ab-b^2u+by_1&=u-x_1 \\ x_1+by_1-ab&=(1+b^2)u \\ u&=\frac{x_1+by_1-ab}{1+b^2} \end{aligned} \]

Writing \(u_i\) for this value of \(u\) at the data point \((x_i,y_i)\), the criterion becomes

\[ \text{RSS} = \sum_{i=1}^n \left(x_i-u_i\right)^2+\left(y_i-a-bu_i\right)^2 \]
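As an aside, substituting the value of \(u\) computed from \((x_i,y_i)\) (call it \(u_i\)) back into each term recovers the familiar point-to-line distance formula, so the criterion is the ordinary residual sum of squares scaled by \(1/(1+b^2)\):

\[ \left(x_i-u_i\right)^2+\left(y_i-a-bu_i\right)^2=\frac{(y_i-a-bx_i)^2}{1+b^2}, \qquad \text{so} \qquad \text{RSS}=\frac1{1+b^2}\sum_{i=1}^n (y_i-a-bx_i)^2. \]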

Finding the derivatives would lead to a non-linear system of equations, so we need to use a numerical method:

# fit the "perpendicular least squares" line numerically,
# starting the search at the ordinary least squares estimates
perp.lm = function(x, y) {
   RSS = function(beta) {
      # u: x-coordinate of the foot of the perpendicular for each point
      u = (x + beta[2]*y - beta[1]*beta[2])/(1 + beta[2]^2)
      sum((x - u)^2 + (y - beta[1] - beta[2]*u)^2)
   }
   lsq = coef(lm(y ~ x))
   optim(lsq, RSS)$par
}
plot(x,y)
abline(coef(lm(y~x)), col="red")
abline(perp.lm(x,y), col="blue")

round(coef(lm(y~x)), 3)
## (Intercept)           x 
##       1.882       1.926
round(perp.lm(x,y), 3)
## (Intercept)           x 
##       1.814       2.062
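As a sketch of an independent check (not part of the original solution): orthogonal regression is also known as total least squares, and its line follows the first principal axis of the data, which gives a closed form for the slope via the covariance matrix. Assuming the same data vectors x and y as above:

```r
# closed-form orthogonal (total least squares) fit via the
# first principal axis of the data
x = seq(0, 1, by = 0.1)
y = c(2.06, 2.12, 2.32, 2.02, 2.76, 3.04, 2.83, 3.15, 3.36, 3.68, 3.96)
v = eigen(cov(cbind(x, y)))$vectors[, 1]  # direction of largest variance
b = v[2]/v[1]                             # slope of the principal axis
a = mean(y) - b*mean(x)                   # the line passes through the centroid
round(c(a, b), 3)
## [1] 1.814 2.062
```

This agrees with the numerical answer from perp.lm, which is a useful sanity check on the optimization.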

Problem 2: Least Absolute Values Regression

Here one uses the criterion

\[\text{RSS}=\sum_{i=1}^n \vert y_i-a-bx_i\vert\]

Again find the corresponding line for the data set above. Make an argument in favor of this solution.

As before, there is no closed-form solution, so this needs to be done numerically.

# fit the least absolute values line numerically,
# starting the search at the ordinary least squares estimates
lav.lm = function(x, y) {
   RSS = function(beta) {
      sum(abs(y - beta[1] - beta[2]*x))
   }
   lsq = coef(lm(y ~ x))
   optim(lsq, RSS)$par
}
plot(x,y)
abline(coef(lm(y~x)), col="red")
abline(lav.lm(x,y), col="blue")

Because the absolute value penalizes large residuals only linearly rather than quadratically, these lines are less affected by outliers. To illustrate, add the point \((0, 4)\) to the data set:

x=c(x,0)
y=c(y, 4)
plot(x,y)
abline(coef(lm(y~x)), col="red")
abline(lav.lm(x,y), col="blue")

There is, however, a better approach: it is possible to show that this minimization problem is equivalent to a linear programming problem, which can be solved with the simplex algorithm. This is implemented in the L1pack package:

library(L1pack)
coef(lad(y~x))
## (Intercept)           x 
##    1.931429    1.942857
lav.lm(x,y)
## (Intercept)           x 
##    1.931429    1.942857
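For reference, the linear program behind this approach can be written down by splitting each residual into nonnegative parts \(e_i^+\) and \(e_i^-\) (a standard reformulation, sketched here rather than taken from the text above):

\[ \min_{a,\,b,\,e^+,\,e^-}\ \sum_{i=1}^n \left(e_i^+ + e_i^-\right) \quad \text{subject to} \quad y_i = a+bx_i+e_i^+-e_i^-, \qquad e_i^+\ge 0,\ e_i^-\ge 0. \]

At the optimum, at most one of \(e_i^+\) and \(e_i^-\) is positive, so their sum equals \(\vert y_i-a-bx_i\vert\) and the objective matches the criterion above.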