In this homework we will study the no-intercept model \(y_i=\beta x_i+\epsilon_i\), where the \(\epsilon_i\) are independent \(N(0, \sigma^2)\) errors and \(\sigma\) is known. As data we have

set.seed(123)
x=1:10
y=round(10*x+rnorm(10, 0, 10), 1)  # true beta = 10, sigma = 10
rbind(x, y)
##   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## x  1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0  10.0
## y  4.4 17.7 45.6 40.7 51.3 77.2 74.6 67.3 83.1  95.5

Solve all the problems without using any theorems from the class.

Problem 1

Find the least squares estimator of \(\beta\).

\[ \begin{aligned} &R(\beta) = \sum_{i=1}^n (y_i-\beta x_i)^2\\ &\frac{dR}{d\beta} = \sum_{i=1}^n (-2)(y_i-\beta x_i)x_i\\ &\text{Setting } \frac{dR}{d\beta}=0: \quad (-2)\left(\sum_{i=1}^n x_iy_i- \beta \sum_{i=1}^n x_i^2\right) = 0\\ &\hat{\beta} = \frac{\sum_{i=1}^n x_iy_i}{\sum_{i=1}^n x_i^2} \end{aligned} \] Because \(d^2R/d\beta^2=2\sum_{i=1}^n x_i^2>0\), this is indeed the minimizer.

betahat = sum(x*y)/sum(x^2)
round(betahat, 2)
## [1] 9.93
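
As an extra check (not asked for in the problem), R's built-in lm with the intercept removed fits exactly this model and should reproduce the value above:

fit = lm(y ~ x - 1)   # no-intercept least squares fit
round(coef(fit), 2)   # should agree with betahat = 9.93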

In the following we will often use the abbreviation \(s_{xx}=\sum_{i=1}^n x_i^2\).

Problem 2

Show that \(\hat{\beta}\) is an unbiased estimator of \(\beta\). Find its variance.

\[ \begin{aligned} &E\left[\sum_{j=1}^n x_jy_j\right] = \sum_{j=1}^n x_jE[y_j]=\sum_{j=1}^n x_j\beta x_j=\beta\sum_{j=1}^n x_j^2\\ &E[\hat{\beta}] =E\left[\frac{\sum_{j=1}^n x_jy_j}{\sum_{j=1}^n x_j^2}\right] = \frac{\beta\sum_{j=1}^n x_j^2}{\sum_{j=1}^n x_j^2} = \beta \end{aligned} \]

For the variance, note that the \(y_j\) are independent, so

\[ \begin{aligned} &var(\hat{\beta}) =var\left(\frac{\sum_{j=1}^n x_jy_j}{\sum_{j=1}^n x_j^2}\right) = \\ &\left(\frac{1}{\sum_{j=1}^n x_j^2}\right)^2var\left(\sum_{j=1}^n x_jy_j\right) = \\ &\left(\frac{1}{\sum_{j=1}^n x_j^2}\right)^2\sum_{j=1}^nvar\left( x_jy_j\right) = \\ &\left(\frac{1}{\sum_{j=1}^n x_j^2}\right)^2\sum_{j=1}^n x_j^2var\left( y_j\right) = \\ &\left(\frac{1}{\sum_{j=1}^n x_j^2}\right)^2\sum_{j=1}^n x_j^2\sigma^2 = \\ &\frac{\sigma^2}{\sum_{j=1}^n x_j^2}\end{aligned} \]
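
Let's verify both results with a small simulation, using the true values \(\beta=10\) and \(\sigma=10\) from the data-generating code (just a check, not part of the proof):

# simulate many data sets, estimate beta each time, and compare the
# empirical mean and variance of betahat with beta and sigma^2/s_xx
set.seed(456)
sims = replicate(1e4, {
  ysim = 10*x + rnorm(10, 0, 10)
  sum(x*ysim)/sum(x^2)
})
round(c(mean(sims), var(sims), 10^2/sum(x^2)), 3)
# mean(sims) should be close to 10, var(sims) close to 100/sum(x^2)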

Problem 3

Find the distribution of \(\hat{\beta}\).

\(\hat{\beta}=\sum_{i=1}^n (x_i/s_{xx})\,y_i\) is a linear combination of independent normal random variables, so it is itself normal. We already found the mean and the variance, and so we find

\[\hat{\beta}\sim N\left(\beta , \sigma^2/(\sum_{i=1}^n x_i^2)\right)\]
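
A quick simulation check of the full distribution (again using \(\beta=10\) and \(\sigma=10\) from the data-generating code): the empirical quantiles of simulated \(\hat{\beta}\) values should be close to the quantiles of this normal distribution.

# compare empirical quantiles of simulated betahat values with the
# quantiles of N(10, 10^2/sum(x^2))
set.seed(789)
sims = replicate(1e4, sum(x*(10*x + rnorm(10, 0, 10)))/sum(x^2))
probs = c(0.1, 0.25, 0.5, 0.75, 0.9)
round(rbind(empirical   = quantile(sims, probs),
            theoretical = qnorm(probs, 10, 10/sqrt(sum(x^2)))), 3)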

Problem 4

Find a \(90\%\) confidence interval for \(\beta\).

Using Problem 3, \((\hat{\beta}-\beta)\big/\left(\sigma/\sqrt{s_{xx}}\right)\sim N(0,1)\), and so a \((1-\alpha)100\%\) confidence interval for \(\beta\) is

\[\hat{\beta}\pm z_{\alpha/2}\sigma/\sqrt{\sum_{i=1}^n x_i^2}\]

From the R code that generated the data we know \(\sigma=10\). Here \(\alpha=0.10\), so the critical value \(z_{0.05}\) is found with qnorm(0.95):

betahat = sum(x*y)/sum(x^2)
round(betahat+c(-1, 1)*qnorm(0.95)*10*sqrt(1/sum(x^2)), 2)
## [1]  9.09 10.77
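
As a further check (my own, not required), we can estimate the actual coverage of this interval by simulation; with \(\beta=10\) and \(\sigma=10\) it should come out close to 90%:

# estimate the coverage probability of the 90% confidence interval
set.seed(111)
covered = replicate(1e4, {
  ysim = 10*x + rnorm(10, 0, 10)
  bh = sum(x*ysim)/sum(x^2)
  halfwidth = qnorm(0.95)*10/sqrt(sum(x^2))
  bh - halfwidth < 10  &  10 < bh + halfwidth
})
mean(covered)   # should be close to 0.90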

Problem 5

Find the maximum likelihood estimator of \(\beta\).

\[ \begin{aligned} &f(\pmb{y}\vert\beta) = (2\pi\sigma^2)^{-n/2}\exp \left\{ -\frac1{2\sigma^2} \sum_{i=1}^n \left(y_i-\beta x_i\right)^2 \right\} \\ &l(\beta\vert\pmb{y}) = -\frac{n}{2}\log(2\pi\sigma^2)-\frac1{2\sigma^2} \sum_{i=1}^n \left(y_i-\beta x_i\right)^2 \end{aligned} \] Since \(\sigma\) is known, maximizing \(l(\beta\vert\pmb{y})\) over \(\beta\) is the same as minimizing \(\sum_{i=1}^n (y_i-\beta x_i)^2=R(\beta)\), which by Problem 1 is achieved at \(\hat{\beta}\). So the maximum likelihood estimator and the least squares estimator are the same.
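
A numerical check (not part of the solution): maximizing the log-likelihood directly with optimize should reproduce the least squares estimate.

# log-likelihood as a function of beta, with sigma = 10 known and n = 10
loglik = function(beta) -10/2*log(2*pi*100) - sum((y - beta*x)^2)/(2*100)
# maximize numerically over an interval that safely contains the estimate
round(optimize(loglik, interval = c(0, 20), maximum = TRUE)$maximum, 2)
# should agree with betahat = 9.93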

Problem 6

Now assume \(\sigma\) is not known. Show that

\[s^2=\frac{1}{n-1}\sum_{i=1}^n (y_i-\hat{\beta}x_i)^2\] is an unbiased estimator of \(\sigma^2\).

First notice that

\[ \begin{aligned} &E[y_i^2] =var(y_i)+E[y_i]^2=\sigma^2+(\beta x_i)^2 \\ &E[y_iy_j] =cov(y_i,y_j)+E[y_i]E[y_j]=0+(\beta x_i)(\beta x_j);\; i\ne j \text{ (by independence)} \\ &E[y_iy_j] = \sigma^2I(i=j)+\beta^2x_ix_j \end{aligned} \] Let \(t=\sum_i x_i^2\) (the \(s_{xx}\) from above), then

\[ \begin{aligned} &E\left[\left(y_i-[\sum_j x_jy_j/t]x_i\right)^2\right] = \\ &E\left[y_i^2-2y_i[\sum_j x_jy_j/t]x_i+([\sum_j x_jy_j/t]x_i)^2\right] = \\ &E\left[y_i^2-2x_i/t\sum_j x_jy_iy_j+x_i^2/t^2\sum_{j,k} (x_jy_j)(x_ky_k) \right] = \\ &E[y_i^2]-2x_i/t\sum_j x_jE[y_iy_j]+x_i^2/t^2\sum_{j,k} x_jx_kE[y_jy_k] = \\ &(\sigma^2+\beta^2x_i^2) -2x_i/t\sum_{j} x_j\left(\sigma^2I(i=j)+\beta^2x_ix_j\right) +\\ &+x_i^2/t^2\sum_{j,k} x_jx_k\left(\sigma^2I(j=k)+\beta^2x_jx_k\right) = \\ &\text{ }\\ &\sigma^2+\beta^2x_i^2 -2x_i/t (\sigma^2x_i+\beta^2x_i\sum_{j}x_j^2) +\\ &+x_i^2/t^2 (\sigma^2\sum_{j}x_j^2+\beta^2\sum_{j,k}x_j^2x_k^2) = \\ &\text{ }\\ &\sigma^2+\beta^2x_i^2 -2\sigma^2 x_i^2/t-2\beta^2x_i^2 +x_i^2/t^2 (\sigma^2t+\beta^2t^2) = \\ &\sigma^2+\beta^2x_i^2 -2\sigma^2 x_i^2/t-2\beta^2x_i^2 + \sigma^2x_i^2/t+\beta^2x_i^2 \end{aligned} \]

\[ \begin{aligned} &E[(n-1)s^2] = E[\sum_i(y_i-\sum_j x_jy_j/t)^2] =\\ &\sum_i E[(y_i-\sum_j x_jy_j/t)^2] = \\ &\sum_i \left(\sigma^2+\beta^2x_i^2 -2\sigma^2 x_i^2/t-2\beta^2x_i^2 + \sigma^2x_i^2/t+\beta^2x_i^2\right) = \\ &n\sigma^2+\beta^2\sum_i x_i^2 -2\sigma^2 \sum_i x_i^2/t-2\beta^2\sum_ix_i^2 + \sigma^2\sum_ix_i^2/t+\beta^2\sum_ix_i^2 =\\ &n\sigma^2+\beta^2t -2\sigma^2 -2\beta^2t+ \sigma^2+\beta^2t =(n-1)\sigma^2\\ \end{aligned} \]
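
Again, a quick simulation check (using \(\beta=10\) and \(\sigma=10\) from the data-generating code):

# check that s^2 with divisor n-1 is unbiased for sigma^2 = 100
set.seed(222)
s2 = replicate(1e4, {
  ysim = 10*x + rnorm(10, 0, 10)
  bh = sum(x*ysim)/sum(x^2)
  sum((ysim - bh*x)^2)/(10 - 1)
})
round(mean(s2), 1)   # should be close to 100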


Here is another solution, this one using a theorem from class. For \(i=1,\dots,n\) define the vectors

\[\pmb{a}_i = \begin{pmatrix} -x_1x_i/t & ... &1-x_i^2/t& ... &-x_nx_i/t \end{pmatrix}\]

Now \(\pmb{a_i'y} = y_i - x_i\sum_{j=1}^n x_jy_j/t = y_i-\hat{\beta}x_i\), and therefore

\[\left(y_i- \hat{\beta}x_i\right)^2 = (\pmb{a_i'y})^2=\pmb{y'a_ia_i'y}\] so

\[E[(n-1)s^2]=E\left[\sum_{i=1}^n (y_i-\hat{\beta}x_i)^2\right]= E\left[\sum_{i=1}^n \pmb{y'a_ia_i'y}\right]\] and by theorem (5.3.3), with \(E[\pmb{y}]=\beta\pmb{x}\) and \(\Sigma=var(\pmb{y})=\sigma^2\pmb{I}\),

\[ \begin{aligned} &E[\sum_i(y_i- \hat{\beta}x_i)^2] = \\ &E\left[\pmb{y}'\left(\sum_{i=1}^n\pmb{a_ia'_i}\right)\pmb{y}\right] = \\ &tr(\left(\sum_{i=1}^n\pmb{a_ia'_i}\right)\Sigma) + (\beta\pmb{x})'\left(\sum_{i=1}^n\pmb{a_ia'_i}\right)(\beta\pmb{x}) = \\ &\sigma^2tr\left(\sum_{i=1}^n\pmb{a_ia'_i}\right) + \beta^2\pmb{x}'\left(\sum_{i=1}^n\pmb{a_ia'_i}\right)\pmb{x} \end{aligned} \]

Now let \(\pmb{A}=\sum_i \pmb{a_ia_i'}\). Its entries can be computed directly:

\[ \begin{aligned} &\pmb{a_ia_i'}= \begin{pmatrix} -x_1x_i/t \\ ... \\1-x_i^2/t\\ ... \\-x_nx_i/t \end{pmatrix} \begin{pmatrix} -x_1x_i/t & ... &1-x_i^2/t& ... &-x_nx_i/t \end{pmatrix}\\ &(\pmb{a_ia_i'})_{ii} =(1-x_i^2/t)^2\\ &(\pmb{a_ia_i'})_{jj} =x_i^2x_j^2/t^2;j\ne i\\ &(\pmb{a_ia_i'})_{ij} =-x_ix_j/t(1-x_i^2/t);j\ne i\\ &(\pmb{a_ia_i'})_{jk} =x_i^2x_jx_k/t^2;j,k\ne i\\ &\text{ }\\ &A_{ii}=(1-x_i^2/t)^2+\sum_{j\ne i} x_i^2x_j^2/t^2 = \\ &(1-x_i^2/t)^2+ x_i^2(\sum_{j} x_j^2-x_i^2)/t^2 = \\ &(1-x_i^2/t)^2+ x_i^2(t-x_i^2)/t^2 = \\ &1-2x_i^2/t+x_i^4/t^2+ x_i^2/t-x_i^4/t^2 = 1-x^2_i/t\\ \end{aligned} \]

and so

\[tr(\pmb{A}) = \sum_i (1-x^2_i/t) = n-1\]

For the second part we need to show that \(\pmb{x}'\left(\sum_i\pmb{a_ia'_i}\right)\pmb{x} =0\). First find the off-diagonal entries \(A_{jk}\) for \(j\ne k\):

\[ \begin{aligned} &A_{jk}=(\pmb{a_ja_j'})_{jk}+(\pmb{a_ka_k'})_{jk}+\sum_{i\ne j,k}(\pmb{a_ia_i'})_{jk} = \\ &-x_jx_k/t(1-x_j^2/t)-x_kx_j/t(1-x_k^2/t)+\sum_{i\ne j,k} x_i^2x_jx_k/t^2 = \\ &-x_jx_k/t(2-x_j^2/t-x_k^2/t)+x_jx_k/t^2(\sum_{i} x_i^2-x_j^2-x_k^2) = \\ &-2x_jx_k/t+x_j^3x_k/t^2+x_jx_k^3/t^2+x_jx_k/t^2(t-x_j^2-x_k^2) = \\ &-2x_jx_k/t+x_j^3x_k/t^2+x_jx_k^3/t^2+x_jx_k/t-x_j^3x_k/t^2-x_jx_k^3/t^2 = \\ &-x_jx_k/t \end{aligned} \]

and so

\[ \begin{aligned} &\pmb{x'Ax} = \sum_{i,j} A_{ij}x_ix_j =\\ &\sum_i A_{ii}x_i^2 + \sum_{i\ne j} A_{ij}x_ix_j = \\ &\sum_i (1-x_i^2/t)x_i^2 + \sum_{i\ne j} (-x_ix_j)/tx_ix_j = \\ &\sum_i (x_i^2-x_i^4/t) - (1/t) \sum_{i\ne j} x_i^2x_j^2 = \\ &t - (\sum_i x_i^4)/t - (1/t) (\sum_{i,j} x_i^2x_j^2 -\sum_i x_i^4) = \\ &t - (1/t) (\sum_{i} x_i^2)^2 = t-t=0\\ \end{aligned} \]

Let’s just do a check with R:

es = function(x) {
   # builds A = sum_i a_i a_i' explicitly and returns tr(A) and x'Ax,
   # which should be n-1 and 0
   n = length(x)
   sxx = sum(x^2)   # the quantity t in the notation above
   x = cbind(x)     # turn x into a column vector
   A = matrix(0, n, n)
   for(i in 1:n) {
     # a_i: 1 - x_i^2/t in position i, -x_j*x_i/t elsewhere
     a = cbind(ifelse(1:n == i, 1, 0)) - x[i, 1]*x/sxx
     A = A + a %*% t(a)
   }
   c(sum(diag(A)), round(t(x) %*% A %*% x, 4))
}
es(sort(runif(10, 0, 10)))
## [1] 9 0