Problem 1

Let \(X\sim N(1,1)\), \(Y\sim N(2,1)\), \(Z\sim N(3,1)\), all independent. Find

\[P\left((X+Z)^2/2+Y^2<10\right)\]

Note that if \(\pmb{X}=(X,Y,Z)'\) and \(\pmb{A}=\frac12\begin{pmatrix} 1&0&1\\0&2&0\\1&0&1 \end{pmatrix}\), then

\[ \begin{aligned} &\pmb{X'AX} = \begin{pmatrix} x&y&z \end{pmatrix} \frac12\begin{pmatrix} 1&0&1\\0&2&0\\1&0&1 \end{pmatrix} \begin{pmatrix} x\\y \\z \end{pmatrix}=\\ &\frac12\begin{pmatrix} x&y&z \end{pmatrix} \begin{pmatrix} x+z\\2y\\x+z \end{pmatrix}=\\ &\frac12(x^2+2xz+2y^2+z^2)=\frac12(x+z)^2+y^2 \end{aligned} \]

Also note that \(\pmb{A}\) is symmetric and idempotent. Therefore by corollary (5.5.3) \(\pmb{X'AX}\sim\chi^2(r, \lambda)\), where \(r=2\) is the rank of \(\pmb{A}\) and

\[\lambda = \pmb{\mu'A\mu}=(1+3)^2/2+2^2=12\]

where \(\pmb{\mu}=(1,2,3)'\) is the mean vector (this is the non-centrality parameter in the parametrization used by R's pchisq)

and so

\[P\left((X+Z)^2/2+Y^2<10\right)=P(\pmb{X'AX}<10)\]
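As a quick numerical check of these claims (a minimal sketch, with \(\pmb{A}\) as defined above) we can verify in R that \(\pmb{A}\) is symmetric, idempotent and has rank 2:

A=matrix(c(1,0,1, 0,2,0, 1,0,1), 3, 3)/2
all(A==t(A))                # symmetric
all(round(A%*%A-A, 10)==0)  # idempotent
c(qr(A)$rank, sum(diag(A))) # rank = trace = 2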

lambda=(1+3)^2/2+2^2
lambda
## [1] 12
round(pchisq(10, 2, lambda), 4)
## [1] 0.3245

Let’s check:

B=1e5
mu=c(1, 2, 3)
x=rnorm(B, mu[1])
y=rnorm(B, mu[2])
z=rnorm(B, mu[3])
lambda=(mu[1]+mu[3])^2/2+mu[2]^2
u=(x+z)^2/2+y^2
round(c(sum(u<10)/B, pchisq(10, 2, lambda)),4)
## [1] 0.3263 0.3245
hist(u, 100, freq=FALSE)
curve(dchisq(x, 2, lambda), 0, 40,lwd=2, add=TRUE)

Problem 2

The data set highways in Resma3 has data on the rate of accidents on a number of highways, together with a list of predictor variables. These are:

rate: accidents per million vehicle miles
len: length of the segment in miles
adt: average daily traffic count in thousands
trks: truck volume as a percent of total traffic
slim: speed limit
lwid: lane width in feet
shld: width in feet of outer shoulder
itg: number of freeway-type interchanges per mile
sigs: number of signalized interchanges per mile
acpt: number of access points per mile
lane: total number of traffic lanes in both directions
fai: 1 if federal aid interstate highway, 0 otherwise
pa: 1 if principal arterial highway, 0 otherwise
ma: 1 if major arterial highway, 0 otherwise

Some of these predictors are indicator variables; we ignore this issue here.

Do all the calculations without using the lm command (it is used below only to check the results).

  1. Find the least squares estimators of the coefficients
y=cbind(highways[, 1])
X=cbind(1, as.matrix(highways[, -1]))
betahat=solve(t(X)%*%X)%*%t(X)%*%y
round(c(betahat), 4)
##  [1] 13.6582 -0.0648 -0.0040 -0.1002 -0.1238 -0.1338  0.0141 -0.4755  0.7136
## [10]  0.0666  0.0267  0.5436 -1.0098 -0.5480
round(coef(lm(rate~., data=highways)), 4)
## (Intercept)         len         adt        trks        slim        lwid 
##     13.6582     -0.0648     -0.0040     -0.1002     -0.1238     -0.1338 
##        shld         itg        sigs        acpt        lane         fai 
##      0.0141     -0.4755      0.7136      0.0666      0.0267      0.5436 
##          pa          ma 
##     -1.0098     -0.5480
  2. Find the \(R^2\)
yhat=X%*%betahat
cor(y,yhat)^2
##          [,1]
## [1,] 0.760527
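Alternatively, since the model has an intercept, \(R^2\) can also be computed as \(1-\text{SSE}/\text{SST}\), which should give the same number:

sst=sum((y-mean(y))^2)
1-sum((y-yhat)^2)/sst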
  3. Find an estimate of \(\sigma^2\)

\[s^2=\frac{\text{SSE}}{n-k-1}\]

n=nrow(X)
k=ncol(X)-1
sse=sum((y-yhat)^2)
round(sse/(n-k-1), 4)
## [1] 1.4357
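As a check (using lm only to verify the result) this should agree with the squared residual standard error reported by summary:

fit=lm(rate~., data=highways)
round(summary(fit)$sigma^2, 4)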
  4. Test at the 5% level whether the 8 predictors that have the lowest correlation with rate can be eliminated from the model.
cors=cor(highways)[, 1][-1]
sort(abs(cors), decreasing =TRUE)
##        acpt        slim        sigs        trks         len        shld 
## 0.752025471 0.680983625 0.564479507 0.512522209 0.465289581 0.386907190 
##          ma         fai          pa        lane         adt         itg 
## 0.337848301 0.207610359 0.161539823 0.032978589 0.028569805 0.024840883 
##        lwid 
## 0.005619311
X=X[, c(1, order(abs(cors), decreasing =TRUE)+1)]
X1=X[, 1:6]
colnames(X)
##  [1] ""     "acpt" "slim" "sigs" "trks" "len"  "shld" "ma"   "fai"  "pa"  
## [11] "lane" "adt"  "itg"  "lwid"
colnames(X1)
## [1] ""     "acpt" "slim" "sigs" "trks" "len"
H=X%*%solve(t(X)%*%X)%*%t(X)
H1=X1%*%solve(t(X1)%*%X1)%*%t(X1)
h=ncol(X)-ncol(X1)
num=t(y)%*%(H-H1)%*%y/h
denom=t(y)%*%(diag(nrow(H))-H)%*%y/(n-k-1)
FTS=num/denom
round(c(FTS,  1-pf(FTS, h, n-k-1)), 4)
## [1] 0.4383 0.8865

and so we fail to reject the null hypothesis; it seems those 8 predictors are not needed in the model.
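We can verify this with lm and anova, fitting the reduced model with the 5 retained predictors and comparing it to the full model; the F statistic and p-value should match the ones computed above:

fit1=lm(rate~acpt+slim+sigs+trks+len, data=highways)
fit=lm(rate~., data=highways)
anova(fit1, fit)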

Problem 3

Prove theorem (6.5.1).

We have \(\pmb{y=X\beta+\epsilon}\), \(E[\pmb{y}]=\pmb{X\beta}\) and \(cov(\pmb{y})=\sigma^2\pmb{V}\).

\(\pmb{V}\) is assumed to be positive definite, that is, it is symmetric and has a square root \(\pmb{V^{1/2}}\) as well as an inverse. So consider the model

\[\pmb{z=U\beta+\tau}\] where \(\pmb{z=V^{-1/2}y}\), \(\pmb{U=V^{-1/2}X}\) and \(\pmb{\tau=V^{-1/2}\epsilon}\).

By theorem (5.1.10) we have

\[ \begin{aligned} &cov(\pmb{z}) = \\ &cov(\pmb{V^{-1/2}y}) = \\ &\pmb{V^{-1/2}}cov(\pmb{y})\pmb{(V^{-1/2})'} = \\ &\pmb{V^{-1/2}}\sigma^2\pmb{V(V^{-1/2})'} =\\ &\sigma^2\pmb{V^{-1/2}}\pmb{V}\pmb{V^{-1/2}} =\\ &\sigma^2\pmb{I} \end{aligned} \]

since \(\pmb{V^{-1/2}}\) is symmetric,

so now we have a regular regression model with \(E[\pmb{z}]=\pmb{U\beta}\) and \(cov(\pmb{z})=\sigma^2\pmb{I}\), and according to theorem (6.2.2)

\[ \begin{aligned} &\hat{\pmb{\beta}} = (\pmb{U'U})^{-1}\pmb{U'z} = \\ &\left(\pmb{(V^{-1/2}X)'V^{-1/2}X}\right)^{-1}\pmb{(V^{-1/2}X)'(V^{-1/2}y)} =\\ &\left(X'\pmb{V^{-1}X}\right)^{-1}\pmb{X'V^{-1}y} \end{aligned} \]

Again using theorem (5.1.10) we have

\[ \begin{aligned} &cov(\hat{\pmb{\beta}}) = cov(\left(X'\pmb{V^{-1}X}\right)^{-1}\pmb{X'V^{-1}y}) = \\ &\left[\left(X'\pmb{V^{-1}X}\right)^{-1}\pmb{X'V^{-1}}\right]\sigma^2\pmb{V}\left[\left(X'\pmb{V^{-1}X}\right)^{-1}\pmb{X'V^{-1}}\right]' = \\ &\sigma^2\left(X'\pmb{V^{-1}X}\right)^{-1}\pmb{X'V^{-1}VV^{-1}X\left(X'\pmb{V^{-1}X}\right)^{-1}} = \\ &\sigma^2\left(X'\pmb{V^{-1}X}\right)^{-1} \end{aligned} \]

\[ \begin{aligned} &E\left[(n-k-1)s^2\right] = \\ &E\left[\pmb{(y-X\hat{\beta})'V^{-1}(y-X\hat{\beta})}\right] = \\ &E\left[\pmb{(y-X\hat{\beta})'V^{-1/2}V^{-1/2}(y-X\hat{\beta})}\right] = \\ &E\left[\pmb{(V^{-1/2}y-V^{-1/2}X\hat{\beta})'(V^{-1/2}y-V^{-1/2}X\hat{\beta})}\right] = \\ &E\left[\pmb{(z-U\hat{\beta})'(z-U\hat{\beta})}\right] = (n-k-1)\sigma^2\\ \end{aligned} \]

by theorem (6.2.13), and therefore \(E[s^2]=\sigma^2\).
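Here is a small numerical illustration of the theorem (a sketch only; the covariance matrix \(\pmb{V}\) below is just an arbitrary example): the estimator \(\left(\pmb{X'V^{-1}X}\right)^{-1}\pmb{X'V^{-1}y}\) agrees with ordinary least squares applied to the transformed model \(\pmb{z=U\beta+\tau}\).

set.seed(111)
n=50
x=runif(n)
X=cbind(1, x)
V=0.5^abs(outer(1:n, 1:n, "-"))  # an example covariance structure, for illustration only
e=eigen(V)
Vmhalf=e$vectors%*%diag(1/sqrt(e$values))%*%t(e$vectors)  # V^{-1/2}
y=X%*%c(1, 2)+t(chol(V))%*%rnorm(n)  # errors with covariance matrix V
betagls=solve(t(X)%*%solve(V)%*%X)%*%t(X)%*%solve(V)%*%y  # GLS estimator directly
z=Vmhalf%*%y
U=Vmhalf%*%X
betaz=solve(t(U)%*%U)%*%t(U)%*%z  # OLS on the transformed model
round(cbind(betagls, betaz), 4)

The two columns should be identical (up to rounding).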

Problem 4

Say we have a model of the form \(y_i = \alpha x_i+\beta z_i+\epsilon_i\), that is a no-intercept model with two predictor variables. Find the least squares estimators of \(\alpha\) and \(\beta\).

  a. directly

\[ \begin{aligned} &R(\alpha,\beta) =\sum_{i=1}^n \left(y_i-\alpha x_i-\beta z_i\right)^2 \\ &\frac{d R}{d\alpha} = (-2)\sum_{i=1}^n \left(y_i-\alpha x_i-\beta z_i\right)x_i=\\ &(-2n)\left(\overline{xy}-\alpha \overline{x^2}-\beta \overline{xz}\right) = 0\\ &\frac{d R}{d\beta} =(-2n)\left(\overline{zy}-\alpha \overline{xz}-\beta \overline{z^2}\right) = 0\\ &\alpha \overline{x^2}+\beta \overline{xz} = \overline{xy}\\ &\alpha \overline{xz}+\beta \overline{z^2} = \overline{zy}\\ &\beta\left(\overline{xz}^2-\overline{x^2}\times\overline{z^2}\right)=\overline{xy}\times\overline{xz}-\overline{zy}\times\overline{x^2}\\ &\hat{\beta}=\frac{\overline{xy}\times\overline{xz}-\overline{zy}\times\overline{x^2}}{\overline{xz}^2-\overline{x^2}\times\overline{z^2}}\\ &\alpha\left(\overline{x^2}\times\overline{z^2}-\overline{xz}^2\right)=\overline{xy}\times\overline{z^2}-\overline{zy}\times\overline{xz}\\ &\hat{\alpha}=\frac{\overline{zy}\times\overline{xz}-\overline{xy}\times\overline{z^2}}{\overline{xz}^2-\overline{x^2}\times\overline{z^2}} \end{aligned} \]

  b. using the theorems from class.

\[ \pmb{X} = \begin{pmatrix} x_1 & z_1 \\ \vdots & \vdots \\ x_n & z_n \\ \end{pmatrix} \]

\[ \pmb{X'X} = \begin{pmatrix} \sum x_i^2 & \sum x_i z_i \\ \sum x_i z_i & \sum z_i^2 \end{pmatrix} \]

\[ (\pmb{X'X})^{-1} = \frac1{(\sum x_i^2)(\sum z_i^2)-(\sum x_i z_i)^2} \begin{pmatrix} \sum z_i^2 & -\sum x_i z_i \\ -\sum x_i z_i & \sum x_i^2 \end{pmatrix} \]

\[ \pmb{X'y} = \begin{pmatrix} x_1 & ... &x_n \\ z_1 & ...& z_n \\ \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots\\ y_n \end{pmatrix}= \begin{pmatrix} \sum x_iy_i \\ \sum z_iy_i \end{pmatrix} \]

\[ \begin{aligned} &\hat{\pmb{\beta}} =(\pmb{X'X})^{-1}\pmb{X'y}= \\ &\frac1{(\sum x_i^2)(\sum z_i^2)-(\sum x_i z_i)^2} \begin{pmatrix} \sum z_i^2 & -\sum x_i z_i \\ -\sum x_i z_i & \sum x_i^2 \end{pmatrix} \begin{pmatrix} \sum x_iy_i \\ \sum z_iy_i \end{pmatrix} = \\ &\frac1{(\sum x_i^2)(\sum z_i^2)-(\sum x_i z_i)^2} \begin{pmatrix} (\sum z_i^2)(\sum x_iy_i)-(\sum x_iz_i)(\sum z_iy_i) \\ (\sum x_i^2)(\sum z_iy_i)-(\sum x_iz_i)(\sum x_iy_i) \end{pmatrix}=\\ &\frac1{\overline{xz}^2-\overline{x^2}\times\overline{z^2}} \begin{pmatrix} \overline{xz}\times\overline{zy}-\overline{z^2}\times \overline{xy}\\ \overline{xy}\times\overline{xz}-\overline{x^2}\times\overline{zy} \end{pmatrix} \end{aligned} \]

where for the last equality the numerator and the denominator were divided by \(n^2\) and multiplied by \(-1\), and so this agrees with the result found in a.
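Let's check these formulas numerically against R's no-intercept fit (simulated data, just for illustration):

set.seed(112)
n=25
x=runif(n)
z=runif(n)
y=2*x-3*z+rnorm(n, 0, 0.1)
d=mean(x*z)^2-mean(x^2)*mean(z^2)
alphahat=(mean(z*y)*mean(x*z)-mean(x*y)*mean(z^2))/d
betahat=(mean(x*y)*mean(x*z)-mean(z*y)*mean(x^2))/d
round(c(alphahat, betahat), 4)
round(coef(lm(y~x+z-1)), 4)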

  c. Show that this model is a generalization of the simple regression model, and use the formulas found in a. to derive the least squares estimators of the simple regression model.

If we have \(z_i=1\) for all i, then the model becomes \(y_i=\beta+ \alpha x_i+\epsilon_i\), a simple regression model. Now

\[ \begin{aligned} &\hat{\pmb{\beta}} =\frac1{\overline{x1}^2-\overline{x^2}\times\overline{1^2}} \begin{pmatrix} \overline{x1}\times\overline{1y}-\overline{1^2}\times \overline{xy}\\ \overline{xy}\times\overline{x1}-\overline{x^2}\times\overline{1y} \end{pmatrix} = \\ &\frac1{\overline{x}^2-\overline{x^2}} \begin{pmatrix} \overline{x}\times\overline{y}-\overline{xy}\\ \overline{x}\times\overline{xy}-\overline{x^2}\times \overline{y} \end{pmatrix} \end{aligned} \] so, from the first component,

\[\hat{\beta_1}=\frac{\overline{xy}-\overline{y}\times\overline{x}}{\overline{x^2}-\overline{x}^2}\] which is the least squares regression estimate of the slope in terms of the means. (see (6.1.2))

and, from the second component,

\[ \begin{aligned} &\frac{\overline{x^2}\times \overline{y}-\overline{x}\times\overline{xy}}{\overline{x^2}-\overline{x}^2}=\\ &\frac{\overline{x^2}\times \overline{y}-\overline{x}\times\overline{xy}-\bar{y}\overline{x}^2+\bar{y}\overline{x}^2}{\overline{x^2}-\overline{x}^2}=\\ &\frac{\bar{y}\overline{x^2}-\bar{y}\overline{x}^2-\overline{xy}\bar{x}+\bar{y}\bar{x}\bar{x}}{\overline{x^2}-\overline{x}^2} = \\ &\frac{\bar{y}(\overline{x^2}-\overline{x}^2)-(\overline{xy}-\overline{y}\times\overline{x})\bar{x}}{\overline{x^2}-\overline{x}^2} = \\ &\bar{y} -\frac{\overline{xy}-\overline{y}\times\overline{x}}{\overline{x^2}-\overline{x}^2}\bar{x} = \\ &\bar{y} -\hat{\beta_1}\bar{x} = \hat{\beta_0} \\ \end{aligned} \]
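And a quick check that these mean-based formulas reproduce the intercept and slope from lm (again with simulated data):

set.seed(113)
x=runif(20)
y=1+2*x+rnorm(20, 0, 0.1)
beta1hat=(mean(x*y)-mean(x)*mean(y))/(mean(x^2)-mean(x)^2)
beta0hat=mean(y)-beta1hat*mean(x)
round(c(beta0hat, beta1hat), 4)
round(coef(lm(y~x)), 4)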