Up to now we have assumed that the design matrix \(\pmb{X}\) was fixed. Often, however, the values of the predictor variables are themselves random; in fact, that was the case in the wine and houseprice examples. It turns out that treating the random-X case as if the X’s were fixed is acceptable in many situations, and we will see that most of the results we have obtained so far still hold. Moreover, it often makes sense to analyze a regression problem conditional on the X’s, in which case the predictors are treated as fixed even though they originated in some random fashion.
The usual theoretical justification for treating a random predictor as fixed is that, in terms of the parameter vector \(\pmb{\beta}\), the predictor random vector is an ancillary statistic; that is, the distribution of \(\pmb{X}\) does not depend on \(\pmb{\beta}\).
If the x’s are to be treated as random, one clearly needs to specify the joint distribution of \((y, \pmb{x})\). As usual, the most common choice is a multivariate normal distribution, in which case we need to study
\[ cov\begin{pmatrix} y \\ x_1 \\ \vdots \\ x_k \end{pmatrix} = \pmb{\Sigma}\]
We will assume that \(\pmb{y}\) and \(\pmb{X}\) have a joint multivariate normal distribution with mean vector
\[\pmb{\mu} = \begin{pmatrix} \mu_y \\ \pmb{\mu}_x \end{pmatrix}\]
and covariance matrix
\[\pmb{\Sigma} = \begin{pmatrix}\sigma_{yy} & \pmb{\sigma}_{yx}' \\ \pmb{\sigma}_{yx} & \pmb{\Sigma}_{xx} \end{pmatrix}\]
By (5.2.13) we have
\[E[y|\pmb{x}] = \mu_y+\pmb{\sigma}_{yx}'\pmb{\Sigma}_{xx}^{-1}(\pmb{x}-\pmb{\mu}_x)=\beta_0+\pmb{\beta_1'x}\]
where
\[\beta_0=\mu_y-\pmb{\sigma}_{yx}'\pmb{\Sigma}_{xx}^{-1}\pmb{\mu}_x\]
\[\pmb{\beta}_1=\pmb{\Sigma}_{xx}^{-1}\pmb{\sigma}_{yx}\]
Also from (5.2.13) we have
\[var(y|\pmb{x}) = \sigma_{yy}-\pmb{\sigma}_{yx}'\pmb{\Sigma}_{xx}^{-1}\pmb{\sigma}_{yx}=\sigma^2\]
Note that under this model \(y\) is not only linear in \(\pmb{\beta}\) but also linear in the x’s, so this does not allow for a model that is (say) quadratic in x.
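As a quick numerical illustration, here is how \(\beta_0\), \(\pmb{\beta}_1\) and \(\sigma^2\) can be computed in R from a partitioned mean vector and covariance matrix. The values of mu and Sigma below are made up purely for this sketch:

mu=c(10, 2, 5)                        # (mu_y, mu_x') for (y, x1, x2)
Sigma=matrix(c(4.0, 1.2, 0.8,
               1.2, 1.0, 0.3,
               0.8, 0.3, 2.0), 3, 3)  # hypothetical covariance matrix
sig.yx=Sigma[-1, 1]                   # sigma_yx
Sig.xx=Sigma[-1, -1]                  # Sigma_xx
beta1=solve(Sig.xx)%*%sig.yx          # beta_1 = Sigma_xx^{-1} sigma_yx
beta0=mu[1]-t(sig.yx)%*%solve(Sig.xx)%*%mu[-1]        # mu_y - sigma_yx' Sigma_xx^{-1} mu_x
sigma2=Sigma[1, 1]-t(sig.yx)%*%solve(Sig.xx)%*%sig.yx # sigma_yy - sigma_yx' Sigma_xx^{-1} sigma_yx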
Under the multivariate normal model the maximum likelihood estimators are given by
\[\pmb{\hat{\mu}} = \begin{pmatrix} \bar{y} \\ \pmb{\bar{x}} \end{pmatrix}\]
and
\[\pmb{\hat{\Sigma}} = \frac{n-1}{n}\begin{pmatrix}s_{yy} & \pmb{s}_{yx}' \\ \pmb{s}_{yx} & \pmb{S}_{xx} \end{pmatrix}\]
Proof: omitted.
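In R the mle's are easily computed with colMeans and cov; note the factor \(\frac{n-1}{n}\) that turns the sample covariance matrix into the mle. A short sketch, assuming the houseprice data used below:

A=as.matrix(houseprice)
n=nrow(A)
muhat=colMeans(A)           # (ybar, xbar')
Sigmahat=(n-1)/n*cov(A)     # mle of Sigma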
(Invariance of MLEs)
Let g be some function. Under mild conditions on g, if \(\pmb{\hat{\theta}}\) is the mle of \(\pmb{\theta}\), then \(g(\pmb{\hat{\theta}})\) is the mle of \(g(\pmb{\theta})\).
Proof: see any book on the theory of statistics.
The mle’s of \(\beta_0\), \(\pmb{\beta}_1\) and \(\sigma^2\) are given by
\[\hat{\beta}_0=\bar{y}-\pmb{s}_{yx}'\pmb{S}_{xx}^{-1}\pmb{\bar{x}}\]
\[\pmb{\hat{\beta}}_1=\pmb{S}_{xx}^{-1}\pmb{s}_{yx}\] and
\[\hat{\sigma}^2 = \frac{n-1}{n}\left(s_{yy}-\pmb{s}_{yx}'\pmb{S}_{xx}^{-1} \pmb{s}_{yx}\right)\] Proof: this follows from the invariance of mle's by applying the corresponding functions to the mle's of the parameters above.
Notice that these estimators are the same as the least-squares estimators in the fixed-x case. However, their distributions are no longer multivariate normal but multivariate t.
The F tests discussed in section 6.6 work equally well in the random-x case, since they are based on the conditional distributions.
The sample correlation matrix can be written as
\[ \pmb{R} = \begin{pmatrix} 1 & r_{y1} & \dots & r_{yk}\\ r_{1y} & 1 & \dots & r_{1k}\\ \vdots & \vdots & \ddots & \vdots\\ r_{ky} & r_{k1} & \dots & 1 \end{pmatrix}= \begin{pmatrix} 1 & \pmb{r}'_{yx} \\ \pmb{r}_{yx} & \pmb{R}_{xx} \end{pmatrix} \] here (for example)
\[r_{y1}=\frac{s_{y1}}{\sqrt{s^2_ys^2_1}}=\frac{\sum(y_i-\bar{y})(x_{i1}-\bar{x}_1)}{\sqrt{\sum(y_i-\bar{y})^2\sum(x_{i1}-\bar{x}_1)^2}}\]
and
\[r_{12}=\frac{s_{12}}{\sqrt{s^2_1s^2_2}}=\frac{\sum(x_{i1}-\bar{x}_1)(x_{i2}-\bar{x}_2)}{\sqrt{\sum(x_{i1}-\bar{x}_1)^2\sum(x_{i2}-\bar{x}_2)^2}}\]
We have \(\pmb{S=DRD}\), where \(\pmb{D}=[\text{diag}(\pmb{S})]^{1/2}\), that is
\[ \pmb{D} = \begin{pmatrix} s_y & 0 & \dots & 0\\ 0 & \sqrt{s_{11}} & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \dots & \sqrt{s_{kk}} \end{pmatrix}= \begin{pmatrix} s_y & \pmb{0}' \\ \pmb{0} & \pmb{D}_{xx} \end{pmatrix} \] and we can write
\[\pmb{\hat{\beta}}_1=s_y\pmb{D}_{xx}^{-1}\pmb{R}_{xx}^{-1}\pmb{r}_{yx}\]
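We can check this identity numerically. The following sketch uses the houseprice data; cov2cor converts a covariance matrix into the corresponding correlation matrix:

A=as.matrix(houseprice)
S=cov(A)
R=cov2cor(S)                    # sample correlation matrix
sy=sqrt(S[1, 1])                # s_y
Dxx=diag(sqrt(diag(S)[-1]))     # D_xx
sy*solve(Dxx)%*%solve(R[-1, -1])%*%R[-1, 1]   # s_y D_xx^{-1} R_xx^{-1} r_yx
solve(S[-1, -1])%*%S[-1, 1]                   # S_xx^{-1} s_yx, should agree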
Let \(\pmb{x}\) be a sample. Then the z scores are defined as
\[\pmb{z}=\frac{\pmb{x}-\bar{x}}{{s_x}}\]
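In R this is simply (x-mean(x))/sd(x); the built-in scale function gives the same result. A minimal sketch with a made-up sample:

x=c(3, 7, 1, 9, 5)
z=(x-mean(x))/sd(x)
c(scale(x))                 # same as z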
Recall that the model in centered form is
\[\hat{y}_i=\bar{y}+\sum_{j=1}^k \hat{\beta}_j(x_{ij}-\bar{x}_j)\]
and so
\[\frac{\hat{y}_i-\bar{y}}{s_y}=\sum_{j=1}^k \frac{s_j}{s_y}\hat{\beta}_j\left(\frac{x_{ij}-\bar{x}_j}{s_j}\right)\]
The coefficients \(\hat{\beta}_j^*=\frac{s_j}{s_y}\hat{\beta}_j\) are called the beta weights or beta coefficients. They can also be found as
\[\pmb{\hat{\beta}}_1^*=\frac{1}{s_y}\pmb{D}_{xx}\pmb{\hat{\beta}}_1=\pmb{R}_{xx}^{-1}\pmb{r}_{yx}\]
For the houseprice data we find
A=as.matrix(houseprice)
y=A[, 1, drop=FALSE]              # response (Price)
ybar=mean(y)
X=cbind(1, A[, -1])               # design matrix with a column of 1's
xbar=apply(X[, -1], 2, mean)      # means of the predictors
sxx=cov(A[, -1])                  # S_xx
syx=cov(A)[-1, 1]                 # s_yx
betahat=solve(t(X)%*%X)%*%t(X)%*%y   # least-squares estimator
round(c(betahat), 3)
## [1] -67.620 0.086 -26.493 -9.286 37.381
round(c(solve(sxx)%*%cbind(syx)), 3)   # S_xx^{-1} s_yx, the mle of beta_1
## [1] 0.086 -26.493 -9.286 37.381
round(ybar-rbind(syx)%*%solve(sxx)%*%xbar, 3)   # mle of beta_0
## [,1]
## syx -67.62
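Continuing with the objects above, the beta weights for the houseprice data can be found either as \(\pmb{R}_{xx}^{-1}\pmb{r}_{yx}\) or by rescaling the slopes in betahat; the two computations should agree (a sketch):

Rmat=cov2cor(cov(A))                        # sample correlation matrix
solve(Rmat[-1, -1])%*%Rmat[-1, 1]           # R_xx^{-1} r_yx
sds=apply(A, 2, sd)                         # (s_y, s_1, ..., s_k)
sds[-1]/sds[1]*betahat[-1]                  # (s_j/s_y) * betahat_j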
The sample coefficient of determination is defined by
\[R^2=\frac{\pmb{s}'_{yx}\pmb{S}^{-1}_{xx}\pmb{s}_{yx}}{s_{yy}}\]
For the houseprice data we find
A=as.matrix(houseprice)
y=A[, 1, drop=FALSE]        # response (Price)
sxx=cov(A[, -1])            # S_xx
tmp=cov(A)[, 1]
syy=tmp[1]                  # s_yy
syx=tmp[-1]                 # s_yx
round(rbind(syx)%*%solve(sxx)%*%cbind(syx)/syy, 3)   # R^2
## syx
## syx 0.886
summary(lm(Price~., data=houseprice))
##
## Call:
## lm(formula = Price ~ ., data = houseprice)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.018 -5.943 1.860 5.947 30.955
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -67.61984 17.70818 -3.819 0.000882
## Sqfeet 0.08571 0.01076 7.966 4.62e-08
## Floors -26.49306 9.48952 -2.792 0.010363
## Bedrooms -9.28622 6.82985 -1.360 0.187121
## Baths 37.38067 12.26436 3.048 0.005709
##
## Residual standard error: 13.71 on 23 degrees of freedom
## Multiple R-squared: 0.8862, Adjusted R-squared: 0.8665
## F-statistic: 44.8 on 4 and 23 DF, p-value: 1.558e-10