Non-Full Rank Models

Introduction

In many experiments one of a (relatively few) treatments is applied to each subject, and the interest is in the effects of the treatments. For example, in an experiment in agriculture one might treat each plant with one of four different fertilizers. In a medical trial a new vaccince might be compared to an old vaccince as well as to a placebo and maybe even no treatment.

In an Analysis of Variance (ANOVA) the focus is on the mean responses per treatment.

One-Way Model

Example (7.1.1)

Chasnoff and others obtained several measures and responses for newborn babies whose mothers were classified by degree of cocaine use.

The study was conducted in the Perinatal Center for Chemical Dependence at Northwestern University Medical School. The measurement given here is the length of the newborn.

Source: Cocaine abuse during pregnancy: correlation between prenatal care and perinatal outcome Authors: SN MacGregor, LG Keith, JA Bachicha, and IJ Chasnoff Obstetrics and Gynecology 1989;74:882-885

Here is a look at the data

kable.nice(mothers[1:10, ], do.row.names = FALSE)
Status Length
Drug Free 44.3
Drug Free 45.3
Drug Free 46.9
Drug Free 47.0
Drug Free 47.2
Drug Free 47.8
Drug Free 47.8
Drug Free 48.5
Drug Free 48.8
Drug Free 49.6
nrow(mothers)
## [1] 94
table(mothers$Status)
## 
##       Drug Free First Trimester      Throughout 
##              39              19              36

so there were a total of 94 babies. 39 had mothers with no drug use, 19 had mothers who stopped the cocaine use early during pregnancy and 36 had mothers who continued to take cocaine until birth.

Basic Question: does the cocaine use of the mother influence the health of the baby?

A good way to start is with a multiple boxplot:

ggplot(data=mothers , aes(Status, Length)) +
  geom_boxplot()

which does look like that more drug use leads to smaller (and likely less healthy) babies.

Here is another way to summarize the data:

out <- matrix(0, 3, 3)
colnames(out) <- c("Size", "Mean", "SD")
rownames(out) <- unique(mothers$Status)
out[, 1] <- tapply(mothers$Length, 
                   mothers$Status, length)
out[, 2] <- round(tapply(mothers$Length, 
                         mothers$Status, mean), 2)
out[, 3] <- round(tapply(mothers$Length, 
                         mothers$Status, sd), 2)
kable.nice(out)
Size Mean SD
Drug Free 39 51.1 2.9
First Trimester 19 49.3 2.5
Throughout 36 48.0 3.6

and again it seems there is an effect.

How can we write this as a model? Typically this is done with

\[y_{ij}=\mu+\alpha_i+\epsilon_{ij}\]

with i=1,2,3 and j1=39, j2=19, j3=36

Here \(\mu\) is the overall mean (disregarding the treatments), \(\alpha_i\) is the mean effect due to treatment i and \(\epsilon_{ij}\) is the random effect.

We can again use matrix notation to write down this model: Let

\[\pmb{y}= \begin{pmatrix} y_{1:1} \\ \vdots \\ y_{1:39}\\y_{2:1}\\\vdots\\ y_{3:36} \end{pmatrix}\]

\[ \pmb{X} = \begin{pmatrix} 1 & 1& 0 & 0\\ \vdots & \vdots & \vdots & \vdots \\ 1 & 1 & 0 & 0\\ 1 & 0 & 1 & 0\\ \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & 1 & 0\\ 1 & 0 & 0 & 1\\ \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & 0 & 1\\ \end{pmatrix} \]

\[\pmb{\beta} = \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3\end{pmatrix}\]

\[\pmb{\epsilon}= \begin{pmatrix} \epsilon_{1:1} \\ \vdots \\ \epsilon_{1:39}\\\epsilon_{2:1}\\\vdots\\ \epsilon_{3:36} \end{pmatrix}\]

then the model is

\[\pmb{y}=\pmb{X\beta}+\pmb{\epsilon}\]

and we see a model of the same form as in the regression problems before. The main difference is that now the design matrix \(\pmb{X}\) is always of the form as above, that is with 0’s and 1’s.

Here \(\pmb{X}\) is \(96\times 4\) matrix of rank 3 because the first column is equal to sum of the others, which are clearly linearly independent. Because \(\pmb{X}\) is not of full rank \(\pmb{X'X}\) is singular, \((\pmb{X'X})^{-1}\) does not exist and therefore \((\pmb{X'X})^{-1}\pmb{X'y}\) can not be used to estimate \(\pmb{\beta}\).

One reason for this is that the model as written does not determine the parameters uniquely. For example,

\[y_{11} = 10+2+\epsilon_{11} = 8+4+\epsilon_{11}\]

We say the model is overparametrized. Note that this is not an issue of the sample size, it will not go away if n is increased.

There are a number of ways to overcome this problem:

  1. reparametrize the problem with fewer, now unique, parameters
  2. place constraints on the parameters
  3. work with unique linear combinations of the parameters

Here are examples of these technics:

  1. use the model

\[y_{ij}=\mu_i+\epsilon_{ij}\]

  1. Add the condition \(\alpha_1+\alpha_2+\alpha_3=0\) to the model.

  2. rewrite the model in terms of \(\alpha_1-\alpha_2\) and \(\alpha_2-\alpha_3\), that is the changes from one treatment to the next. This makes most sense if (as is the case for our data) there is an ordering of the treatments.

Two-Way Model

Example (7.1.2)

Reference: Loven, Faith. (1981). A Study of the Interlist Equivalency of the CID W-22 Word List Presented in Quiet and in Noise. Unpublished MS Thesis, University of Iowa.

Description: Percent of a Standard 50-word list heard correctly in the presence of background noise. 24 subjects with normal hearing listened to standard audiology tapes of English words at low volume with a noisy background. They repeated the words and were scored correct or incorrect in their perception of the words. The order of list presentation was randomized.

The word lists are standard audiology tools for assessing hearing. They are calibrated to be equally difficult to perceive. However, the original calibration was performed with normal-hearing subjects and no noise background. The experimenter wished to determine whether the lists were still equally difficult to understand in the presence of a noisy background.

kable.nice(hearingaid[1:10, ], do.row.names = FALSE)
Subject List Score
1 1 28
2 1 24
3 1 32
4 1 30
5 1 34
6 1 30
7 1 36
8 1 32
9 1 48
10 1 32

so here the response variable is Score and there are two predictor variables, Subject and List. The interest is specifically in List.

Both Subject and List contain numbers, but these are really labels, so in R we need to turn them into factors:

tmp <- tapply(hearingaid$Score, hearingaid$List, mean)
hearingaid$List <- factor(hearingaid$List,              
                          levels=order(tmp), 
                          ordered = TRUE)
tmp <- tapply(hearingaid$Score, hearingaid$Subject, mean)
hearingaid$Subject <- factor(hearingaid$Subject, 
                             levels=order(tmp), 
                             ordered = TRUE)
ggplot(data=hearingaid, aes(List, Score)) +
         geom_boxplot()

Also the summary table for List:

sum.tbl <-
  data.frame(
    List=c(3, 4, 2, 1),
    n=as.integer(tapply(hearingaid$Score,
               hearingaid$List,length)),
    Mean=round(tapply(hearingaid$Score, 
                       hearingaid$List, mean), 1),
    Sd=round(tapply(hearingaid$Score, 
                    hearingaid$List, sd), 2)
)
rownames(sum.tbl) <- NULL
kable.nice(sum.tbl)
List n Mean Sd
1 3 24 25.2 8.32
2 4 24 25.6 7.78
3 2 24 29.7 8.06
4 1 24 32.8 7.41

Notice the ordering by mean response.

Analogously to the previous discussion we can write the model for this data as

\[y_{ij}=\mu+\alpha_i+\beta_j+\epsilon_{ij}\]

for i=1,..,4 and j=1,..,24

In matrix form we have

\[\pmb{y}= \begin{pmatrix} y_{1:1} \\ \vdots \\ y_{4:24} \end{pmatrix}\]

\[ \pmb{X} = \begin{pmatrix} 1 & 1& 0 & 0 & 0 & 1 & 0 & ... & 0\\ 1 & 1& 0 & 0 & 0 & 0 & 1 & ... & 0\\ \vdots & \vdots & \vdots & \vdots \\ 1 & 1& 0 & 0 & 0 & 0 & 0 & ... & 1\\ 1 & 0& 1 & 0 & 0 & 1 & 0 & ... & 0\\ \vdots & \vdots & \vdots & \vdots \\ 1 & 0& 0 & 0 & 1 & 0 & 0 & ... & 1\\ \end{pmatrix} \]

\[\pmb{\beta} = \begin{pmatrix} \mu & \alpha_1 & ... & \alpha_4 & \beta_1 & ... &\beta_{24}\end{pmatrix}'\]

\[\pmb{\epsilon}= \begin{pmatrix} \epsilon_{11} \\ \vdots \\ \epsilon_{4:24} \end{pmatrix}\]

then the model is

\[\pmb{y}=\pmb{X\beta}+\pmb{\epsilon}\]

How can we create such a matrix \(\pmb{X}\) in R? Note for any block with 1’s in columns 2 to 5 the following column form a diagonal matrix. Let’s illustrate this with i=3,j=4 instead of 24, so it is easier to see:

make.X=function(I, J) {
  X=NULL
  for(i in 1:I) {
    tmp=matrix(0, J, I)
    tmp[, i]=1
    X=rbind(X, cbind(tmp, diag(J)))  
  }
  X=cbind(1, X)
  X
}
make.X(3,4)
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
##  [1,]    1    1    0    0    1    0    0    0
##  [2,]    1    1    0    0    0    1    0    0
##  [3,]    1    1    0    0    0    0    1    0
##  [4,]    1    1    0    0    0    0    0    1
##  [5,]    1    0    1    0    1    0    0    0
##  [6,]    1    0    1    0    0    1    0    0
##  [7,]    1    0    1    0    0    0    1    0
##  [8,]    1    0    1    0    0    0    0    1
##  [9,]    1    0    0    1    1    0    0    0
## [10,]    1    0    0    1    0    1    0    0
## [11,]    1    0    0    1    0    0    1    0
## [12,]    1    0    0    1    0    0    0    1

The rank of our matrix \(\pmb{X}\) is

qr(make.X(4,24))$rank
## [1] 27

but there are 29 (=1+4+28) parameters, so again not all can be estimated.