One-way ANOVA

The One-Way Model

We have a model of the form

\[y_{ij}=\mu+\alpha_{i}+\epsilon_{ij}\]

\(i=1,..,k\) and \(j=1,..,n_i\)

Also let \(n=\sum_i n_i\) be the sample size.

We have the assumptions

  1. \(E[\epsilon_{ij}]=0\) for all i,j
  2. \(var(\epsilon_{ij})=\sigma^2\) for all i,j
  3. \(cov(\epsilon_{ij},\epsilon_{rs})=0\) for all \((i,j) \ne (r,s)\)

Often we also have

  1. \(\epsilon_{ij}\sim N(0, \sigma^2)\)
    and
  2. \(\sum_i \alpha_i =0\)

The one-way layout is sometimes also called a completely randomized design.
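To make the model concrete, here is a small simulation from it (the group sizes and parameter values are arbitrary choices for illustration):

```r
# simulate data from y_ij = mu + alpha_i + epsilon_ij
set.seed(111)
k <- 3
ni <- c(4, 5, 6)                  # group sizes
n <- sum(ni)
mu <- 10
alpha <- c(-1, 0, 1)              # side condition sum(alpha) = 0 holds
group <- rep(1:k, times = ni)
y <- mu + alpha[group] + rnorm(n, mean = 0, sd = 2)
tapply(y, group, mean)            # group means estimate mu + alpha_i
```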

Estimable Functions

For illustration let’s study the case \(k=3\), \(n_i=2\). Then

\[ \pmb{y} = \begin{pmatrix} y_{11}\\ y_{12}\\ y_{21}\\ y_{22}\\ y_{31}\\ y_{32}\\ \end{pmatrix}= \begin{pmatrix} \mu+\alpha_{1}\\ \mu+\alpha_{1}\\ \mu+\alpha_{2}\\ \mu+\alpha_{2}\\ \mu+\alpha_{3}\\ \mu+\alpha_{3}\\ \end{pmatrix}+ \begin{pmatrix} \epsilon_{11}\\ \epsilon_{12}\\ \epsilon_{21}\\ \epsilon_{22}\\ \epsilon_{31}\\ \epsilon_{32}\\ \end{pmatrix}= \]

\[ \begin{pmatrix} 1 & 1& 0 & 0 \\ 1 & 1& 0 & 0 \\ 1 & 0& 1 & 0 \\ 1 & 0& 1 & 0 \\ 1 & 0& 0 & 1 \\ 1 & 0& 0 & 1 \\ \end{pmatrix} \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} + \begin{pmatrix} \epsilon_{11}\\ \epsilon_{12}\\ \epsilon_{21}\\ \epsilon_{22}\\ \epsilon_{31}\\ \epsilon_{32}\\ \end{pmatrix} \]

and \(\pmb{X}\) is \(6\times 4\) of rank 3. Here the parameter vector is \(\pmb{\beta}=\begin{pmatrix} \mu & \alpha_1 & \alpha_2 & \alpha_3 \end{pmatrix}'\). In general, \(\pmb{X}\) for a one-way balanced design is \(n\times (k+1)\) of rank k.
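As a quick check we can build this \(\pmb{X}\) in R and verify that its rank is indeed 3:

```r
# design matrix for k = 3 groups with n_i = 2 observations each:
# a column of 1's followed by the group indicator columns
X <- cbind(1, kronecker(diag(3), matrix(1, 2, 1)))
dim(X)         # 6 x 4
qr(X)$rank     # 3, one less than the number of columns
```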

We have seen before that in general contrasts are estimable, that is \(\sum c_i\alpha_i\) is estimable if \(\sum c_i=0\).

If we add some side conditions and denote the constrained parameters by \(\mu^*, \alpha_1^*,..,\alpha_k^*\), then these are estimable.

Theorem (7.4.1)

Under the side condition \(\sum_i \alpha_i^*=0\) we have \(\mu^*=\bar{\mu}_{.}\) and \(\alpha_i^*=\mu_{i}-\bar{\mu}_{.}\), where \(\mu_i=\mu+\alpha_i\) and \(\bar{\mu}_{.}=\frac1{k}\sum_i \mu_i\).

proof

\[ \begin{aligned} &y_{ij}=\mu+\alpha_{i}+\epsilon_{ij} = \mu_i +\epsilon_{ij}\\ &\bar{\mu}_{.} = \frac1k\sum_i \mu_i = \frac1k\sum_i (\mu^*+\alpha_i^*) = \mu^*+\frac1k\sum_i \alpha_i^* = \mu^*\\ &\alpha_i^* = \mu_i-\mu^* = \mu_{i}-\bar{\mu}_{.} \end{aligned} \]

Parameter Estimation

The general one-way model has the design matrix

\[ \pmb{X} = \begin{pmatrix} \pmb{j}_{n_1} & \pmb{j}_{n_1} & \pmb{0} & ... & \pmb{0} \\ \pmb{j}_{n_2} & \pmb{0} & \pmb{j}_{n_2} & ... & \pmb{0} \\ \vdots & \vdots & \vdots & & \vdots \\ \pmb{j}_{n_k} & \pmb{0} & \pmb{0} & ... & \pmb{j}_{n_k} \\ \end{pmatrix} \] where \(\pmb{j}_{n_i}\) is the vector of 1’s of length \(n_i\).

The normal equations are

\[ \pmb{X'X\hat{\beta}=X'y} \\ \begin{pmatrix} n & n_1 & n_2 & n_3 & ... & n_k\\ n_1 & n_1 & 0 & 0 & ... & 0\\ n_2 & 0 & n_2 & 0 & ... & 0\\ \vdots&\vdots&\vdots&\vdots& &\vdots \\ n_k & 0 & 0 &0 &... & n_k \end{pmatrix} \begin{pmatrix} \hat{\mu} \\ \hat{\alpha}_1 \\ \vdots\\ \hat{\alpha}_k \end{pmatrix} = \begin{pmatrix} y_{..}\\ y_{1.}\\ \vdots\\ y_{k.}\\ \end{pmatrix} \]

Adding the usual side condition we have the system of equations

\[ \begin{aligned} &n\hat{\mu} +n_1\hat{\alpha}_1+..+n_k\hat{\alpha}_k = y_{..}\\ &n_i\hat{\mu}+n_i\hat{\alpha}_i =y_{i.} \text{; }i=1,..,k \\ &\sum n_i\hat{\alpha}_i=0 \end{aligned} \]

and so

\[ \begin{aligned} &y_{..}=\sum_{i=1}^k y_{i.}=\sum_{i=1}^k (n_i\hat{\mu}+n_i\hat{\alpha}_i) = n\hat{\mu}+\sum_{i=1}^kn_i\hat{\alpha}_i = n\hat{\mu}\\ &\hat{\mu} = y_{..}/n = \bar{y}_{..}\\ &\hat{\alpha}_i =y_{i.}/n_i-\hat{\mu} = \bar{y}_{i.}-\bar{y}_{..} \end{aligned} \]

Example (7.4.2)

In the mothers and cocaine use data we find:

y=mothers$Length
x=mothers$Status
ni=table(x)
n=sum(ni)
y.. = sum(y)
yi. = tapply(y, x, sum)
round(c(y../n, yi./ni-y../n), 3)
##                       Drug Free First Trimester      Throughout 
##          49.549           1.551          -0.249          -1.549

Example (7.4.3)

The flammability of fabric used in children’s sleepwear is tested by placing a flame under a piece of fabric until it ignites. The flame is then removed, and the fabric burns until it stops by itself. The length of the charred portion of the fabric is then measured. In the data set pieces of the same cloth were sent to five different laboratories, each of which carried out this experiment eleven times.

Research Question: Are there differences between the labs?

kable.nice(head(flammability), do.row.names = FALSE)
Labs Length
Lab 1 2.9
Lab 1 3.1
Lab 1 3.1
Lab 1 3.7
Lab 1 3.1
Lab 1 4.2
table(flammability$Labs)
## 
## Lab 1 Lab 2 Lab 3 Lab 4 Lab 5 
##    11    11    11    11    11

so here we have a balanced one-way design with \(k=5\) and \(n_i=11\), \(i=1,..,k\). Now

y=flammability$Length
x=flammability$Labs
ni=rep(11, 5)
n=sum(ni)
k=5
y..=sum(y)
yi.=tapply(y, x, sum)
round(c(y../n, yi./ni-y../n), 3)
##         Lab 1  Lab 2  Lab 3  Lab 4  Lab 5 
##  3.376 -0.040  0.224 -0.076 -0.376  0.269

As previously discussed another approach to estimation is to use the generalized inverse. We can easily verify that a generalized inverse of \(\pmb{X'X}\) is given by

\[ (\pmb{X'X})^{-} = \begin{pmatrix} 0 & 0 & 0 & 0 & ... & 0\\ 0 & 1/n_1 & 0 & 0 & ... & 0\\ 0 & 0 & 1/n_2 & 0 & ... & 0\\ \vdots&\vdots&\vdots&\vdots&...&\vdots \\ 0 & 0 & 0 &0 &... & 1/n_k \end{pmatrix} \] and so a solution to the normal equations is given by

\[ \pmb{\hat{\beta}} = (\pmb{X'X})^{-}\pmb{X'y} =\\ \begin{pmatrix} 0 & 0 & 0 & 0 & ... & 0\\ 0 & 1/n_1 & 0 & 0 & ... & 0\\ 0 & 0 & 1/n_2 & 0 & ... & 0\\ \vdots&\vdots&\vdots&\vdots&...&\vdots \\ 0 & 0 & 0 &0 &... & 1/n_k \end{pmatrix} \begin{pmatrix} \pmb{j}_{n_1} & \pmb{j}_{n_1} & \pmb{0} & ... & \pmb{0} \\ \pmb{j}_{n_2} & \pmb{0} & \pmb{j}_{n_2} & ... & \pmb{0} \\ \vdots & \vdots & \vdots & & \vdots \\ \pmb{j}_{n_k} & \pmb{0} & \pmb{0} & ... & \pmb{j}_{n_k} \\ \end{pmatrix}' \begin{pmatrix} y_{11} \\\vdots\\y_{kn_k} \end{pmatrix}= \begin{pmatrix} 0 \\\bar{y}_{1.} \\\vdots\\ \bar{y}_{k.} \end{pmatrix} \]
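A numerical sanity check (with a small simulated balanced data set): the matrix above satisfies the defining property \(\pmb{X'X(X'X)^{-}X'X=X'X}\) of a generalized inverse, and the resulting solution is 0 followed by the group means.

```r
set.seed(222)
k <- 3; m <- 2                                       # 3 groups of size 2
X <- cbind(1, kronecker(diag(k), matrix(1, m, 1)))   # one-way design matrix
y <- rnorm(k*m, mean = 5)
A <- t(X) %*% X
G <- diag(c(0, rep(1/m, k)))                  # the claimed generalized inverse
all.equal(A %*% G %*% A, A)                   # TRUE: G is a g-inverse of X'X
betahat <- G %*% t(X) %*% y
cbind(betahat, c(0, tapply(y, rep(1:k, each = m), mean)))  # columns agree
```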


An estimator of \(\sigma^2\)

Definition (7.4.4)

An unbiased estimator for \(\sigma^2\) is given by \(s^2=\text{SSE}/(n-k)\), where

\[\text{SSE}=\pmb{y'y-\hat{\beta}'X'y}=\pmb{y'[I-X(X'X)^{-}X']y}\] Alternative formulas are

\[s^2=\frac{\sum_{ij}y_{ij}^2-\sum_i y_{i.}^2/n_i}{n-k} = \frac{\sum_{ij} (y_{ij}-\bar{y}_{i.})^2}{n-k}\]
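A quick numerical check (on simulated balanced data) that the matrix form of SSE and the two summation formulas agree:

```r
set.seed(333)
k <- 4; m <- 6; n <- k*m
group <- factor(rep(1:k, each = m))
y <- rnorm(n, mean = as.numeric(group))
X <- cbind(1, model.matrix(~ group - 1))         # one-way design matrix
G <- diag(c(0, rep(1/m, k)))                     # g-inverse of X'X, balanced case
betahat <- G %*% t(X) %*% y
sse0 <- sum(y^2) - sum(betahat * (t(X) %*% y))   # y'y - betahat'X'y
sse1 <- sum(y^2) - sum(tapply(y, group, sum)^2)/m
sse2 <- sum((y - tapply(y, group, mean)[group])^2)
c(sse0, sse1, sse2)                              # all three agree
```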

Example (7.4.5)

For the children’s sleepwear data we find

n=55;k=5
ybar = tapply(flammability$Length, flammability$Labs, mean)
sse=sum( (flammability$Length-rep(ybar,each=11))^2 )
sse/(n-k)
## [1] 0.1646545

Hypothesis Testing

Say we want to test \(H_0:\mu_1=..=\mu_k\). As always for testing we now also assume

\[\pmb{y}\sim N_{n}(\pmb{X\beta}, \sigma^2\pmb{I})\]

Adding the usual side condition the null hypothesis becomes

\[H_0: \alpha_1=..=\alpha_k=0\]

Under the null hypothesis the reduced model is \(\pmb{y}=\mu\pmb{j}+\pmb{\epsilon}\).

For the full model we have

\[\text{SS}(\mu,\alpha) = \pmb{\hat{\beta}'X'y} = \sum_i y^2_{i.}/n_i\]

and for the reduced model we find

\[\hat{\mu} = (\pmb{j_n'j_n})^{-1}\pmb{j_n'y}=\bar{y}_{..}\]

\[\text{SS}(\mu) = \hat{\mu}\pmb{j_n'y} = y^2_{..}/n \]

By the ANOVA table just before theorem (9.3.3) the sum of squares for the \(\alpha\)’s adjusted for \(\mu\) is given by

\[\text{SS}(\alpha|\mu) = \text{SS}(\alpha,\mu)-\text{SS}(\mu) = \sum_i y^2_{i.}/n_i-y^2_{..}/n\]

All this is summarized in the

ANOVA table for one-way design

\[ \begin{array}{cccc} \text{Source} & \text{df} & \text{SS} & \text{F} \\ \hline \\ \text{Treatments} & k-1 & \text{SS}(\alpha|\mu)=\sum_i y^2_{i.}/n_i-y^2_{..}/n & \frac{\text{SS}(\alpha|\mu)/(k-1)}{\text{SSE}/(n-k)}\\ \text{Error} & n-k & \text{SSE}=\sum_{ij} (y_{ij}-\bar{y}_{i.})^2 & \\ \text{Total} & n-1 & \text{SST}=\sum_{ij}y^2_{ij} - y^2_{..}/n &\\ \hline \\ \end{array} \]

Example (7.4.6)

In the mothers and cocaine use data we find:

y=mothers$Length
x=mothers$Status
k=3
ni=table(x)
n=sum(ni)
y..=sum(y)
yi.=tapply(y, x, sum)
sse=sum(y^2)-sum(yi.^2/ni)
ssalpha=sum(yi.^2/ni)-y..^2/n
FTS=(ssalpha/(k-1))/(sse/(n-k))
round(c(ssalpha/c(1, k-1), sse/c(1, n-k), FTS, 1-pf(FTS, k-1, n-k)), 3)
## [1] 181.375  90.687 885.580   9.732   9.319   0.000

or of course

summary(aov(y~x))
##             Df Sum Sq Mean Sq F value   Pr(>F)
## x            2  181.4   90.69   9.319 0.000208
## Residuals   91  885.6    9.73

and so we reject the null hypothesis: the mean lengths are not the same for all three groups.

Example (7.4.7)

For the children’s sleepwear data:

y=flammability$Length
x=flammability$Labs
ni=rep(11, 5)
n=sum(ni)
k=5
y..=sum(y)
yi.=tapply(y, x, sum)
sse=sum(y^2)-sum(yi.^2/ni)
ssalpha=sum(yi.^2/ni)-y..^2/n
FTS=(ssalpha/(k-1))/(sse/(n-k))
round(c(ssalpha/c(1, k-1), sse/c(1, n-k), FTS, 1-pf(FTS, k-1, n-k)), 3)
## [1] 2.987 0.747 8.233 0.165 4.535 0.003

or of course

summary(aov(y~x))
##             Df Sum Sq Mean Sq F value  Pr(>F)
## x            4  2.987  0.7466   4.535 0.00334
## Residuals   50  8.233  0.1647

and so we reject the null hypothesis: the labs do not all have the same mean.

Contrasts

In the case of a one-way model a contrast can be written in terms of the \(\alpha_i\)’s or the \(\mu_i\)’s:

\[\sum_i c_i\mu_i=\sum_i c_i(\mu+\alpha_i)=\mu\sum_i c_i+\sum_i c_i\alpha_i=\sum_i c_i\alpha_i\]

Tests of contrasts are essentially comparisons of means:

\(H_0: 2\mu_1-\mu_2-\mu_3=0\leftrightarrow H_0: \mu_1=\frac{\mu_2+\mu_3}2\)

To test these hypotheses we can use theorem (9.3.5). Here m=1 and the test statistic becomes

\[F =\frac{\pmb{(c'\hat{\beta})'[c'(X'X)^{-}c]^{-1}(c'\hat{\beta})}}{\text{SSE}/(n-k)} =\frac{(\sum_i c_i \bar{y}_{i.})^2}{s^2\sum_i c_i^2/n_i}\] where \(s^2=\text{SSE}/(n-k)\)

Example (7.4.8)

Let’s have a look at the children’s sleepwear data:

y=flammability$Length
x=factor(flammability$Labs)
ni=rep(11, 5)
n=sum(ni)
k=5
ybar=tapply(y, x, mean)
ybar
##    Lab 1    Lab 2    Lab 3    Lab 4    Lab 5 
## 3.336364 3.600000 3.300000 3.000000 3.645455

so it seems that the mean of Lab 1 is close to the mean of Lab 3, while the means of Lab 4 and Lab 5 look different.

Let’s test whether the mean of Lab 1 is equal to the mean of Lab 3. So we have \(H_0:\mu_1=\mu_3\) or \(H_0:\mu_1-\mu_3=0\) or

\[H_0: \begin{pmatrix} 1 & 0 & -1 & 0 & 0 \end{pmatrix}\begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \\ \mu_5 \end{pmatrix}=0\]

sse=sum( (y-rep(ybar,each=11))^2 )
cc=c(1, 0, -1, 0, 0)
num=sum(cc*ybar)^2
s2=sse/(n-k)
denom=s2*sum(cc^2/ni)
FTS=num/denom
round(c(FTS, 1-pf(FTS, 1, n-k)), 4)
## [1] 0.0442 0.8344

and we fail to reject the null hypothesis; there is no evidence that the means of Lab 1 and Lab 3 differ.

We can also use R to test for contrasts:

library(multcomp)
fit=aov(y~x)
fit.gh=glht(fit, linfct = mcp(x = cc))
summary(fit.gh)
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: User-defined Contrasts
## 
## 
## Fit: aov(formula = y ~ x)
## 
## Linear Hypotheses:
##        Estimate Std. Error t value Pr(>|t|)
## 1 == 0  0.03636    0.17302    0.21    0.834
## (Adjusted p values reported -- single-step method)

Note that because the F statistic has 1 numerator degree of freedom the F test is equivalent to a t test, and

(0.21)^2
## [1] 0.0441

similarly testing Lab 4 vs. Lab 5 we find

cc=c(0, 0, 0, 1, -1)
num=sum(cc*ybar)^2
s2=sse/(n-k)
denom=s2*sum(cc^2/ni)
FTS=num/denom
round(c(FTS, 1-pf(FTS, 1, n-k)), 4)
## [1] 13.9162  0.0005

and now we reject the null hypothesis.

and again with R:

fit.gh=glht(fit, linfct = mcp(x = cc))
summary(fit.gh)
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: User-defined Contrasts
## 
## 
## Fit: aov(formula = y ~ x)
## 
## Linear Hypotheses:
##        Estimate Std. Error t value Pr(>|t|)
## 1 == 0  -0.6455     0.1730   -3.73 0.000489
## (Adjusted p values reported -- single-step method)

Note that because the F statistic has 1 numerator degree of freedom the F test is equivalent to a t test, and

(-3.73)^2
## [1] 13.9129

Orthogonal Contrasts

Definition (7.4.9)

Two contrasts \(\pmb{c'_i\hat{\beta}}\) and \(\pmb{c'_j\hat{\beta}}\) are orthogonal iff \(\pmb{c'_ic_j}=0\).


Note that in the balanced case, with common group size m,

\[cov(\pmb{c'_i\hat{\beta}},\pmb{c'_j\hat{\beta}})=\sigma^2\pmb{c'_i(X'X)^{-}c_j}= \sigma^2\pmb{c'_i}\text{diag}(0,1/m,..,1/m)\pmb{c_j}=\sigma^2\pmb{c'_ic_j}/m\]

which is 0 iff \(\pmb{c'_ic_j}=0\). Under the normality assumption zero covariance implies independence, and so the estimates of two contrasts are independent iff the contrasts are orthogonal.

(Here each contrast vector \(\pmb{c}_i\) is extended with a leading 0 corresponding to \(\mu\).)

A similar argument shows that the sums of squares \((\pmb{c'_i\hat{\beta}})^2/\pmb{c'_i(X'X)^{-}c_i}\) and \((\pmb{c'_j\hat{\beta}})^2/\pmb{c'_j(X'X)^{-}c_j}\) are also independent.

Theorem (7.4.10)

In the balanced one-way model, if \(\pmb{y}\sim N_{n}(\pmb{X\beta}, \sigma^2\pmb{I})\) and if \(H_0:\alpha_1=..=\alpha_k\) is written as \(\pmb{C\beta}=\pmb{0}\), where the rows \(\pmb{c}'_i\), i=1,..,k-1, of \(\pmb{C}\) are mutually orthogonal contrasts, then

\[\text{SSH}=\pmb{(C\hat{\beta})'[C(X'X)^{-}C']^{-1}C\hat{\beta}}=\sum_{i=1}^{k-1} \frac{(c'_i\hat{\beta})^2}{c'_i(X'X)^{-}c_i}\] and the sums of squares \((c'_i\hat{\beta})^2/c'_i(X'X)^{-}c_i\) are independent.

proof

By the calculation above \(C(X'X)^{-}C'\) is a diagonal matrix with \(c'_i(X'X)^{-}c_i\) on the diagonal, which shows the second equality. The first follows from the independence of the orthogonal contrasts.
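We can verify this diagonal structure directly, say for the Helmert contrasts in a balanced design with k=5 and group size m=11 (the sizes of the flammability data); the leading 0 added to each contrast vector corresponds to \(\mu\):

```r
k <- 5; m <- 11
C <- cbind(0, t(contr.helmert(k)))   # k-1 orthogonal contrasts, 0 for mu
G <- diag(c(0, rep(1/m, k)))         # g-inverse of X'X in the balanced case
M <- C %*% G %*% t(C)
round(M, 4)                          # a diagonal matrix, as claimed
```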

Notice a consequence of this theorem

Corollary (7.4.11)

Let \(F\) be the test statistic of the overall test, then

\[F=\frac{\text{SSH}/(k-1)}{s^2}=\frac1{k-1}\sum_{i=1}^{k-1} \frac{(c'_i\hat{\beta})^2}{c'_i(X'X)^{-}c_i}=\frac1{k-1}\sum_{i=1}^{k-1} F_i\] where \(F_i\) is the test statistic for the ith contrast.

Example (7.4.12)

For the children’s sleepwear data:

There are a number of standard sets of orthogonal contrasts. One of them, the Helmert contrasts, is

C=t(contr.helmert(5))
C
##       1  2  3  4 5
## [1,] -1  1  0  0 0
## [2,] -1 -1  2  0 0
## [3,] -1 -1 -1  3 0
## [4,] -1 -1 -1 -1 4

this compares group 1 with group 2, then the average of groups 1 and 2 with group 3, and so on.

C%*%t(C)
##      [,1] [,2] [,3] [,4]
## [1,]    2    0    0    0
## [2,]    0    6    0    0
## [3,]    0    0   12    0
## [4,]    0    0    0   20

shows that these are indeed orthogonal contrasts.

Let’s apply this to our data, but let’s first order the groups by their means

x=factor(flammability$Labs, levels = levels(x)[order(ybar)], 
         ordered = TRUE)
ybar=sort(ybar)
FTS=rep(0, 4)
for(i in 1:4) {
   num=ni[1]*sum(C[i, ]*ybar)^2
   denom=s2*sum(C[i, ]^2)
   FTS[i]=num/denom
}
FTS
## [1] 3.006294 1.546857 7.538280 6.046820
pvalue=1-pf(FTS, 1, n-k)
round(cbind(FTS, pvalue), 3)
##        FTS pvalue
## [1,] 3.006  0.089
## [2,] 1.547  0.219
## [3,] 7.538  0.008
## [4,] 6.047  0.017
FTS.all=sum(FTS)/(k-1)
round(c(FTS.all, 1-pf(FTS.all, k-1, n-k)), 3)
## [1] 4.535 0.003

and we see that the overall test is the same as before, in (7.4.7)

sqrt(FTS)
## [1] 1.733867 1.243727 2.745593 2.459028
fit=aov(y~x)
fit.gh=glht(fit, linfct = mcp(x = C))
summary(fit.gh)
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: User-defined Contrasts
## 
## 
## Fit: aov(formula = y ~ x)
## 
## Linear Hypotheses:
##        Estimate Std. Error t value Pr(>|t|)
## 1 == 0   0.3000     0.1730   1.734   0.3065
## 2 == 0   0.3727     0.2997   1.244   0.6210
## 3 == 0   1.1636     0.4238   2.746   0.0328
## 4 == 0   1.3455     0.5471   2.459   0.0672
## (Adjusted p values reported -- single-step method)

Notice the last line of the printout: "Adjusted p values reported -- single-step method". Because these tests are independent one could use the Bonferroni adjustment, so that what is reported is a family-wise error rate. However

round((k-1)*pvalue,4)
## [1] 0.3564 0.8778 0.0329 0.0692

shows that the method used by R is slightly different.