Solution to Homework 9

Problem 1

We have data \(X_1,..,X_n\sim N(\mu_x,\sigma)\) and \(Y_1,..,Y_m\sim N(\mu_y,\sigma)\), \(\sigma\) unknown. (This is the so called two sample problem). Say we want to test

\[H_0:\mu_x=\mu_y\text{ vs. }H_0:\mu_x\ne\mu_y\]

The classical test is based on the test statistic

\[T=\frac{\bar{x}-\bar{y}}{s_p\sqrt{1/n+1/m}}\] where \(s_p^2=\frac{(n-1)s_x^2+(m-1)s_y^2}{n+m-2}\) is called the pooled standard deviation. Under the null hypothesis \(T\sim t(n+m-2)\) and the the test rejects the null hypothesis if \(|T|>qt_{\alpha/2,n+m-2}\).

Derive the two sample problem as a special case of ANOVA.

We have the model \(\pmb{y=X\beta+\epsilon}\) where

\[ \pmb{X} = \begin{pmatrix} \pmb{j_n} & \pmb{j_n}& 0 \\ \pmb{j_m} & 0 & \pmb{j_m} \\ \end{pmatrix}\\ \pmb{X'X}= \begin{pmatrix} n+m & n & m \\ n & n & 0 \\ m & 0 & m \end{pmatrix}\\ (\pmb{X'X})^{-}= \begin{pmatrix} 0 & 1/n & 0 \\ 0 & 0 & 1/m \\ \end{pmatrix}\\ \pmb{\hat{\beta}} = \begin{pmatrix} \mu\\\alpha_1\\\alpha_2 \end{pmatrix}= (\pmb{X'X})^{-}\pmb{X'y}=\\ \begin{pmatrix} 0 & 1/n & 0 \\ 0 & 0 & 1/m \\ \end{pmatrix} \begin{pmatrix} \pmb{j_n} & \pmb{j_m} \\ \pmb{j_n} & 0 \\ 0 & \pmb{j_m} \end{pmatrix} \begin{pmatrix} \pmb{x} \\ \pmb{y} \end{pmatrix}=\\ \begin{pmatrix} 0 & 1/n & 0 \\ 0 & 0 & 1/m \\ \end{pmatrix} \begin{pmatrix} \sum x_i+\sum y_j \\ \sum x_i\\ \sum y_j \end{pmatrix} = \begin{pmatrix} 0 \\ \bar{x}\\ \bar{y} \end{pmatrix} \] The null hypothesis of the two sample problem is equivalent to

\[H_0:\alpha_1-\alpha_2=0\text{ vs. }H_0:\alpha_1-\alpha_2\ne0\] which is a contrast with \(\pmb{c} = \begin{pmatrix} 0 &1 &-1 \end{pmatrix}'\). We can therefore use theorem (9.3.5). We find the test statistic

\[F = \frac{(\bar{x}-\bar{y})^2}{s^2(1/n+1/m)}\] Now

\[ \begin{aligned} &(n+m-2)s^2= \sum_{i,j} (y_{ij}-\bar{y}_{i.})^2=\\ &\sum_{i} (x_{i}-\bar{x})^2+\sum_{j} (y_{j}-\bar{y})^2 = \\ &(n-1)s_x^2+(m-1)s_y^2=s_p^2 \end{aligned} \] and so

\[F = \frac{(\bar{x}-\bar{y})^2}{s_p^2(1/n+1/m)}\]

and under the null hypothesis \(F\sim F(1, n+m-2)\). Clearly \(T^2=F\), and so

\[ \begin{aligned} &\alpha =P(\text{reject }H_0\vert H_0\text{ true}) = \\ &P(F>qf_{\alpha,1,n+m-2}) = \\ &1-P(F<qf(\alpha,1,n+m-2)) = \\ &1-P(T^2<qf_{\alpha,1,n+m-2}) = \\ &1-P\left(-\sqrt{qf_{\alpha,1,n+m-2}}<|T|<\sqrt{qf_{\alpha,1,n+m-2}}\right) = \\ &1-P\left(-qt_{\alpha/2,n+m-2}<|T|<qt_{\alpha/2,n+m-2}\right) = \\ &P\left(|T|>qt_{\alpha/2,n+m-2}\right) \end{aligned} \] and so the two tests are identical.

Note this derivation can also be done by considering a contrast \(\pmb{c} =\begin{pmatrix} 1 & -1 \end{pmatrix}\)

Problem 2

In an experiment an industrial engineer studied the effect of the type of coating and its thickness on the durability of a certain type of paint. There were three types of coatings labeled 1, 2 and 3, and three thickness levels labeled thin, medium and thick. The durability was measured in days. The data is

	thin	medium	thick
1	166, 154, 155, 156, 149	167, 171, 166, 165, 185	181, 185, 178, 178, 174
2	219, 241, 216, 220, 220	263, 241, 246, 245, 224	242, 258, 257, 242, 250
3	277, 276, 277, 278, 280	309, 281, 309, 302, 314	350, 348, 359, 340, 342

Analyze this data to see what effect(s) if any the type of coating and the thickness have on the durability. Which factor-level combination(s) is/are statistically significantly the best?

Clearly we have a balanced two way design. After entering the data into R and sorting the variable type by the mean durability we start by drawing the boxplots:

ggplot(data=df , aes(Type, Durability)) +
  geom_boxplot()

ggplot(data=df , aes(Thickness, Durability)) +
  geom_boxplot()

and so it seems that type has a strong and thickness a weak positive relationship with durability.

Here are the summary statistics:

sum.tbl <-
  data.frame(
    Type=1:3,
    n=as.integer(tapply(df$Durability, df$Type, length)),
    Mean=round(tapply(df$Durability, df$Type, mean), 1),
    Sd=round(tapply(df$Durability, df$Type, sd), 2)
)
kable.nice(sum.tbl, do.row.names = FALSE)

Type	n	Mean	Sd
1	15	168.7	11.57
2	15	238.9	15.51
3	15	309.5	31.11

sum.tbl <-
  data.frame(
    Thickness=c("thin", "medium", "thick"),
    n=as.integer(tapply(df$Durability, df$Thickness, length)),
    Mean=round(tapply(df$Durability, df$Thickness, mean), 1),
    Sd=round(tapply(df$Durability, df$Thickness, sd), 2)
)
kable.nice(sum.tbl, do.row.names = FALSE)

Thickness	n	Mean	Sd
thin	15	218.9	51.87
medium	15	239.2	57.06
thick	15	258.9	71.82

Next we check for interaction:

mns = tapply(df$Durability, df[, -1], mean)
df1=data.frame(Type=rep(1:3, 3),
              Means=c(mns),
              Thickness = 
              factor(rep(c("thin", "medium", "thick"), each=3), 
                     levels=c("thin", "medium", "thick"), ordered = TRUE))
ggplot(data=df1, aes(Type, Means, color=Thickness)) +
  geom_point() +  geom_line()

which indicates some interaction. Let’s check:

fit=aov(Durability~.^2 , data=df)
summary(fit)

##                Df Sum Sq Mean Sq F value   Pr(>F)
## Type            2 148685   74342  946.37  < 2e-16
## Thickness       2  12001    6000   76.38 1.11e-13
## Type:Thickness  4   3959     990   12.60 1.65e-06
## Residuals      36   2828      79

and indeed the interaction term is significant. Therefore both factors are significant.

Let’s check the assumptions. Here is the residual vs. fits plot:

df <- data.frame(Residuals=resid(fit), 
            Fits = fitted(fit))
ggplot(data=df, aes(Fits, Residuals)) +
            geom_point() +
            geom_hline(yintercept = 0)

which shows that the equal variance assumption is justified.

The normal plot of residuals:

df <- data.frame(Residuals=resid(fit), 
            Fits = fitted(fit))
ggplot(data=df, aes(sample=Residuals)) +
           geom_qq() + geom_qq_line()

give some hint of non-normal residuals but is not bad enough to be problematic.

Finally we use Tukey to do pairwise comparisons. Because of the interaction this is done for the factor-level combinations:

tuk <- TukeyHSD(fit)[[3]]

Now the interaction plot shows that the best combination is “3-thick” and the second best is “3-medium”, Tukey yields

tuk["3:thick-3:medium", ]

##         diff          lwr          upr        p adj 
## 4.480000e+01 2.631799e+01 6.328201e+01 5.913594e-08

which shows that these two are stat. significantly different. Therefore “3-thick” is stat. significantly best.