Equal Variance

Example (7.3.1)

Say we have samples \(X_1,\ldots, X_n\) iid \(N(\mu_x,\sigma_x)\) and \(Y_1,\ldots, Y_m\) iid \(N(\mu_y,\sigma_y)\), independent of each other, and we want to test

\[H_0: \sigma_x=\sigma_y\text{ vs }H_1: \sigma_x \ne \sigma_y\]

We will assume \(\mu_x\) and \(\mu_y\) are known. We can then subtract them from the observations, so without loss of generality \(\mu_x=\mu_y=0\).

Let’s derive the likelihood ratio test. We will do this in terms of the variances \(v_x = \sigma_x^2\) and \(v_y = \sigma_y^2\), which of course leads to the same test.

First we find the joint density, using (2.3.14)

\[ \begin{aligned} &f(\pmb{x},\pmb{y}\vert v_x,v_y) = \\ &(2\pi v_x)^{-n/2}\exp \left\{-\frac1{2v_x}\sum x_i^2\right\}(2\pi v_y)^{-m/2}\exp \left\{-\frac1{2v_y}\sum y_i^2\right\} =\\ &(2\pi v_x)^{-n/2}\exp \left\{-\frac{nt_x}{2v_x}\right\}(2\pi v_y)^{-m/2}\exp \left\{-\frac{mt_y}{2v_y}\right\} \end{aligned} \]

where we define \(t_x=\frac1n \sum x_i^2\) and \(t_y=\frac1m \sum y_i^2\).

Now the log-likelihood and its derivative with respect to \(v_x\) are

\[ \begin{aligned} &l(v_x,v_y) =-\frac{n}2\log(2\pi v_x)-\frac{nt_x}{2v_x}-\frac{m}2\log(2\pi v_y)-\frac{mt_y}{2v_y} \\ &\frac{dl}{dv_x} = -\frac{n}{2v_x}+\frac{nt_x}{2v_x^2}=0 \end{aligned} \]

which yields \(\hat{v}_x=t_x\), and similarly \(\hat{v}_y=t_y\).

Under \(H_0\) we have \(v_x=v_y=:v\), and so

\[ \begin{aligned} &L(v,v)=(2\pi v)^{-n/2}\exp \left\{-\frac{nt_x}{2v}\right\}(2\pi v)^{-m/2}\exp \left\{-\frac{mt_y}{2v}\right\}=\\ &(2\pi v)^{-(n+m)/2}\exp \left\{-\frac{nt_x+mt_y}{2v}\right\} \end{aligned} \]

\[l(v,v) =-\frac{n+m}2\log(2\pi v)-\frac{nt_x+mt_y}{2v}\]

Setting \(\frac{dl}{dv}=-\frac{n+m}{2v}+\frac{nt_x+mt_y}{2v^2}=0\) yields \(\hat{v}=\frac{nt_x+mt_y}{n+m}\).
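As a sanity check on this estimate, the following sketch compares the closed form with a numerical maximization of \(l(v,v)\). The seed, the sample sizes, the common standard deviation 2 and the search interval are arbitrary choices made for illustration; they are not part of the example above.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 10; m <- 14                 # illustrative sample sizes (assumption)
x <- rnorm(n, 0, 2)              # both samples share the same sd, so H0 holds
y <- rnorm(m, 0, 2)
tx <- mean(x^2); ty <- mean(y^2)

# log-likelihood under H0 as a function of the common variance v
loglik0 <- function(v)
  -(n + m)/2 * log(2*pi*v) - (n*tx + m*ty)/(2*v)

# closed-form maximizer derived in the text
v_closed <- (n*tx + m*ty)/(n + m)
# numerical maximizer over an (assumed) search interval
v_numeric <- optimize(loglik0, interval = c(0.01, 100), maximum = TRUE)$maximum

c(closed = v_closed, numeric = v_numeric)
```

The two numbers should agree up to the tolerance of `optimize`.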

The likelihood ratio statistic is therefore

\[ \begin{aligned} &\lambda(\pmb{x},\pmb{y}) = \frac{L(\hat{v},\hat{v})}{L(\hat{v}_x,\hat{v}_y)}=\\ &\frac{(2\pi \hat{v})^{-(n+m)/2}\exp \left\{-\frac{nt_x+mt_y}{2\hat{v}}\right\}}{(2\pi \hat{v}_x)^{-n/2}\exp \left\{-\frac{nt_x}{2\hat{v}_x}\right\}(2\pi \hat{v}_y)^{-m/2}\exp \left\{-\frac{mt_y}{2\hat{v}_y}\right\}} = \\ &\frac{(2\pi \frac{nt_x+mt_y}{n+m})^{-(n+m)/2}\exp \left\{-\frac{nt_x+mt_y}{2\frac{nt_x+mt_y}{n+m}}\right\}}{(2\pi t_x)^{-n/2}\exp \left\{-\frac{nt_x}{2t_x}\right\}(2\pi t_y)^{-m/2}\exp \left\{-\frac{mt_y}{2t_y}\right\}} = \\ &\frac{(\frac{nt_x+mt_y}{n+m})^{-(n+m)/2}\exp \left\{-\frac{n+m}{2}\right\}}{(t_x)^{-n/2}\exp \left\{-\frac{n}{2}\right\}(t_y)^{-m/2}\exp \left\{-\frac{m}{2}\right\}} = \\ &\left(n+m\right)^{(n+m)/2}\frac{(nt_x+mt_y)^{-(n+m)/2}}{(t_x)^{-n/2}(t_y)^{-m/2}} = \\ &\left(n+m\right)^{(n+m)/2}\left(\frac{nt_x+mt_y}{t_x}\right)^{-n/2}\left(\frac{nt_x+mt_y}{t_y}\right)^{-m/2} =\\ &\left(n+m\right)^{(n+m)/2}\left(n+m\frac{t_y}{t_x}\right)^{-n/2}\left(n\frac{t_x}{t_y}+m\right)^{-m/2} =\\ &\left(n+m\right)^{(n+m)/2}\left(n+n\frac{mt_y}{nt_x}\right)^{-n/2}\left(m\frac{nt_x}{mt_y}+m\right)^{-m/2} =\\ &\frac{\left(n+m\right)^{(n+m)/2}}{n^{n/2}m^{m/2}}\left(1+1/F\right)^{-n/2}\left(1+F\right)^{-m/2} \end{aligned} \]

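where \(F=(nt_x)/(mt_y)\).

As a function of \(F\), \(\lambda\) increases up to \(F=n/m\) (that is, \(t_x=t_y\)), where it equals 1, and decreases afterwards. So \(\lambda\) being small is equivalent to \(F\) being either small or large, as we can see here: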

n <- 10; m <- 15
# likelihood ratio statistic as a function of x = F
fun <- function(x)
  (n+m)^((n+m)/2)/n^(n/2)/m^(m/2)*
   (1+1/x)^(-n/2)*(1+x)^(-m/2)
# draw the curve for x between A and B
ggcurve(fun=fun, A=0.1, B=3)
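The closed form of \(\lambda\) in terms of \(F\) can also be checked against a direct evaluation of the likelihood ratio. This is a minimal sketch: the seed, sample sizes and standard deviations are arbitrary, and `loglik` is a helper name introduced here for illustration only.

```r
set.seed(2)                      # arbitrary seed, for reproducibility only
n <- 10; m <- 14                 # illustrative sample sizes (assumption)
x <- rnorm(n, 0, 1)
y <- rnorm(m, 0, 2)
tx <- mean(x^2); ty <- mean(y^2)

# log-likelihood with known means 0, written in terms of the variances
loglik <- function(vx, vy)
  sum(dnorm(x, 0, sqrt(vx), log = TRUE)) +
  sum(dnorm(y, 0, sqrt(vy), log = TRUE))

# direct computation: restricted maximum over unrestricted maximum
v0 <- (n*tx + m*ty)/(n + m)
lambda_direct <- exp(loglik(v0, v0) - loglik(tx, ty))

# closed form from the derivation, with F = (n t_x)/(m t_y)
Fstat <- (n*tx)/(m*ty)
lambda_closed <- (n + m)^((n + m)/2)/n^(n/2)/m^(m/2) *
  (1 + 1/Fstat)^(-n/2) * (1 + Fstat)^(-m/2)

c(direct = lambda_direct, closed = lambda_closed)
```

Both values should coincide up to rounding error, and neither can exceed 1.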

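To find the null distribution, note that under \(H_0\) we have \(\sigma_x=\sigma_y=\sigma\) and \(v=\sigma^2\), so

\[ \begin{aligned} &X_i\sim N(0,\sigma) \\ &X_i/\sigma \sim N(0,1) \\ &X_i^2/v\sim \chi^2(1) \\ &nt_x/v =\sum_{i=1}^n X_i^2/v \sim \chi^2(n) \\ &mt_y/v =\sum_{j=1}^m Y_j^2/v \sim \chi^2(m) \\ &\frac{m}{n}F=\frac{(nt_x/v)/n}{(mt_y/v)/m}=\frac{t_x}{t_y}\sim F(n,m) \end{aligned} \]

because an \(F(n,m)\) random variable is the ratio of two independent chi-square random variables, each divided by its degrees of freedom. Since \(F\) is small or large exactly when \(t_x/t_y\) is, we reject the null if \(t_x/t_y<qf(\alpha/2,n,m)\) or \(t_x/t_y>qf(1-\alpha/2,n,m)\).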
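The test can be wrapped in a small function. This is only a sketch: `var_test_known_mean` is a hypothetical helper name, not a function from R or from the text, and it works with \(t_x/t_y=(m/n)F\), the ratio of the two chi-square variables each divided by its degrees of freedom, because that is the quantity with an \(F(n,m)\) distribution under \(H_0\). The equal-tailed p-value is an addition that matches the equal-tailed critical values.

```r
# Hypothetical helper (not from the text): two-sided test of equal variances
# for two independent normal samples whose means are both known to be 0.
var_test_known_mean <- function(x, y, alpha = 0.05) {
  n <- length(x); m <- length(y)
  ratio <- mean(x^2)/mean(y^2)               # t_x/t_y ~ F(n, m) under H0
  crit <- qf(c(alpha/2, 1 - alpha/2), n, m)  # equal-tailed critical values
  p <- 2*min(pf(ratio, n, m), 1 - pf(ratio, n, m))
  list(ratio = ratio, critical = crit, p.value = p,
       reject = ratio < crit[1] || ratio > crit[2])
}

# small hand-checkable input: t_x = 6/3 = 2 and t_y = 12/4 = 3
res <- var_test_known_mean(c(1, -1, 2), c(1, 1, -1, 3))
res$ratio
```

If the means were known but not zero, one would subtract them from `x` and `y` before calling the function.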

n <- 10; m <- 14
x <- rnorm(n, 0, 1)
y <- rnorm(m, 0, 1)
tx <- mean(x^2)
ty <- mean(y^2)
(n*tx)/(m*ty)
## [1] 1.131456
qf(c(0.025, 0.975), n, m)
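## [1] 0.2816576 3.1468612

Under \(H_0\) it is \(t_x/t_y=(m/n)F\) that has an \(F(n,m)\) distribution. Here \(t_x/t_y \approx 1.4\times 1.131 = 1.584\), which lies between the two critical values, so we fail to reject the null hypothesis of equal variances.

In a second example the two standard deviations differ:
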
n <- 10; m <- 14
x <- rnorm(n, 0, 1)
y <- rnorm(m, 0, 3)
tx <- mean(x^2)
ty <- mean(y^2)
(n*tx)/(m*ty)
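## [1] 0.2153921

Now \(t_x/t_y=(m/n)F \approx 1.4\times 0.2154 = 0.302\). Because \((m/n)F\), not \(F\) itself, has the \(F(n,m)\) distribution under \(H_0\), this is the value to compare with the critical values 0.282 and 3.147 found above. It is just above the lower one, so this particular sample narrowly fails to reject \(H_0\) even though \(\sigma_y=3\).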
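A short simulation can confirm which statistic matches the \(F(n,m)\) critical values. It is a sketch under assumed settings (the seed, `B = 10000` runs and \(\alpha=0.05\) are arbitrary choices): data are generated under \(H_0\) and we record how often each version of the statistic falls outside the critical values.

```r
set.seed(4)                      # arbitrary seed, for reproducibility only
n <- 10; m <- 14; alpha <- 0.05
B <- 10000                       # number of simulation runs (assumption)
crit <- qf(c(alpha/2, 1 - alpha/2), n, m)

# t_x/t_y for B pairs of samples generated with equal variances
ratio <- replicate(B, mean(rnorm(n)^2)/mean(rnorm(m)^2))

# rejection rate when t_x/t_y is compared with the F(n, m) critical values
rate_scaled <- mean(ratio < crit[1] | ratio > crit[2])
# rejection rate when F = (n t_x)/(m t_y) is compared with the same values
Fstat <- n/m*ratio
rate_unscaled <- mean(Fstat < crit[1] | Fstat > crit[2])

c(scaled = rate_scaled, unscaled = rate_unscaled)
```

The first rate should be close to \(\alpha\); the second need not be unless \(n=m\).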