Below is a sample from a normal distribution with mean 0 and variance \(\theta\).
-3.79 -2.32 -2.3 -2.1 -2.03 -1.37 -1.33 -1.04 -0.76 -0.75 -0.68 -0.68 -0.67 -0.64 -0.57 -0.52 -0.5 -0.47 -0.45 -0.44 -0.4 -0.39 -0.38 -0.26 -0.24 -0.24 -0.23 -0.15 -0.12 -0.09 -0.01 0.01 0.04 0.05 0.08 0.16 0.17 0.27 0.5 0.66 0.71 0.77 0.78 0.87 1.07 1.31 1.9 2.52 2.69 2.94
Find the likelihood ratio test for
\[H_0:\theta=\theta_0\text{ vs }H_a:\theta>\theta_0\]
based on the \(\chi^2\) approximation, and apply it to the data with \(\theta_0=1\), \(\alpha=0.05\) by finding the p-value.
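All calculations below are done in R; first we enter the data as a vector:

x=c(-3.79, -2.32, -2.3, -2.1, -2.03, -1.37, -1.33, -1.04, -0.76, -0.75,
    -0.68, -0.68, -0.67, -0.64, -0.57, -0.52, -0.5, -0.47, -0.45, -0.44,
    -0.4, -0.39, -0.38, -0.26, -0.24, -0.24, -0.23, -0.15, -0.12, -0.09,
    -0.01, 0.01, 0.04, 0.05, 0.08, 0.16, 0.17, 0.27, 0.5, 0.66,
    0.71, 0.77, 0.78, 0.87, 1.07, 1.31, 1.9, 2.52, 2.69, 2.94)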
The likelihood function is
\[ \begin{aligned} &f(\pmb{x}\vert \theta) =(2\pi\theta)^{-n/2}\exp \left\{-\frac1{2\theta}\sum_{i=1}^nx_i^2 \right\} \\ &l(\theta\vert\pmb{x}) = -\frac{n}2\log(2\pi\theta)-\frac1{2\theta}\sum_{i=1}^nx_i^2 \\ &l'(\theta\vert\pmb{x}) = -\frac{n}{2\theta}+\frac1{2\theta^2}\sum_{i=1}^nx_i^2=0\\ &\hat{\theta} = \frac1n\sum_{i=1}^nx_i^2=\overline{x^2}\\ \end{aligned} \]
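As a quick sanity check, we can maximize the log-likelihood numerically and compare to \(\overline{x^2}\) (a sketch; the search interval (0.01, 10) is an arbitrary choice):

loglik=function(theta) -length(x)/2*log(2*pi*theta)-sum(x^2)/(2*theta) # l(theta|x)
round(c(optimize(loglik, c(0.01, 10), maximum=TRUE)$maximum, mean(x^2)), 3)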
Plugging \(\hat{\theta}\) and the null value \(\theta_0\) into the likelihood ratio, we find
\[ \begin{aligned} &(-2)\log \lambda({\pmb{x}}) = 2\left(l(\hat{\theta}\vert\pmb{x})-l(\theta_0\vert\pmb{x}) \right) =\\ &2\left(-\frac{n}2\log(2\pi\hat{\theta})-\frac1{2\hat{\theta}}\sum_{i=1}^nx_i^2+\frac{n}2\log(2\pi\theta_0)+\frac1{2\theta_0}\sum_{i=1}^nx_i^2 \right) = \\ &n\log(\theta_0/\hat{\theta})+\left[\frac1{\theta_0}-\frac1{\hat{\theta}}\right]\sum_{i=1}^nx_i^2 = \\ &n\log(\theta_0/\overline{x^2})+\left[\frac1{\theta_0}-\frac1{\overline{x^2}}\right]n\overline{x^2} = \\ &n\log(\theta_0/\overline{x^2})+\frac{n\overline{x^2}}{\theta_0}-n \end{aligned} \]
theta0=1
n=length(x)
x2bar=mean(x^2) # the MLE of theta
lrt=n*(log(theta0/x2bar))+(n*x2bar/theta0-n) # -2 log lambda
round(c(x2bar, lrt, 1-pchisq(lrt, 1)), 3) # MLE, test statistic, p-value
## [1] 1.508 4.862 0.027
The p-value is \(0.027<0.05\), so we reject the null hypothesis.
Let’s draw \(-2\log\lambda(\pmb{x})\) as a function of \(\overline{x^2}\). Because we are testing \(H_a:\theta>\theta_0\), we are only interested in values of \(\overline{x^2}\) larger than \(\theta_0\):
curve(n*log(theta0/x)+n*x/theta0-n, theta0, 2,
lwd=2, col="blue", xlab=expression(bar(x^2)), ylab="-2 log lambda")
We see that \(-2\log\lambda(\pmb{x})\) is increasing in \(\overline{x^2}\) on this range. Therefore an equivalent test has the rejection region \(\{\overline{x^2}>c\}\), which is the same as \(\{\sum_i x_i^2>c\}\).
Now
\[ \begin{aligned} &X_i\sim N(0, \sqrt{\theta_0}) \\ &X_i/\sqrt{\theta_0}\sim N(0, 1) \\ &X_i^2/\theta_0\sim \chi^2(1) \\ &\frac1{\theta_0}\sum_i X_i^2\sim \chi^2(n) \end{aligned} \]
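This can be checked with a small simulation (a sketch; B=1e4 replications is an arbitrary choice):

B=1e4
z=replicate(B, sum(rnorm(n, 0, sqrt(theta0))^2)/theta0) # sum(X_i^2)/theta0 under H0
round(c(mean(z), var(z)), 1) # a chisq(n) has mean n=50 and variance 2n=100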
Let \(Z_0=\frac1{\theta_0}\sum_i X_i^2\sim\chi^2(n)\); then the critical value \(cr\) satisfies
\[\alpha =P(Z_0>cr)=1-pchisq(cr,n)\\ cr=qchisq(1-\alpha, n)\]
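Applying this with \(\alpha=0.05\) (a quick check; alpha and cr are computed again in the power calculation below):

alpha=0.05
cr=qchisq(1-alpha, n)
round(c(theta0*cr, sum(x^2)), 1) # reject H0 if sum(x^2) > theta0*cr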
The p-value is
\[p =P\left(Z_0>\frac1{\theta_0}\sum_i x_i^2\right) =1-pchisq\left(\sum_i x_i^2/\theta_0,n\right)\]
1-pchisq(sum(x^2)/theta0, n)
## [1] 0.01162301
Now for the power of this test: say \(X_i\sim N(0,\sqrt{\theta_1})\) with \(\theta_1>\theta_0\); then \(Z_1=\frac1{\theta_1}\sum_{i=1}^n X_i^2\sim\chi^2(n)\) and the power is
\[ P\left(\frac1{\theta_0}\sum_i X_i^2>cr\right) = P\left(Z_1>\frac{\theta_0}{\theta_1}cr\right) = 1-pchisq(\theta_0 cr/\theta_1,n) \]
which we can draw as a function of \(\theta_1\) (here \(\theta_0=1\)):
alpha=0.05
cr=qchisq(1-alpha, n) # critical value
curve(1-pchisq(cr/x, n), 1, 2.1, # power of the chi-square test
lwd=2, col="blue", ylab="Power", xlab=expression(theta))
Note that \(X_1^2/\theta_0\sim \chi^2(1)\), which has mean 1 and variance 2, and so
\[E[X_1^2]=E[X_1^2/\theta_0]\theta_0=\theta_0\]
\[var(X_1^2)=var(X_1^2/\theta_0)\theta^2_0=2\theta^2_0\] Therefore, by the CLT, approximately
\[T=\sqrt{n}\frac{\overline{X^2}-\theta_0}{\sqrt2 \theta_0}\sim N(0,1)\] \[ \begin{aligned} &\alpha =P(T>cr) \\ &cr =qnorm(1-\alpha) \end{aligned} \]
TS=sqrt(n)*(x2bar-theta0)/(sqrt(2)*theta0) # test statistic
1-pnorm(TS) # p-value
## [1] 0.005535021
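How good is the normal approximation for \(T\)? A small simulation under \(H_0\) gives an idea (a sketch; 1e4 runs chosen arbitrarily):

Tsim=replicate(1e4, {
  y=rnorm(n, 0, sqrt(theta0)) # one sample under H0
  sqrt(n)*(mean(y^2)-theta0)/(sqrt(2)*theta0) # the statistic T
})
round(c(mean(Tsim), sd(Tsim)), 2) # should be close to 0 and 1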
Now say \(X_i\sim N(0, \sqrt{\theta_1})\). Then
\[ \begin{aligned} &P(T>cr) =P(\sqrt{n}\frac{\overline{X^2}-\theta_0}{\sqrt2 \theta_0}>cr)= \\ &P(\sqrt{n}\frac{\overline{X^2}-\theta_1+\theta_1-\theta_0}{\sqrt2 \theta_1}>cr\frac{\theta_0}{\theta_1}) = \\ &P(\sqrt{n}\frac{\overline{X^2}-\theta_1}{\sqrt2 \theta_1}>cr\frac{\theta_0}{\theta_1}-\sqrt{n}\frac{\theta_1-\theta_0}{\sqrt2 \theta_1}) = \\ &1-pnorm\left(cr\frac{\theta_0}{\theta_1}-\sqrt{n}\frac{\theta_1-\theta_0}{\sqrt2 \theta_1}\right) \end{aligned} \]
since, under \(\theta_1\), \(\sqrt{n}\frac{\overline{X^2}-\theta_1}{\sqrt2 \theta_1}\) is again approximately standard normal. We redraw the power of the \(\chi^2\) test and add the power curve of this test:
curve(1-pchisq(cr/x, n), 1, 2.1, # chi-square test power (blue)
lwd=2, col="blue", ylab="Power", xlab=expression(theta))
cr1=qnorm(1-alpha)
fun=function(x) 1-pnorm(cr1/x-sqrt(n)*(x-1)/sqrt(2)/x) # CLT (Wald-type) test power
curve(fun, 1, 2.1, # add in red
lwd=2, col="red", add=TRUE)
The red curve lies above the blue one, so the Wald-type test is more powerful (for this \(n\), \(\alpha\), etc.).
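A numerical spot check, at the (arbitrarily chosen) alternative \(\theta_1=1.5\):

theta1=1.5 # illustrative value, not from the problem
round(c(chi=1-pchisq(cr/theta1, n), wald=fun(theta1)), 3) # compare the two powers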