A Longer Example - Continued

X1, .., Xn iid with density f(x|a) = ax^(a-1), 0<x<1, a>0 (or simply X ~ Beta(a,1))

and we want to find a (1-α)100% confidence interval for a.

Let's again first consider the case n=1. Previously we had the following hypothesis test for H0:a=a0 vs Ha:a≠a0: reject the null hypothesis if

X < (α/2)^(1/a0) or X > (1-α/2)^(1/a0)

therefore we invert the test: solving (α/2)^(1/a) ≤ X ≤ (1-α/2)^(1/a) for a (remembering that log(X) < 0 because 0<X<1) yields the (1-α)100% confidence interval

( log(1-α/2)/log(X) , log(α/2)/log(X) )

For example, if we observe X=0.45, a 95% CI for a is (0.032, 4.620).
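In R this is a one-liner (ci1 is a made-up name, not a routine from these notes):

# equal-tail interval for a based on a single X ~ Beta(a,1)
ci1 <- function(x, alpha=0.05) c(log(1-alpha/2), log(alpha/2))/log(x)
round(ci1(0.45), 3)
# 0.032 4.620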

This is illustrated here:

Here we split α 50-50 between the left and the right. Is this optimal? Let's put λα on the left and (1-λ)α on the right for some 0≤λ≤1. Then the interval becomes

( log(1-(1-λ)α)/log(X) , log(λα)/log(X) )

and the length turns out to be smallest for λ=1, that is, with all of α on the left. Now if we observe X=0.45 a 95% CI for a is (0, 3.75).
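Again as a quick R sketch (lam.ci is a made-up name):

# interval with lambda*alpha on the left and (1-lambda)*alpha on the right
lam.ci <- function(x, alpha=0.05, lam=0.5)
   c(log(1-(1-lam)*alpha), log(lam*alpha))/log(x)
round(lam.ci(0.45, lam=1), 2)
# 0.00 3.75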

Note that these intervals always start at 0, which might not be a good idea for some experiments.

How about the case n=2?

We have qbeta2(α/2, a) = x, which is equivalent to pbeta2(x, a) = α/2, and we need to solve this for a. Again this needs to be done numerically:

invbeta2 <-
function (x, alpha=0.05)
{
# invert the equal-tail test based on the sum of the observations:
# the lower limit solves pbeta2(x,a)=1-alpha/2, the upper limit
# solves pbeta2(x,a)=alpha/2 (pbeta2 is decreasing in a)
   x=sum(x)
# cdf found by numerical integration of the density dbeta2 defined earlier
   pbeta2=function(x,a) {integrate(dbeta2,0,x,a=a)$value}
# step up in integer steps until pbeta2 drops below 1-alpha/2 ...
   low=0
   repeat {
      low=low+1
      flow=pbeta2(x,low)
      print(c(low,flow))   # print the search path
      if(flow<1-alpha/2) break
   }
# ... then find the lower limit by bisection
   high=low
   low=low-1
   repeat {
      mid=(low+high)/2
      fmid=pbeta2(x,mid)
      print(c(mid,fmid))
      if(abs(fmid-(1-alpha/2))<0.001) break
      if(fmid>(1-alpha/2)) low=mid
      else high=mid
   }
   L=mid
# same again for alpha/2 to find the upper limit
   low=0
   repeat {
      low=low+1
      flow=pbeta2(x,low)
      print(c(low,flow))
      if(flow<alpha/2) break
   }
   high=low
   low=low-1
   repeat {
      mid=(low+high)/2
      fmid=pbeta2(x,mid)
      print(c(mid,fmid))
      if(abs(fmid-alpha/2)<0.001) break
      if(fmid>alpha/2) low=mid
      else high=mid
   }
   H=mid
   c(L,H)/2   # the confidence interval for a
}

For example, say we observe (x,y) = (0.31, 0.59); note that (x+y)/2 = 0.45, the same value as in the n=1 case above. Then we find a 95% CI for a to be (0.039, 1.53).

Previously we also used simulation to estimate pbeta2(x,a). If we could not calculate it numerically we could use simulation here as well, but clearly the resulting routines would take rather a long time to run.
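For example, a simulation version might look like this (pbeta2.sim is a made-up name; it uses the fact that U^(1/a) ~ Beta(a,1) when U is uniform on [0,1]):

# estimate pbeta2(x,a) = P(X1+X2 <= x) by simulation
pbeta2.sim <- function(x, a, B=10000) {
   t <- runif(B)^(1/a) + runif(B)^(1/a)   # B simulated values of X1+X2
   mean(t <= x)                           # proportion <= x estimates the cdf
}

Every call needs B random pairs, and invbeta2 evaluates the cdf many times during the bisection, which is why such routines get slow.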

And then n large: if X ~ Beta(a,1) we have E[X] = a/(a+1) and Var(X) = a/((a+1)²(a+2)), so by the central limit theorem

Z = √n (X̅ - a0/(a0+1)) / √(a0/((a0+1)²(a0+2))) ≈ N(0,1)

under H0: a=a0, so the acceptance region of the test is

zα/2 < √n (X̅ - a0/(a0+1)) / √(a0/((a0+1)²(a0+2))) < z1-α/2
To get a confidence interval we need to "invert the test". This means solving the double inequality in the acceptance region for a (which now replaces a0). The next graph shows the left and the right side of the double inequality as a function of a (blue=α/2, red=1-α/2, n=50, X̅=0.4):

and we get an interval (0.48, 1.45)

Formally we need to solve the equation

√n (X̅ - a/(a+1)) / √(a/((a+1)²(a+2))) = z

for z = zα/2 and z = z1-α/2. Let x = X̅; then squaring both sides and multiplying through by (a+1)²(a+2) gives

n(a+2)((x-1)a + x)² = z²a

so now we have a cubic equation, which we can solve. In R this is done by the routine cuberoots. Of course there are generally three roots, but one of them is negative and the other two are our solutions. Note that zα/2² = qnorm(1-α/2)² = qnorm(α/2)² = z1-α/2², so we only need to solve one equation.

beta3.ex(1) does the calculations.
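Here is a sketch of this calculation using base R's polyroot in place of cuberoots (mm.ci is a made-up name, not the actual beta3.ex(1) code):

# solve n*(a+2)*((x-1)*a+x)^2 = z^2*a, written out as
# n(x-1)²a³ + 2n(x-1)(2x-1)a² + (nx(5x-4)-z²)a + 2nx² = 0
mm.ci <- function(xbar, n, alpha=0.05) {
   z2 <- qnorm(1-alpha/2)^2
   x <- xbar
   co <- c(2*n*x^2, n*x*(5*x-4)-z2, 2*n*(x-1)*(2*x-1), n*(x-1)^2)  # constant term first
   r <- polyroot(co)
   r <- Re(r[abs(Im(r)) < 1e-6])   # keep the real roots
   sort(r[r > 0])                  # drop the negative root; the other two are (L, U)
}
mm.ci(0.4, 50)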

How about the mle? This one is easy: the log-likelihood is n log(a) + (a-1)Σlog(xi), so the mle is â = n/T with T = -Σlog(Xi), and the Fisher information is I(a) = 1/a². Therefore â is approximately N(a, a²/n) for large n, which yields the interval â ± z1-α/2 â/√n.

beta3.ex(2) does the work.
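A sketch of the corresponding R calculation (mle.ci is a made-up name, standing in for beta3.ex(2)):

# Wald interval from the mle: ahat = n/T, T = -sum(log(Xi)), ahat ~ N(a, a²/n)
mle.ci <- function(x, alpha=0.05) {
   n <- length(x)
   ahat <- n/(-sum(log(x)))
   ahat + c(-1, 1)*qnorm(1-alpha/2)*ahat/sqrt(n)
}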

Bayesian Interval
Previously we saw that if we use a prior Exp(1) we get a posterior distribution a|x ~ Γ(n+1, 1/(T+1)). Using this we can find the equal-tail probability interval by solving P(a < L | x) = α/2 and P(a > U | x) = α/2, that is

L = qgamma(α/2, n+1, T+1) and U = qgamma(1-α/2, n+1, T+1)
which is (almost) the same as the confidence interval based on the LRT test.
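As a quick sketch (with made-up values of n and T):

n <- 10; T <- 12
qgamma(c(0.025, 0.975), n+1, T+1)   # Bayesian interval: Gamma(n+1, rate T+1)
qgamma(c(0.025, 0.975), n, 1)/T     # LRT interval, using aT ~ Gamma(n,1)

For n at all large the shapes n and n+1 and the rates T and T+1 are close, which is why the two intervals nearly agree.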

Which Interval is Better? First of all, frequentist confidence intervals and Bayesian credible intervals cannot really be compared directly, so we will just compare the two confidence intervals.

First we need to check that the two methods yield proper confidence intervals, that is, that they have coverage: if we calculate a 90% interval, it really does contain the true parameter 90% of the time. This is true for the LRT interval by construction, because we could find the exact distribution of the LRT statistic and invert the test analytically. The other test is a large-sample test and needs to be checked via simulation. Here are some results (red=MM, blue=LRT):

so we see that the MM method has some over-coverage for small n. Over-coverage is acceptable but not really desirable, because it generally means larger intervals.
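A sketch of such a simulation for the LRT interval (check.coverage is a made-up name; the MM interval is checked the same way, with its limits computed from the cubic instead):

# estimate the true coverage of the nominal (1-alpha) LRT interval
check.coverage <- function(n, a=1, alpha=0.1, B=10000) {
   hits <- 0
   for (i in 1:B) {
      T <- -sum(log(rbeta(n, a, 1)))   # aT ~ Gamma(n,1)
      if (qgamma(alpha/2, n, 1)/T < a & a < qgamma(1-alpha/2, n, 1)/T)
         hits <- hits+1
   }
   hits/B   # should be close to 1-alpha
}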

How can we choose between the two? We need a measure of performance for a confidence interval. Again we can consider the mean length of the interval, E[U(X)-L(X)]. In the case of the MM method this will need to be found via simulation. As for the LRT method, we have

E[U(T)-L(T)] = (qgamma(1-α/2, n, 1) - qgamma(α/2, n, 1)) E[1/T] = (qgamma(1-α/2, n, 1) - qgamma(α/2, n, 1)) a/(n-1)

because T ~ Γ(n, 1/a) and therefore E[1/T] = a/(n-1) (for n>1).
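In R, a quick check of this formula:

mean.len <- function(n, a=1, alpha=0.05)
   (qgamma(1-alpha/2, n, 1) - qgamma(alpha/2, n, 1))*a/(n-1)
mean.len(2)
# 5.33, the value quoted at the end of this section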
Here is a graph of the mean lengths:

Above we used the fact that aT has a distribution that does not depend on the parameter a. We then found the interval by dividing the error probability α equally between the left and the right. But is that the best option? If we are interested in shortest-length intervals, can we find the intervals that are optimal? That is, can we find L(T) and U(T) such that

minimize E[U(T)-L(T)] subject to P(L(T)<a<U(T))=1-α

Let's set q(z) = qgamma(z, n, 1) and let's consider intervals of the form L(T) = q(z1)/T and U(T) = q(1-z2)/T. Clearly we need z1+z2 = α, so we have 1-z2 = 1-α+z1, or we just write

L(T)=q(z)/T and U(T)=q(1-α+z)/T

This has length

U(T) - L(T) = (q(1-α+z) - q(z))/T

so we need the z in [0, α] that minimizes q(1-α+z) - q(z). This cannot be done analytically, and so again we need to resort to a numerical solution. In beta3.ex(3) I use a simple "grid-search" algorithm. It turns out that assigning a smaller part of α to the left side leads to shorter intervals.
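A sketch of such a grid search (opt.z is a made-up name, not the actual beta3.ex(3) code):

# find the z in [0, alpha] that minimizes the length (q(1-alpha+z)-q(z))/T
opt.z <- function(n, alpha=0.05, m=1000) {
   z <- seq(0, alpha, length=m)
   len <- qgamma(1-alpha+z, n, 1) - qgamma(z, n, 1)
   c(z[which.min(len)], min(len))
}
opt.z(2)
# the optimal z is well below alpha/2 = 0.025; for a=1, n=2 (where E[1/T]=1)
# the minimal mean length is about 4.72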

How much better are these intervals? Here are the mean lengths (for a=1)

For example, if n=2 (and a=1) the mean length of the equal-tail intervals is 5.33, but for the "optimal" length intervals it is 4.72, an improvement of about 10%.