Prediction Intervals

Frequentist Solution

There is a different kind of interval estimation problem, where the interest is not in a parameter but in a future observation. These types of intervals are called prediction intervals.

Example (6.3.1)

In some experiment we collect each day an observation from a normal distribution with mean $\mu$ and known standard deviation $\sigma$. Say so far we have data $x_1,..,x_n$. Tomorrow we will carry out this experiment again and collect observation $Y$. Find a $(1-\alpha)100\%$ prediction interval for $Y$, that is, find $L$ and $U$ such that

$$P(L(\mathbf{x})<Y<U(\mathbf{x}))=1-\alpha$$

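Let’s consider the random variable $Z=\bar{X}-Y$. We know that $Z$ has a normal distribution with mean 0 and

$$
\begin{aligned}
\text{var}(Z) &= \text{var}(\bar{X}-Y) = \text{var}(\bar{X})+\text{var}(Y) \\
&= \text{var}\left(\frac1n\sum_{i=1}^n X_i\right)+\text{var}(Y) \\
&= \frac1{n^2}\sum_{i=1}^n \text{var}(X_i)+\text{var}(Y) \\
&= \frac1{n^2}n\sigma^2+\sigma^2 = (1+1/n)\sigma^2
\end{aligned}
$$

and so $Z$ has standard deviation $\sigma\sqrt{1+1/n}$. Therefore

$$
\begin{aligned}
1-\alpha &= P\left(\left|Z/(\sigma\sqrt{1+1/n})\right|<z_{\alpha/2}\right) \\
&= P\left(|Z|<z_{\alpha/2}\sigma\sqrt{1+1/n}\right) \\
&= P\left(-z_{\alpha/2}\sigma\sqrt{1+1/n}<\bar{X}-Y<z_{\alpha/2}\sigma\sqrt{1+1/n}\right) \\
&= P\left(\bar{X}-z_{\alpha/2}\sigma\sqrt{1+1/n}<Y<\bar{X}+z_{\alpha/2}\sigma\sqrt{1+1/n}\right)
\end{aligned}
$$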

and so a $(1-\alpha)100\%$ prediction interval for $Y$ is given by

$$\left(\bar{x}-z_{\alpha/2}\sigma\sqrt{1+1/n},\ \bar{x}+z_{\alpha/2}\sigma\sqrt{1+1/n}\right)$$

As a numerical example consider:

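mu=10;sigma=1;n=20;alpha=0.05
x=rnorm(n, mu, sigma)
xbar=mean(x)
# half-width is z*sigma*sqrt(1+1/n): multiply by sqrt(1+1/n), do not divide
# no seed is set, so the interval changes from run to run
round(xbar+c(-1,1)*qnorm(1-alpha/2)*sigma*sqrt(1+1/n), 2)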
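The claim that this interval contains the future observation with probability $1-\alpha$ can be checked by simulation. The following is a minimal sketch; the seed, the number of simulated experiments `B` and the parameter values are illustrative assumptions, not part of the original example.

```r
# Sketch: empirical coverage of the z-based prediction interval (sigma known).
# mu, sigma, n, alpha, B and the seed are illustrative assumptions.
set.seed(111)
mu <- 10; sigma <- 1; n <- 20; alpha <- 0.05
B <- 10000                                    # number of simulated experiments
# each row of the matrix is one sample of size n; take its mean
xbar <- rowMeans(matrix(rnorm(n * B, mu, sigma), nrow = B))
y <- rnorm(B, mu, sigma)                      # the future observation in each experiment
half <- qnorm(1 - alpha/2) * sigma * sqrt(1 + 1/n)  # half-width of the interval
coverage <- mean(abs(y - xbar) < half)        # fraction of intervals that contain y
print(coverage)                               # should be close to 1 - alpha
```

The proportion should agree with $1-\alpha$ up to simulation error, which is roughly $\sqrt{\alpha(1-\alpha)/B}$.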

Notice that this interval does not shrink to a point as n goes to infinity. This makes sense because Y is random and we can never expect to be able to predict it perfectly.
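To make this concrete, the sketch below tabulates the half-width of the prediction interval for growing n and, for contrast, the half-width of the usual z-interval for μ. The sample sizes and parameter values are illustrative assumptions.

```r
# Sketch: half-width of the prediction interval as n grows (sigma known).
# sigma, alpha and the sample sizes are illustrative assumptions.
sigma <- 1; alpha <- 0.05
ns <- c(5, 20, 100, 10000)
half_pred <- qnorm(1 - alpha/2) * sigma * sqrt(1 + 1/ns)  # prediction interval for Y
half_conf <- qnorm(1 - alpha/2) * sigma / sqrt(ns)        # confidence interval for mu
limit <- qnorm(1 - alpha/2) * sigma                       # limit of half_pred as n grows
print(round(cbind(n = ns, half_pred, half_conf), 3))
print(round(limit, 3))
```

The half-width of the prediction interval decreases toward $z_{\alpha/2}\sigma$, while the half-width of the confidence interval for μ goes to 0.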

Notice that the derivation of the variance of $Z$ did not depend on the observations having a normal distribution. It holds whenever $\bar{X}$ is used as an estimator and the observations are independent with common variance $\sigma^2$.
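As an illustration, the variance formula can be checked by simulation for data that are clearly not normal. The sketch below uses an exponential distribution with rate 1, so that $\sigma^2=1$; this choice, as well as n, `B` and the seed, is an assumption made for the illustration only.

```r
# Sketch: var(Xbar - Y) = (1 + 1/n) * sigma^2 for non-normal data.
# The exponential distribution, n, B and the seed are illustrative assumptions.
set.seed(222)
n <- 20; B <- 50000
rate <- 1                                   # Exp(1) has variance sigma^2 = 1
x <- matrix(rexp(n * B, rate), nrow = B)    # each row is one sample of size n
z <- rowMeans(x) - rexp(B, rate)            # Z = Xbar - Y for each simulated experiment
var_sim <- var(z)                           # simulated variance of Z
var_theory <- (1 + 1/n) / rate^2            # (1 + 1/n) * sigma^2
print(round(c(simulated = var_sim, theory = var_theory), 3))
```

Only the variance carries over. Here $Z$ is no longer normally distributed, so an interval built from the normal quantile $z_{\alpha/2}$ would have only approximately the nominal coverage.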

Bayesian Solution

In a Bayesian setup one needs to find the posterior predictive distribution, which is defined as

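$$f(y|\mathbf{x})=\int p(y|\theta,\mathbf{x})\,p(\theta|\mathbf{x})\,d\theta$$

Often one can assume that, given $\theta$, the future observation is independent of the sample, and so this simplifies to

$$f(y|\mathbf{x})=\int p(y|\theta)\,p(\theta|\mathbf{x})\,d\theta$$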
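This integral also suggests a way to sample from the posterior predictive distribution: draw θ from the posterior, then draw y from p(y|θ). The sketch below does this for an assumed setting in which y given μ is N(μ, σ) and the posterior of μ is N(x̄, σ/√n); the values of `sigma`, `n` and `xbar` are hypothetical and chosen only for illustration.

```r
# Sketch: sampling from a posterior predictive distribution by composition.
# Assumed setting: y | mu ~ N(mu, sigma), posterior mu | x ~ N(xbar, sigma/sqrt(n)).
# sigma, n, xbar, B and the seed are illustrative assumptions.
set.seed(333)
sigma <- 1; n <- 20; xbar <- 10               # hypothetical summary of the observed data
B <- 100000
mu_draws <- rnorm(B, xbar, sigma / sqrt(n))   # step 1: draw theta from p(theta | x)
y_draws <- rnorm(B, mu_draws, sigma)          # step 2: draw y from p(y | theta)
pred_int <- quantile(y_draws, c(0.025, 0.975))  # 95% equal-tail predictive interval
pred_sd <- sd(y_draws)                        # standard deviation of the predictive draws
print(round(pred_int, 2))
print(round(pred_sd, 3))
```

The draws in `y_draws` are a sample from f(y|x), so their quantiles give a predictive interval even when the integral has no closed form.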

Example (6.3.2)

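Say $X_1,..,X_n\sim N(\mu,\sigma)$, $\sigma$ known, and $\pi(\mu)=1$. Then we know from (3.2.6) that $\mu|\mathbf{X}=\mathbf{x}\sim N(\bar{x},\sigma/\sqrt{n})$ and so

$$
\begin{aligned}
f(y|\mathbf{x}) &= \int_{-\infty}^{\infty} p(y|\mu)\,p(\mu|\mathbf{x})\,d\mu \\
&= \int_{-\infty}^{\infty} \frac1{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac1{2\sigma^2}(y-\mu)^2\right\}\frac1{\sqrt{2\pi\sigma^2/n}}\exp\left\{-\frac1{2\sigma^2/n}(\mu-\bar{x})^2\right\}d\mu \\
&= \int_{-\infty}^{\infty} \frac1{2\pi\sigma^2/\sqrt{n}}\exp\left\{-\frac1{2\sigma^2}\left[(y-\mu)^2+n(\mu-\bar{x})^2\right]\right\}d\mu
\end{aligned}
$$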

Now in the brackets in the exponential we have

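$$
\begin{aligned}
(\mu-y)^2+n(\mu-\bar{x})^2 &= \mu^2-2y\mu+y^2+n\mu^2-2n\bar{x}\mu+n\bar{x}^2 \\
&= (n+1)\mu^2-2(y+n\bar{x})\mu+(y^2+n\bar{x}^2) \\
&= (n+1)\left(\mu^2-2\frac{y+n\bar{x}}{n+1}\mu\right)+(y^2+n\bar{x}^2) \\
&= (n+1)\left(\mu^2-2\frac{y+n\bar{x}}{n+1}\mu+\left(\frac{y+n\bar{x}}{n+1}\right)^2-\left(\frac{y+n\bar{x}}{n+1}\right)^2\right)+(y^2+n\bar{x}^2) \\
&= (n+1)\left(\mu^2-2\frac{y+n\bar{x}}{n+1}\mu+\left(\frac{y+n\bar{x}}{n+1}\right)^2\right)-(n+1)\left(\frac{y+n\bar{x}}{n+1}\right)^2+(y^2+n\bar{x}^2) \\
&= (n+1)\left(\mu-\frac{y+n\bar{x}}{n+1}\right)^2-\frac{(y+n\bar{x})^2}{n+1}+(y^2+n\bar{x}^2) \\
&= (n+1)\left(\mu-\frac{y+n\bar{x}}{n+1}\right)^2+\frac{n(y-\bar{x})^2}{n+1}
\end{aligned}
$$

and so

$$
\begin{aligned}
f(y|\mathbf{x}) &= \int_{-\infty}^{\infty} \frac1{2\pi\sigma^2/\sqrt{n}}\exp\left\{-\frac1{2\sigma^2}\left[(n+1)\left(\mu-\frac{y+n\bar{x}}{n+1}\right)^2+\frac{n(y-\bar{x})^2}{n+1}\right]\right\}d\mu \\
&= \frac1{2\pi\sigma^2/\sqrt{n}}\sqrt{\frac{2\pi\sigma^2}{n+1}}\exp\left\{-\frac1{2\sigma^2}\left[\frac{n(y-\bar{x})^2}{n+1}\right]\right\}\times\int_{-\infty}^{\infty}\sqrt{\frac{n+1}{2\pi\sigma^2}}\exp\left\{-\frac{n+1}{2\sigma^2}\left(\mu-\frac{y+n\bar{x}}{n+1}\right)^2\right\}d\mu \\
&= \frac1{2\pi\sigma^2/\sqrt{n}}\sqrt{\frac{2\pi\sigma^2}{n+1}}\exp\left\{-\frac1{2\sigma^2}\left[\frac{n(y-\bar{x})^2}{n+1}\right]\right\} \\
&= \frac1{\sqrt{2\pi\sigma^2(1+1/n)}}\exp\left\{-\frac{(y-\bar{x})^2}{2\sigma^2(1+1/n)}\right\}
\end{aligned}
$$

where the remaining integral is 1 because the integrand is a normal density in $\mu$. So $Y|\mathbf{X}=\mathbf{x}\sim N(\bar{x},\sigma\sqrt{1+1/n})$, and equal-tail credible intervals based on the posterior predictive distribution are the same as the frequentist prediction intervals.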
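The closed form can be checked numerically. The sketch below evaluates the integral of p(y|μ)p(μ|x) with `integrate` and compares it with a normal density with mean x̄ and standard deviation σ√(1+1/n). The values of `sigma`, `n`, `xbar` and the grid of y values are illustrative assumptions, and the integration range is truncated to x̄ ± 8σ/√n, outside of which the integrand is negligible for the y values used here.

```r
# Sketch: numerical check of the posterior predictive density in Example 6.3.2.
# sigma, n, xbar, alpha and the grid of y values are illustrative assumptions.
sigma <- 1; n <- 20; xbar <- 10; alpha <- 0.05
post_pred <- function(y) {
  # integrand: p(y | mu) * p(mu | x), as a function of mu
  integrand <- function(mu) dnorm(y, mu, sigma) * dnorm(mu, xbar, sigma / sqrt(n))
  # truncated range: the posterior of mu is concentrated near xbar
  integrate(integrand, xbar - 8 * sigma / sqrt(n), xbar + 8 * sigma / sqrt(n),
            rel.tol = 1e-10)$value
}
y_grid <- seq(7, 13, by = 0.5)
numeric_density <- sapply(y_grid, post_pred)
closed_form <- dnorm(y_grid, xbar, sigma * sqrt(1 + 1/n))
max_abs_diff <- max(abs(numeric_density - closed_form))
print(max_abs_diff)

# equal-tail credible interval from the closed form vs. the frequentist interval
credible <- qnorm(c(alpha/2, 1 - alpha/2), xbar, sigma * sqrt(1 + 1/n))
frequentist <- xbar + c(-1, 1) * qnorm(1 - alpha/2) * sigma * sqrt(1 + 1/n)
print(round(rbind(credible, frequentist), 3))
```

The two densities should agree up to numerical integration error, and the two intervals should coincide.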