Conditional Expectation and Variance

Definition (1.9.1)

Say \(X|Y=y\) is a conditional r.v. with pdf \(f_{X|Y=y}\). Then the conditional expectation of \(X|Y=y\) is defined by

  • \(X\) discrete

\[E[X|Y=y]=\sum_x xf_{X|Y=y}(x|y)\]

  • \(X\) continuous

\[E[X|Y=y]= \int_{-\infty}^{\infty} xf_{X|Y=y}(x|y)dx\]

There is really nothing new here: a conditional expectation is simply the expectation of a conditional random variable.

Example (1.9.2)

Say \((X,Y)\) is a discrete random vector with joint pdf

        y=1   y=2
x=0     0.1   0.1
x=1     0.0   0.5
x=2     0.1   0.2

Let’s find \(E[X|Y=1]\). For that we need

\[f_{X|Y=1}(x|1)=\frac{f(x,1)}{f_Y(1)}\]

so we need the marginal pdf of \(Y\), which we find by summing over each column:

        y=1   y=2
x=0     0.1   0.1
x=1     0.0   0.5
x=2     0.1   0.2
P(Y=y)  0.2   0.8

Now \(f_{X|Y=1}(0|1) = \frac{f(0,1)}{f_Y(1)} = \frac{0.1}{0.2}=0.5\), and in the same way \(f_{X|Y=1}(1|1)=0\) and \(f_{X|Y=1}(2|1)=0.5\), so

x P(X=x|Y=1)
0 0.5
1 0.0
2 0.5

and so

\[E[X|Y=1] = 0\times 0.5+2 \times 0.5 =1\]
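
The same calculation can be sketched in R. This is only an illustration: the matrix `f` and the other variable names are made up here, with rows indexed by x = 0, 1, 2 and columns by y = 1, 2 as in the table above.

```r
# Joint pdf of (X, Y) from example 1.9.2: rows are x = 0, 1, 2; columns are y = 1, 2
f = matrix(c(0.1, 0.1,
             0.0, 0.5,
             0.1, 0.2), nrow = 3, byrow = TRUE,
           dimnames = list(x = 0:2, y = 1:2))
x = 0:2
fY = colSums(f)                      # marginal pdf of Y
fX_given_Y1 = f[, "1"] / fY["1"]     # conditional pdf of X given Y = 1
EX_given_Y1 = sum(x * fX_given_Y1)   # E[X|Y=1]
EX_given_Y1
```

Conditioning on \(Y=2\) instead only requires changing the column label from `"1"` to `"2"`.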

Example (1.9.3)

Say \((X,Y)\) is a discrete rv with joint pdf \(f(x,y)=(1-p)^2p^x\), \(x,y \in \{0,1,..\}\), \(y \le x\), and \(0<p<1\). Find \(E[Y|X=x]\).

First we need the conditional density \(f_{Y|X=x}(y|x)\), and for that we need the marginal density \(f_X(x)\):

\[f_X(x)=\sum_{y=0}^x (1-p)^2p^x=(x+1)(1-p)^2p^x\]

\[f_{Y|X=x}(y|x)= \frac{(1-p)^2p^x}{(x+1)(1-p)^2p^x} =\frac{1}{x+1}\] for \(y=0,..,x\), so we find \(Y|X=x\sim U\{0,..,x\}\)

\[E[Y|X=x]=\sum_{y=0}^x y\frac1{x+1}=\frac1{x+1}\frac{x(x+1)}{2}=\frac{x}2\]
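
Here is a quick numerical sanity check of this result, as a sketch. The values `p = 0.3` and `x = 4` are arbitrary choices made here for illustration; any \(0<p<1\) and any non-negative integer \(x\) should give the same agreement.

```r
p = 0.3                                  # arbitrary value in (0, 1)
x = 4                                    # condition on X = 4
y = 0:x                                  # possible values of Y given X = x
joint = rep((1 - p)^2 * p^x, length(y))  # f(x, y) for y = 0, ..., x
fX = sum(joint)                          # marginal f_X(x)
cond = joint / fX                        # f_{Y|X=x}(y|x), should be 1/(x+1) each
EY_given_x = sum(y * cond)               # E[Y|X=x]
c(EY_given_x, x / 2)
```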

Example (1.9.4)

Say \(X\) and \(Y\) have a joint density \(f(x,y)=8xy\), \(0 \le x<y \le 1\).

We previously found \(f_Y(y) = 4y^3\), \(0<y<1\), and \(f_{X|Y=y}(x|y) = 2x/y^2\), \(0 \le x \le y\). So

\[E[X|Y=y]= \int_{-\infty}^{\infty} x f_{X|Y=y}(x|y)dx = \int_0^y x\times2x/y^2dx=2x^3/(3y^2)|_0^y=2y/3\]

Throughout this calculation we treated y as a constant. Now, though, we can change our point of view and consider \(E[X|Y=y] = 2y/3\) as a function of y:

\[g(y)=E[X|Y=y]=2y/3\]

What are the values of \(y\)? They are the observations we might get from the rv \(Y\), so we can also write

\[g(Y)=E[X|Y=Y]=2Y/3\]

But \(Y\) is a rv, and therefore so is \(2Y/3\), so we see that we can define a rv

\[Z=g(Y)=E[X|Y]\]

Recall that the expression \(f_{X|Y}\) does not make sense. Now we see that on the other hand the expression \(E[X|Y]\) makes perfectly good sense!

Let’s continue this example and find the conditional variance of \(X|Y=y\):

\[ \begin{aligned} &var(X|Y=y) =E[X^2|Y=y]-(E[X|Y=y])^2 \\ &E[X^2|Y=y]= \int_0^y x^2\times2x/y^2dx=x^4/(2y^2)|_0^y=y^2/2 \\ &var(X|Y=y) =y^2/2-(2y/3)^2 = y^{2}/18\\ \end{aligned} \]

and again we can consider the conditional variance of X|Y:

\[var(X|Y)=Y^2/18\]
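
As a rough check, both conditional moments can be recomputed by numerical integration with R's `integrate`. This is a sketch: the value `y = 0.7` is an arbitrary choice made here, and `f_cond`, `m1` and `m2` are names introduced only for this illustration.

```r
y = 0.7                                                  # arbitrary value in (0, 1)
f_cond = function(x) 2 * x / y^2                         # f_{X|Y=y}(x|y) on 0 <= x <= y
m1 = integrate(function(x) x * f_cond(x), 0, y)$value    # E[X|Y=y]
m2 = integrate(function(x) x^2 * f_cond(x), 0, y)$value  # E[X^2|Y=y]
c(m1, 2 * y / 3)          # numerical vs. exact conditional mean
c(m2 - m1^2, y^2 / 18)    # numerical vs. exact conditional variance
```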

Example (1.9.5)

An urn contains 2 white and 3 black balls. We pick two balls from the urn. Let \(X\) denote the number of white balls chosen. An additional ball is drawn from the remaining three. Let \(Y\) equal 1 if this ball is white and 0 otherwise.

For example

\[f(0,0) = P(X=0,Y=0) = \frac35\times\frac24\times\frac13 = \frac1{10}\]

(choose black-black-black)

The complete pdf is given by:

        y=0    y=1
x=0     1/10   1/5
x=1     2/5    1/5
x=2     1/10   0

Now for the marginals we have, for example

\[f_X(0)=1/10+1/5=3/10\]

or in general:

x 0 1 2
P(X=x) 3/10 3/5 1/10

for \(Y\) we have

y 0 1
P(Y=y) 3/5 2/5

The conditional density of \(X|Y=0\) is

x 0 1 2
P(X=x|Y=0) 1/6 2/3 1/6

and so

\[E[X|Y=0] = 0\times 1/6+1\times 2/3+2\times 1/6 = 1\]

The conditional distribution of \(X|Y=1\) is

x 0 1 2
P(X=x|Y=1) 1/2 1/2 0

and so

\[E[X|Y=1] = 0\times 1/2+1\times1/2+2\times 0 = 1/2\]

Finally, the r.v. \(Z = E[X|Y]\) takes the value \(E[X|Y=y]\) with probability \(P(Y=y)\), so it has pdf

z 1 1/2
P(Z=z) 3/5 2/5

With this we can find

\[E[Z] = E[E[X|Y]] = 1\times 3/5+1/2\times2/5 = 4/5\]
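
One way to double-check the table and the final result is to enumerate all ordered draws of three balls. The following is a sketch under the assumption that all 60 ordered draws without replacement are equally likely; the object names (`balls`, `draws`, `joint` and so on) are introduced here only for illustration.

```r
balls = c("W", "W", "B", "B", "B")                      # 2 white, 3 black
draws = expand.grid(i = 1:5, j = 1:5, k = 1:5)          # ball picked 1st, 2nd, 3rd
draws = subset(draws, i != j & i != k & j != k)         # drawing without replacement
X = (balls[draws$i] == "W") + (balls[draws$j] == "W")   # white balls among the first two
Y = as.integer(balls[draws$k] == "W")                   # 1 if the third ball is white
joint = table(X, Y) / nrow(draws)                       # joint pdf of (X, Y)
EX_given_Y = tapply(X, Y, mean)                         # E[X|Y=0] and E[X|Y=1]
fY = colSums(joint)                                     # marginal pdf of Y
EZ = sum(EX_given_Y * fY)                               # E{E[X|Y]}
joint
EZ
```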

Theorem (1.9.6)

Say \(X\) and \(Y\) are random variables. Then

\[E[X] = E\{E[X|Y]\}\]

and

\[var(X) = E[var(X|Y)] + var(E[X|Y])\]

(There is a simple explanation for this seemingly complicated formula!)

proof (for continuous \(X\) and \(Y\))

\[ \begin{aligned} &E \left\{E[X|Y] \right\} = \\ & \int_{-\infty}^{\infty} E[X|Y=y] f_Y(y) dy = \\ & \int_{-\infty}^{\infty} \left(\int_{-\infty}^{\infty}xf_{X|Y=y}(x|y)dx \right) f_Y(y) dy = \\ & \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}x\frac{f(x,y)}{f_Y(y)} f_Y(y)dx dy = \\ & \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}xf(x,y)dx dy = \\ & \int_{-\infty}^{\infty} x\left(\int_{-\infty}^{\infty}f(x,y)dy\right)dx = \\ & \int_{-\infty}^{\infty} xf_X(x)dx=E[X] \end{aligned} \]

Also

\[ \begin{aligned} &var(E[X|Y]) =E \left\{E[X|Y]^2 \right\}- \left(E\left\{E[X|Y] \right\}\right)^2=\\ &E \left\{E[X|Y]^2 \right\}-E[X]^2 \\ &\\ &E\left[var(X|Y)\right] = E\left\{E[X^2|Y]-E[X|Y]^2\right\}=\\ &E\left\{E[X^2|Y]\right\}-E\left\{E[X|Y]^2\right\} = \\ &E[X^2]-E\left\{E[X|Y]^2\right\} \\ &\\ &E\left[var(X|Y)\right]+var(E[X|Y])=\\ &E[X^2]-E\left\{E[X|Y]^2\right\} + E \left\{E[X|Y]^2 \right\}-E[X]^2 = \\ &E[X^2]-E[X]^2=var(X) \end{aligned} \]

Example (1.9.7)

In example 1.9.5 above we found \(E\{E[X|Y]\} = 4/5\). Now, using the marginal of \(X\) directly,

\[E[X] = 0\times 3/10 + 1\times 3/5 + 2\times 1/10 = 4/5\]
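
The variance formula of theorem 1.9.6 can be checked on the urn example in the same spirit. This is a sketch: the joint pdf of example 1.9.5 is typed in by hand, and the names `cond`, `m`, `v`, `varX` and `decomp` are introduced here only for illustration.

```r
# Joint pdf from example 1.9.5: rows are x = 0, 1, 2; columns are y = 0, 1
f = matrix(c(1/10, 1/5,
             2/5,  1/5,
             1/10, 0), nrow = 3, byrow = TRUE)
x = 0:2
fX = rowSums(f)                        # marginal pdf of X
fY = colSums(f)                        # marginal pdf of Y
cond = sweep(f, 2, fY, "/")            # column j holds the pdf of X given Y = y_j
m = colSums(x * cond)                  # E[X|Y=y] for y = 0, 1
v = colSums(x^2 * cond) - m^2          # var(X|Y=y) for y = 0, 1
varX = sum(x^2 * fX) - sum(x * fX)^2   # var(X) from the marginal
decomp = sum(v * fY) + (sum(m^2 * fY) - sum(m * fY)^2)  # E[var(X|Y)] + var(E[X|Y])
c(varX, decomp)
```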

Example (1.9.8)

Let’s say we have a continuous bivariate random vector with the joint pdf \(f(x,y) = c(x+2y)\) if \(0<x<2, 0<y<1\), 0 otherwise. We want to verify that \(E\{E[Y|X]\} = E[Y]\).

First we find \(c\):

\[ \begin{aligned} & 1=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) dydx = \\ &\int_0^2 \int_0^1 c(x+2y) dy dx = \\ &c\int_0^2 \left( xy+y^{2}\right)\vert_0^1 dx = \\ &c\int_0^2 \left( x+1\right) dx = \\ &c\left( x^2/2+x\right)|_0^2 = \\ &c\left( 2+2 \right) =4c \end{aligned} \]

so \(c=1/4\). Next we need the marginals and the conditional density of \(Y|X=x\):

\[ \begin{aligned} &f_X(x) = \int_0^1 \frac14(x+2y) dy=\frac14(xy+y^2)|_0^1=\frac{x+1}4;0<x<2 \\ &f_Y(y) = \int_0^2 \frac14(x+2y) dx=\frac14(x^2/2+2yx)|_0^2=\frac12+y;0<y<1 \\ &f_{Y|X=x}(y|x) =\frac{f(x,y)}{f_X(x)}=\frac{(x+2y)/4}{(x+1)/4}=\frac{x+2y}{x+1};0<y<1 \\ &E[Y|X=x] = \int_0^1 y\frac{x+2y}{x+1}dy=\frac{1}{x+1}(xy^2/2+2y^3/3)|_0^1=\\ &\frac{x/2+2/3}{x+1}=\frac{3x+4}{6(x+1)};0<x<2 \end{aligned} \]

Finally we find \(E[Y]\) directly from the marginal of \(Y\), and then again via the conditional expectation:

\[ \begin{aligned} &E[Y] = \int_0^1 y(1/2+y)dy = (y^2/4+y^3/3)|_0^1 =\frac14+\frac13=\frac7{12}\\ &\\ &E\{E[Y|X]\} =\int_0^2 E[Y|X=x]f_{X}(x)dx= \\ &\int_0^2\frac{3x+4}{6(x+1)}\frac{x+1}4dx=\\ &\frac{1}{24}\int_0^2 (3x+4)dx=\frac{1}{24}(3x^2/2+4x)|_0^2=\frac{1}{24}(6+8)=\frac7{12} \end{aligned} \]
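
A numerical version of this check, as a sketch: it uses the normalized joint density \((x+2y)/4\) and the marginal \(f_X(x)=(x+1)/4\), and computes \(E[Y|X=x]\) by numerical integration instead of the closed form. The function names `f`, `fX` and `EY_given_x` are introduced here only for illustration.

```r
f = function(x, y) (x + 2 * y) / 4     # joint density on 0 < x < 2, 0 < y < 1
fX = function(x) (x + 1) / 4           # marginal density of X on (0, 2)
# E[Y|X=x]: integrate y * f(x, y) / f_X(x) over 0 < y < 1, one x at a time
EY_given_x = function(x)
  sapply(x, function(x0) integrate(function(y) y * f(x0, y) / fX(x0), 0, 1)$value)
# E{E[Y|X]}: integrate E[Y|X=x] * f_X(x) over 0 < x < 2
EEY = integrate(function(x) EY_given_x(x) * fX(x), 0, 2)$value
c(EEY, 7 / 12)
```

The `sapply` wrapper is there because `integrate` expects a function that accepts a vector of evaluation points.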

Example (1.9.9)

Say \(X\) has a density \(f_X(x) = (a+1)x^a\), \(0<x<1, a>1\) and \(Y|X=x \sim Exp(x)\). Find \(E[Y]\) and \(var(Y)\).

To find \(E[Y]\) we first need the density of \(Y\):

\[ \begin{aligned} &f(x,y) =f_X(x)f_{Y|X=x}(y|x)=(a+1)x^a\times xe^{-xy} \\ &f_Y(y) =\int_0^1 (a+1)x^{a+1}e^{-xy}dx \end{aligned} \]

and this integral has no simple closed form for general \(a\), so this approach won’t work.

But we know the mean and variance of an exponential r.v., so we can use the theorem instead:

\[ \begin{aligned} &E[Y|X=x] = \frac1{x}\\ &var(Y|X=x) = \frac1{x^2}\\ &\\ &E[Y] = E\{E[Y|X]\} = E\{\frac1{X}\} =\\ &\int_0^1\frac1{x}(a+1)x^a dx=\frac{a+1}{a}x^a|_0^1=\frac{a+1}{a}\\ &E\{\frac1{X^2}\}=\int_0^1\frac1{x^2}(a+1)x^a dx=\frac{a+1}{a-1}x^{a-1}|_0^1=\frac{a+1}{a-1} \end{aligned} \]

and so

\[ \begin{aligned} &var(Y) = E[var(Y|X)]+var(E[Y|X])=\\ &E[\frac1{X^2}] +var(\frac1{X}) = \\ &E[\frac1{X^2}] +E[\frac1{X^2}]-E[\frac1{X}]^2 = \\ &2\frac{a+1}{a-1}-(\frac{a+1}{a})^2 =\\ &\frac{a^3+a^2+a+1}{a^2(a-1)} \end{aligned} \]
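
These closed forms can be checked numerically for one value of \(a\). This is a sketch: `a = 3` is an arbitrary choice made here (any \(a>1\) should work), and `E1`, `E2`, `EY` and `varY` are names introduced only for this illustration.

```r
a = 3                                                          # arbitrary value with a > 1
E1 = integrate(function(x) (a + 1) * x^(a - 1), 0, 1)$value    # E[1/X]
E2 = integrate(function(x) (a + 1) * x^(a - 2), 0, 1)$value    # E[1/X^2]
EY = E1                                                        # E[Y] = E{E[Y|X]} = E[1/X]
varY = E2 + (E2 - E1^2)                                        # E[var(Y|X)] + var(E[Y|X])
c(EY, (a + 1) / a)
c(varY, (a^3 + a^2 + a + 1) / (a^2 * (a - 1)))
```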

Example (1.9.10)

Let \(X\sim U[0,1]\) and let \(Y\) be a random variable with \(P(Y=0|X=x)=x^2, P(Y=1|X=x)=2x(1-x)\) and \(P(Y=2|X=x)=(1-x)^2\). We want to find the mean and variance of \(Y\).

First we find

\[ \begin{aligned} &E[Y^k|X=x] =0^k\times x^2+1^k\times2x(1-x)+2^k\times(1-x)^2= \\ &2x(1-x)+2^k(1-x)^2 = \left[2^k+(2-2^k)x\right](1-x)\\ &E[Y|X=x] = \left[2+(2-2)x\right](1-x)=2(1-x)\\ &E[Y^2|X=x] = \left[4+(2-4)x\right](1-x)=2(2-x)(1-x)\\ &var(Y|X=x)=E[Y^2|X=x]-E[Y|X=x]^2=\\ &2(2-x)(1-x)-[2(1-x)]^2=\\ &\left[4-2x-4+4x\right](1-x)=2x(1-x) \end{aligned} \]

and so

\[ \begin{aligned} &E[Y] =E\{E[Y|X]\}=E\{2(1-X)\}=2(1-E[X])=1 \\ &var(Y) = E[var(Y|X)]+var(E[Y|X]) =\\ &E[2X(1-X)]+var(2(1-X)) = \\ &2E[X]-2E[X^2]+4var(1-X) = \\ &1-2(var(X)+E[X]^2)+\frac{4}{12} = \\ &1-2(\frac1{12}+(\frac12)^2)+\frac{1}{3} = \\ &1-\frac{1}{6}-\frac{1}2+\frac{1}{3} = \frac{2}{3} \end{aligned} \]

Let’s check:

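n=1e4
x=runif(n)   # X ~ U[0,1]
y=0*x
# for each x[i] draw Y from {0,1,2} with the conditional probabilities given X=x[i]
for(i in seq_along(x))
   y[i]=sample(0:2,size=1,replace = TRUE,prob = c(x[i]^2,2*x[i]*(1-x[i]),(1-x[i])^2))
c(mean(y), var(y))   # should be close to 1 and 2/3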
## [1] 0.9884000 0.6647319

Example (1.9.11)

Say \(X\) is a random variable with \(P(X=x)=\frac13;x=1,2,3\), and \(Y|X=x\sim U[0,x]\). In example 1.8.6a we found the covariance of \(X\) and \(Y\). Let’s do so again, now using the conditional expectation formula.

We already have \(E[X]=2\). Also note that \(E[Y|X=x]=x/2\). Now

\[ \begin{aligned} &E[Y] = E\{E[Y|X]\}\\ &E[Y|X=x] = \frac{x}2\\ &E[Y] = E\{\frac{X}2\} = E[X]/2=1\\ &E[XY] = E\{E[XY|X]\}=E\{XE[Y|X]\}=\\ &E\{X^2/2\}=\frac12 \sum_{x=1}^3 x^2/3=\frac16(1+4+9)=\frac{7}{3}\\ &cov(X,Y)=E[XY]-E[X]E[Y]=\frac{7}{3}-2\times 1=\frac{1}{3} \end{aligned} \]

and that is much simpler than the direct calculation!

Let’s find the correlation as well. Recall that \(var(Y|X=x)=x^2/12\), and so

\[ \begin{aligned} &var(X) =E[X^2]-E[X]^2=\frac{14}3-2^2=\frac{2}{3} \\ &\\ &var(Y) = E[var(Y|X)]+var(E[Y|X])\\ &\\ &var(E[Y|X]) = var(X/2)=var(X)/4=\frac{1}{6}\\ &\\ &E[var(Y|X)]=E[X^2/12]=E[X^2]/12=\frac{7}{18}\\ &\\ &var(Y) =\frac{7}{18}+\frac{1}{6}=\frac{5}{9} \end{aligned} \]

and so

\[ \begin{aligned} &cor(X,Y) =\frac{cov(X,Y)}{\sqrt{var(X)var(Y)}}= \\ &\frac{\frac13}{\sqrt{\frac23\frac59}}=\sqrt{3/10} \end{aligned} \]

R check:

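n=1e4
x=sample(1:3, size=n,replace=TRUE)   # X uniform on {1,2,3}
y=runif(n,0,x)                       # Y|X=x ~ U[0,x]; runif is vectorized over max
c(cor(x,y), sqrt(0.3))               # simulated vs. exact correlation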
## [1] 0.5448797 0.5477226