Expectation

Expectations of Random Variables

The expectation (or expected value) of a random variable g(X) is defined by

\[ \begin{aligned} &E[g(X)]=\sum_x g(x)f(x) \text{ if X discrete} \\ &E[g(X)]=\int_{- \infty}^\infty g(x)f(x) dx \text{ if X continuous} \\ \end{aligned} \]

We use the notation \(Eg(X)\).

Example

We roll a fair die until the first time we get a six. What is the expected number of rolls?

We saw that \(f(x) = \frac16 \left(\frac56\right)^{x-1}\) for \(x = 1, 2, 3, \dots\)

Here we just have g(x)=x, so

\[ EX=\sum_{i=1}^\infty g(x_i)f(x_i) = \sum_{i=1}^\infty i \frac16 (\frac56)^{i-1} \]

How do we compute this sum? Here is a “standard” trick: start from the geometric series \(\sum_{i=0}^\infty q^i = \frac1{1-q}\) for \(|q|<1\) and differentiate both sides with respect to q:

\[\sum_{i=1}^\infty i q^{i-1} = \frac1{(1-q)^2}\]

Applying this with q = 5/6 we find

\[EX=\frac16\frac1{(1-5/6)^2}=6\]
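As a quick sanity check (a simulation sketch, not part of the derivation): in R, rgeom counts the failures before the first success, so adding 1 gives the number of rolls.

B <- 1e5
rolls <- rgeom(B, 1/6) + 1   # number of rolls until the first six
mean(rolls)                  # should be close to 6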

Example

X is said to have a uniform [A,B] distribution if \(f(x)=\frac1{B-A}\) for \(A \le x \le B\), 0 otherwise.

Some special expectations are the mean of X, defined by \(\mu=EX\), and the variance, defined by \(\sigma^2=V(X)=E(X-\mu)^2\). Related to the variance is the standard deviation \(\sigma\), the square root of the variance.

Here are some formulas for expectations:

\[ \begin{aligned} &E[aX+b]=aE[X]+b \\ &E[X+Y]=E[X]+E[Y] \\ &V(aX+b)=a^2V(X) \\ &V(X)=E[X^2]-(EX)^2 \end{aligned} \]

The last one is a useful formula for finding the variance and/or the standard deviation.

Example

Find the mean and the standard deviation of a uniform [A,B] r.v.

\[\mu = EX = \int_A^B \frac{x}{B-A}dx = \frac{B^2-A^2}{2(B-A)} = \frac{A+B}2\]

\[E[X^2] = \int_A^B \frac{x^2}{B-A}dx = \frac{B^3-A^3}{3(B-A)} = \frac{A^2+AB+B^2}3\]

so

\[V(X) = E[X^2]-(EX)^2 = \frac{A^2+AB+B^2}3 - \frac{(A+B)^2}4 = \frac{(B-A)^2}{12}\]

and so \(\sigma=(B-A)/\sqrt{12}\)
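A quick numerical check (a simulation sketch; A = 2 and B = 5 are arbitrary choices):

x <- runif(1e5, 2, 5)
sd(x)   # should be close to (5-2)/sqrt(12) = 0.866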

Example

Find the mean and the standard deviation of an exponential r.v. with rate \(\lambda\). Integration by parts gives \(EX = 1/\lambda\) and \(E[X^2] = 2/\lambda^2\), so \(V(X) = 1/\lambda^2\) and \(\sigma = 1/\lambda\): for an exponential the mean and the standard deviation are equal.
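Again a simulation sketch to verify this (the rate \(\lambda = 2\) is an arbitrary choice):

x <- rexp(1e5, rate = 2)
mean(x)   # should be close to 1/2
sd(x)     # should also be close to 1/2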


One way to “link” probabilities and expectations is via the indicator function I_A defined as

\[ I_A(x)=\left\{ \begin{array}{cc} 1 & \text{if }x\in A \\ 0 & \text{if }x\notin A \end{array} \right. \]

because with this we have for a continuous r.v. X with density f:

\[EI_A(X)=\int_{-\infty}^\infty I_A(x)f(x)dx=\int_A f(x)dx=P(X \in A)\]
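This identity is what justifies estimating a probability as the average of an indicator in a simulation. A small sketch (the exponential with rate 1 and the set A = [1, 2] are arbitrary choices):

x <- rexp(1e5, rate = 1)
mean(x >= 1 & x <= 2)   # average of the indicator of A = [1,2]
pexp(2) - pexp(1)       # exact P(X in A) for comparison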

Expectations of Random Vectors

The definition of expectation easily generalizes to random vectors: for a function g of (X,Y) with joint density (pdf) f we have

\[ \begin{aligned} &E[g(X,Y)]=\sum_x \sum_y g(x,y)f(x,y) \text{ if (X,Y) discrete} \\ &E[g(X,Y)]=\int_{- \infty}^\infty \int_{- \infty}^\infty g(x,y)f(x,y)\, dx dy \text{ if (X,Y) continuous} \\ \end{aligned} \]

Example

Let (X,Y) be a discrete random vector with

\(f(x,y) = (1/2)^{x+y}, x \ge 1, y \ge 1\)

Find \(E[XY^2]\)
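One way to do this (a sketch): the joint pmf factors as \((1/2)^x(1/2)^y\), so the double sum splits into two single sums, which can be evaluated with the standard identities \(\sum_{k\ge1} kq^k=\frac{q}{(1-q)^2}\) and \(\sum_{k\ge1} k^2q^k=\frac{q(1+q)}{(1-q)^3}\):

\[ E[XY^2]=\sum_{x=1}^\infty\sum_{y=1}^\infty xy^2(1/2)^{x+y} = \left(\sum_{x=1}^\infty x(1/2)^x\right)\left(\sum_{y=1}^\infty y^2(1/2)^y\right) = 2\cdot6 = 12 \]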

Covariance and Correlation

The covariance of two r.v. X and Y is defined by \(cov(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]\)

The correlation of X and Y is defined by

\(cor(X,Y)=\frac{cov(X,Y)}{\sigma_X \sigma_Y}\)

Note cov(X,X) = V(X)

As with the variance we have a simpler formula for actual calculations:

\(cov(X,Y) = E(XY) - (EX)(EY)\)
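This follows directly by multiplying out the product inside the expectation:

\[ cov(X,Y) = E[XY - \mu_Y X - \mu_X Y + \mu_X\mu_Y] = E[XY] - \mu_Y EX - \mu_X EY + \mu_X\mu_Y = E(XY)-(EX)(EY) \]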

Example

Take the example of the sum and the absolute value of the difference of two rolls of a die, that is X = sum and Y = |difference|. What is the covariance of X and Y?

So we have

\(\mu_X = EX = 2*1/36 + 3*2/36 + ... + 12*1/36 = 7.0\)
\(\mu_Y = EY = 0*6/36 + 1*12/36 + ... + 5*2/36 = 70/36\)
\(EXY = 0*2*1/36 + 1*2*0/36 + 2*2*0/36 + ... + 5*12*0/36 = 490/36\)

and so

\(cov(X,Y) = EXY-EXEY = 490/36 - 7.0*70/36 = 0\)
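We can also verify this by enumerating all 36 equally likely outcomes in R (a small sketch, not part of the calculation above):

d <- expand.grid(die1 = 1:6, die2 = 1:6)   # all 36 outcomes
x <- d$die1 + d$die2                       # X = sum
y <- abs(d$die1 - d$die2)                  # Y = |difference|
mean(x*y) - mean(x)*mean(y)                # cov(X,Y), comes out 0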

Note that we previously saw that X and Y are not independent, so this is an example showing that a covariance of 0 does not imply independence! It does work the other way around, though:

Theorem: If X and Y are independent, then cov(X,Y) = 0 ( = cor(X,Y))

Proof (in the case of X and Y continuous):

\[ \begin{aligned} &EXY = \iint_{R^2} xyf(x,y) d(x,y) = \\ &\int_{-\infty}^\infty \int_{-\infty}^\infty xy f(x,y)dxdy = \\ &\int_{-\infty}^\infty \int_{-\infty}^\infty xy f_X(x)f_Y(y)dxdy = \\ &\int_{-\infty}^\infty yf_Y(y) \left(\int_{-\infty}^\infty x f_X(x)dx\right)dy = \\ & \left(\int_{-\infty}^\infty x f_X(x)dx \right)\left(\int_{-\infty}^\infty y f_Y(y)dy \right)= \\ &EXEY \end{aligned} \]

and so cov(X,Y) = EXY-EXEY = EXEY - EXEY = 0

Example

Consider again the example from before: we have continuous rv’s X and Y with joint density

\(f(x,y)=8xy, 0 \le x<y \le 1\)

Find the covariance and the correlation of X and Y.

We have seen before that \(f_Y(y)=4y^3, 0<y<1\), so

\(E[Y]=\int_{-\infty}^\infty yf_Y(y)dy = \int_0^1 y\cdot 4y^3dy = \frac45 y^5\big|_0^1 = 4/5\)

Now

\[E[X] = \int_0^1 \int_0^y x\cdot 8xy\, dx dy = \int_0^1 \frac{8y^4}3 dy = \frac8{15}\]

and

\[E[XY] = \int_0^1 \int_0^y xy\cdot 8xy\, dx dy = \int_0^1 \frac{8y^5}3 dy = \frac49\]

and so cov(X,Y) = 4/9 - 8/15·4/5 = 12/675 = 4/225

Also

\[E[X^2] = \int_0^1 \int_0^y x^2\cdot 8xy\, dx dy = \int_0^1 2y^5 dy = \frac13, \text{ so } V(X) = \frac13-\left(\frac8{15}\right)^2 = \frac{11}{225}\]

\[E[Y^2] = \int_0^1 y^2\cdot 4y^3 dy = \frac23, \text{ so } V(Y) = \frac23-\left(\frac45\right)^2 = \frac2{75} = \frac6{225}\]

and therefore

\[cor(X,Y) = \frac{cov(X,Y)}{\sigma_X\sigma_Y} = \frac{4/225}{\sqrt{11/225}\sqrt{6/225}} = \frac4{\sqrt{66}} \approx 0.49\]
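A simulation sketch to double-check these numbers; it uses the marginal \(f_Y(y)=4y^3\) from above and the conditional density \(f_{X|Y=y}(x|y)=2x/y^2\) (which reappears in the conditional expectation example below), so both variables can be generated by the inverse-cdf method:

B <- 1e5
y <- runif(B)^(1/4)     # cdf of Y is y^4, so the inverse cdf is u^(1/4)
x <- y*sqrt(runif(B))   # conditional cdf of X given Y=y is (x/y)^2, inverse cdf is y*sqrt(u)
cov(x, y)               # should be close to 4/225 = 0.018
cor(x, y)               # should be close to 4/sqrt(66) = 0.49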

We saw above that E(X+Y) = EX + EY. How about V(X+Y)?

\[V(X+Y) = V(X) + V(Y) + 2cov(X,Y)\]

and if \(X \perp Y\) we have V(X+Y) = VX + VY

Conditional Expectation and Variance

Say X|Y=y is a conditional r.v. with density (pdf) \(f_{X|Y=y}\). Then the conditional expectation of X|Y=y is defined by

\[ \begin{aligned} &E[g(X)|Y=y]=\sum_x g(x)f_{X|Y=y}(x|y) \text{ if X discrete} \\ &E[g(X)|Y=y]=\int_{- \infty}^\infty g(x)f_{X|Y=y}(x|y) dx \text{ if X continuous} \\ \end{aligned} \]

Let E[X|Y] denote the function of the random variable Y whose value at Y=y is given by E[X|Y=y]. Note then Z=E[X|Y] is itself a random variable.

Example

An urn contains 2 white and 3 black balls. We pick two balls from the urn. Let X denote the number of white balls chosen. An additional ball is drawn from the remaining three. Let Y equal 1 if the ball is white and 0 otherwise.

For example

\(f(0,0) = P(X=0,Y=0) = 3/5*2/4*1/3 = 1/10\).

The complete density is given by:

      x=0   x=1   x=2
y=0   0.1   0.4   0.1
y=1   0.2   0.2   0.0

The marginals are given by

x     P(X=x)
0     0.3
1     0.6
2     0.1

y     P(Y=y)
0     0.6
1     0.4

The conditional distribution of X|Y=0 is

x     P(X=x|Y=0)
0     1/6
1     2/3
2     1/6

and so \(E[X|Y=0] = 0*1/6+1*2/3+2*1/6 = 1.0\).

The conditional distribution of X|Y=1 is

x     P(X=x|Y=1)
0     1/2
1     1/2
2     0

and so \(E[X|Y=1] = 0*1/2+1*1/2+2*0 = 1/2\).

Finally the conditional r.v. Z = E[X|Y] has density

z     P(Z=z)
1     3/5
1/2   2/5

with this we can find \(E[Z] = E[E[X|Y]] = 1*3/5+1/2*2/5 = 4/5\).

How about using simulation to do these calculations? - program urn1

 urn1 <- function (n = 2, m = 3, draws = 2, B = 10000) {
    # urn with n white ("w") and m black ("b") balls
    u <- c(rep("w", n), rep("b", m))
    x <- rep(0, B)
    y <- x
    for (i in 1:B) {
        # draw 'draws' balls plus one additional ball, all without replacement
        z <- sample(u, draws + 1)
        # Y = 1 if the additional ball is white, 0 otherwise
        y[i] <- ifelse(z[draws + 1] == "w", 1, 0)
        # X = number of white balls among the first 'draws' balls
        for (j in 1:draws) 
          x[i] <- x[i] + ifelse(z[j] == "w", 1, 0)
    }
    print("Joint pdf:")
    print(round(table(y, x)/B, 3))
    print("pdf of X:")
    print(round(table(x)/B, 3))
    print("pdf of Y:")
    print(round(table(y)/B, 3))
    print("pdf of X|Y=0:")
    x0 <- table(x[y == 0])/length(y[y == 0])
    print(round(x0, 3))
    print("E[X|Y=0]:")
    # weight each observed value of X by its relative frequency
    print(sum(as.numeric(names(x0)) * x0))
    print("pdf of X|Y=1:")
    x1 <- table(x[y == 1])/length(y[y == 1])
    print(round(x1, 3))
    print("E[X|Y=1]:")
    print(sum(as.numeric(names(x1)) * x1))
 }
urn1()
## [1] "Joint pdf:"
##    x
## y       0     1     2
##   0 0.098 0.401 0.103
##   1 0.197 0.202 0.000
## [1] "pdf of X:"
## x
##     0     1     2 
## 0.294 0.603 0.103 
## [1] "pdf of Y:"
## y
##     0     1 
## 0.601 0.399 
## [1] "pdf of X|Y=0:"
## 
##     0     1     2 
## 0.163 0.666 0.171 
## [1] "E[X|Y=0]:"
## [1] 1.008
## [1] "pdf of X|Y=1:"
## 
##     0     1 
## 0.493 0.507 
## [1] "E[X|Y=1]:"
## [1] 0.5068

Example

Consider again the example from before: we have continuous rv’s X and Y with joint density \(f(x,y)=8xy, 0 \le x<y \le 1\). We have found \(f_Y(y) = 4y^3, 0<y<1\), and \(f_{X|Y=y}(x|y) = 2x/y^2, 0 \le x \le y\). So

\[E[X|Y=y] = \int_0^y x\cdot\frac{2x}{y^2}dx = \frac2{y^2}\cdot\frac{y^3}3 = \frac{2y}3\]

Throughout this calculation we treated y as a constant. Now, though, we can change our point of view and consider \(E[X|Y=y] = 2y/3\) as a function of y:

\(g(y)=E[X|Y=y]=2y/3\)

What are the values of y? Well, they are the observations we might get from the rv. Y, so we can also write

\(g(Y)=E[X|Y]=2Y/3\)

But Y is a rv, so 2Y/3 is a rv as well, and we see that we can define a rv Z=g(Y)=E[X|Y].

Recall that the expression \(f_{X|Y}\) does not make sense. Now we see that on the other hand the expression E[X|Y] makes perfectly good sense!

There is a very useful formula for the expectation of conditional r.v.s:

\[E[E[X|Y]] = E[X]\]

In the urn example above we found \(E[E[X|Y]] = E[Z] = 4/5\), and indeed, directly from the marginal of X, \(E[X] = 0*3/10 + 1*3/5 + 2*1/10 = 4/5\).

There is a simple explanation for this seemingly complicated formula!

Here is a corresponding formula for the variance:

\[V(X) = E[V(X|Y)] + V[E(X|Y)]\]
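As a check (a worked verification using the urn example above, not a proof): there \(E[X|Y=0]=1\) and \(E[X|Y=1]=1/2\), and from the conditional distributions \(E[X^2|Y=0]=4/3\) and \(E[X^2|Y=1]=1/2\), so

\[ \begin{aligned} &V(X|Y=0) = \tfrac43 - 1^2 = \tfrac13, \quad V(X|Y=1) = \tfrac12 - \left(\tfrac12\right)^2 = \tfrac14 \\ &E[V(X|Y)] = \tfrac35\cdot\tfrac13 + \tfrac25\cdot\tfrac14 = \tfrac3{10}, \quad V(E[X|Y]) = E[Z^2]-(EZ)^2 = \tfrac7{10} - \tfrac{16}{25} = \tfrac3{50} \\ &E[V(X|Y)] + V(E[X|Y]) = \tfrac3{10} + \tfrac3{50} = \tfrac9{25} = 1 - \tfrac{16}{25} = E[X^2]-(EX)^2 = V(X) \end{aligned} \]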

Example

Let’s say we have a continuous bivariate random vector with the joint pdf \(f(x,y) = c(x+2y)\) if \(0<x<2\) and \(0<y<1\), 0 otherwise.

Find c:

Find the marginal distribution of X

Find the marginal distribution of Y

Find the conditional pdf of Y|X=x

Note: this is a proper pdf for any fixed value of x

Find E[Y|X=x]

Let Z=E[Y|X]. Find E[Z]
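Once the pieces above are worked out, the last part can be double-checked by simulation. Here is a sketch using rejection sampling: on the rectangle \(x+2y \le 4\), so accepting a uniform point with probability \((x+2y)/4\) yields the correct joint density (the constant c cancels). By the formula \(E[E[Y|X]]=E[Y]\), the value of E[Z] should then agree with E[Y] computed from the marginal of Y.

B <- 1e5
x <- runif(B, 0, 2)
y <- runif(B, 0, 1)
keep <- runif(B) < (x + 2*y)/4   # rejection step: accept with prob proportional to f(x,y)
mean(y[keep])                    # estimates E[Y] = E[Z]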