The expectation (or expected value) of a function g(X) of a random variable X is defined by
\[ \begin{aligned} &E[g(X)]=\sum_x g(x)f(x) \text{ if X is discrete} \\ &E[g(X)]=\int_{- \infty}^\infty g(x)f(x) dx \text{ if X is continuous} \\ \end{aligned} \]
We also use the shorter notation Eg(X).
We roll a fair die until the first time we get a six. What is the expected number of rolls?
We saw that \(f(x) = \frac16 \left(\frac56\right)^{x-1}\) if \(x = 1, 2, ...\)
Here we just have g(x)=x, so
\[ EX=\sum_{i=1}^\infty g(x_i)f(x_i) = \sum_{i=1}^\infty i \frac16 (\frac56)^{i-1} \]
How do we compute this sum? Here is a “standard” trick: for \(|t|<1\) we have
\[\sum_{i=1}^\infty i t^{i-1} = \frac{d}{dt} \sum_{i=1}^\infty t^i = \frac{d}{dt} \frac{t}{1-t} = \frac1{(1-t)^2}\]
and so we find, setting \(t=5/6\),
\[EX=\frac16\frac1{(1-5/6)^2}=6\]
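We can verify this with a quick simulation, a sketch using R's rgeom (which counts the failures before the first success, so we add 1 to get the number of rolls):

mean(rgeom(1e5, 1/6) + 1)  # should be close to 6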
X is said to have a uniform [A,B] distribution if \(f(x)=\frac1{B-A}\) for \(A \le x \le B\), 0 otherwise.
Some special expectations are the mean of X, defined by \(\mu=EX\), and the variance, defined by \(\sigma^2=V(X)=E(X-\mu)^2\). Related to the variance is the standard deviation \(\sigma\), the square root of the variance.
Here are some formulas for expectations:
\[ \begin{aligned} &E[aX+b] = aE[X]+b \\ &E[X+Y] = E[X]+E[Y] \\ &V(aX+b) = a^2V(X) \\ &V(X) = E[X^2]-(E[X])^2 \end{aligned} \]
The last one is a useful formula for finding the variance and/or the standard deviation.
Find the mean and the standard deviation of a uniform [A,B] r.v.
\[\mu = \int_A^B x \frac1{B-A} dx = \frac{B^2-A^2}{2(B-A)} = \frac{A+B}2\]
\[E[X^2] = \int_A^B x^2 \frac1{B-A} dx = \frac{B^3-A^3}{3(B-A)} = \frac{A^2+AB+B^2}3\]
\[\sigma^2 = E[X^2]-\mu^2 = \frac{A^2+AB+B^2}3 - \frac{(A+B)^2}4 = \frac{(B-A)^2}{12}\]
and so \(\sigma=(B-A)/\sqrt{12}\)
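As a quick sanity check, here is a simulation sketch with the arbitrary choices A=2 and B=5:

x <- runif(1e5, min = 2, max = 5)
mean(x)  # should be close to (2+5)/2 = 3.5
sd(x)    # should be close to (5-2)/sqrt(12) ≈ 0.87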
Find the mean and the standard deviation of an exponential r.v. with rate \(\lambda\). Integration by parts gives \(\mu=1/\lambda\) and \(E[X^2]=2/\lambda^2\), so \(\sigma^2=1/\lambda^2\) and \(\sigma=1/\lambda\).
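Again a quick simulation sketch, with the arbitrary choice \(\lambda=2\):

x <- rexp(1e5, rate = 2)
mean(x)  # should be close to 1/2
sd(x)    # should also be close to 1/2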
One way to “link” probabilities and expectations is via the indicator function I_A defined as
\[ I_A(x)=\left\{ \begin{array}{cc} 1 & \text{if }x\in A \\ 0 & \text{if }x\notin A \end{array} \right. \]
because with this we have for a continuous r.v. X with density f:
\[EI_A(X)=\int_{-\infty}^\infty I_A(x)f(x)dx=\int_A f(x)dx=P(X \in A)\]
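Here is a small illustration of this link, a sketch using a standard normal X and A=[0,1]:

x <- rnorm(1e5)
mean(x >= 0 & x <= 1)  # E[I_A(X)] estimated by simulation
pnorm(1) - pnorm(0)    # P(0 <= X <= 1) ≈ 0.341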
The definition of expectation easily generalizes to random vectors: for a discrete random vector (X,Y) with joint pdf f we have \(E[g(X,Y)]=\sum_{x,y} g(x,y)f(x,y)\), and correspondingly a double integral in the continuous case.
Let (X,Y) be a discrete random vector with
\(f(x,y) = (1/2)^{x+y}, x \ge 1, y \ge 1\)
Find \(E[XY^2]\):
\[E[XY^2]=\sum_{x=1}^\infty \sum_{y=1}^\infty xy^2 \left(\frac12\right)^{x+y} = \left(\sum_{x=1}^\infty x \left(\frac12\right)^x\right) \left(\sum_{y=1}^\infty y^2 \left(\frac12\right)^y\right) = 2 \cdot 6 = 12\]
using the same differentiation trick as above, which gives \(\sum x t^x = \frac{t}{(1-t)^2}\) and \(\sum y^2 t^y = \frac{t(1+t)}{(1-t)^3}\).
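We can check this numerically by truncating the double sum, a sketch (the terms beyond 100 are negligible):

x <- 1:100
y <- 1:100
sum(outer(x, y, function(x, y) x * y^2 * (1/2)^(x + y)))  # close to 12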
The covariance of two r.v.'s X and Y is defined by \(cov(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]\)
The correlation of X and Y is defined by
\(cor(X,Y)=\frac{cov(X,Y)}{\sigma_X \sigma_Y}\)
Note cov(X,X) = V(X)
As with the variance we have a simpler formula for actual calculations:
\(cov(X,Y) = E(XY) - (EX)(EY)\)
Take again the example of two rolls of a fair die, with X the sum and Y the absolute value of the difference. What is the covariance of X and Y?
So we have
\(\mu_X = EX = 2*1/36 + 3*2/36 + ... + 12*1/36 = 7.0\)
\(\mu_Y = EY = 0*6/36 + 1*10/36 + ... + 5*2/36 = 70/36\)
\(E[XY] = \sum_{x,y} xy f(x,y) = 2*0*1/36 + 3*1*2/36 + ... + 12*0*1/36 = 490/36\)
and so
\(cov(X,Y) = EXY-EXEY = 490/36 - 7.0*70/36 = 0\)
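A simulation sketch of this calculation:

B <- 1e5
d1 <- sample(1:6, B, replace = TRUE)
d2 <- sample(1:6, B, replace = TRUE)
x <- d1 + d2       # sum of the two rolls
y <- abs(d1 - d2)  # absolute value of the difference
cov(x, y)          # should be close to 0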
Note that we previously saw that X and Y are not independent, so here we have an example showing that a covariance of 0 does not imply independence! It does work the other way around, though:
Theorem: If X and Y are independent, then cov(X,Y) = 0 ( = cor(X,Y))
Proof (in the case of X and Y continuous):
\[ \begin{aligned} &EXY = \iint_{R^2} xyf(x,y) d(x,y) = \\ &\int_{-\infty}^\infty \int_{-\infty}^\infty xy f(x,y)dxdy = \\ &\int_{-\infty}^\infty \int_{-\infty}^\infty xy f_X(x)f_Y(y)dxdy = \\ &\int_{-\infty}^\infty yf_Y(y) \left(\int_{-\infty}^\infty x f_X(x)dx\right)dy = \\ & \left(\int_{-\infty}^\infty x f_X(x)dx \right)\left(\int_{-\infty}^\infty y f_Y(y)dy \right)= \\ &EXEY \end{aligned} \]
and so cov(X,Y) = EXY-EXEY = EXEY - EXEY = 0
Consider again the example from before: we have continuous rv’s X and Y with joint density
\(f(x,y)=8xy, 0 \le x<y \le 1\)
Find the covariance and the correlation of X and Y.
We have seen before that \(f_Y(y)=4y^3, 0<y<1\), so
\(E[Y]=\int_{-\infty}^\infty yf_Y(y)dy = \int_0^1 y \cdot 4y^3 dy = \frac45 y^5 \Big|_0^1 = 4/5\)
Now
\(E[X]=\int_0^1 \int_0^y x \cdot 8xy \, dx dy = \int_0^1 \frac83 y^4 dy = \frac8{15}\)
and
\(E[XY]=\int_0^1 \int_0^y xy \cdot 8xy \, dx dy = \int_0^1 \frac83 y^5 dy = \frac49\)
and so cov(X,Y) = 4/9 - 8/15·4/5 = 12/675 = 4/225
Also \(E[X^2]=\int_0^1 \int_0^y x^2 \cdot 8xy \, dx dy = \frac13\) and \(E[Y^2]=\int_0^1 y^2 \cdot 4y^3 dy = \frac23\), so \(V(X)=\frac13-\left(\frac8{15}\right)^2=\frac{11}{225}\) and \(V(Y)=\frac23-\left(\frac45\right)^2=\frac2{75}\). Therefore
\(cor(X,Y)=\frac{4/225}{\sqrt{11/225 \cdot 2/75}}=\frac4{\sqrt{66}} \approx 0.49\)
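We can check these numbers by simulation. A sketch using the inverse-cdf method: here \(F_Y(y)=y^4\) and \(F_{X|Y=y}(x)=(x/y)^2\), so \(Y=U_1^{1/4}\) and \(X=Y\sqrt{U_2}\) for independent uniforms \(U_1, U_2\):

B <- 1e5
y <- runif(B)^(1/4)      # Y has cdf y^4
x <- y * sqrt(runif(B))  # X|Y=y has cdf (x/y)^2
cov(x, y)  # should be close to 4/225 ≈ 0.018
cor(x, y)  # should be close to 4/sqrt(66) ≈ 0.49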
We saw above that E(X+Y) = EX + EY. How about V(X+Y)? In general we have
\(V(X+Y) = V(X) + V(Y) + 2cov(X,Y)\)
and if \(X \perp Y\) we have V(X+Y) = VX + VY
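A sketch of a check for the independent case, with (say) X standard normal and Y uniform [0,1]:

x <- rnorm(1e5)
y <- runif(1e5)
var(x + y)       # should be close to 1 + 1/12 ≈ 1.083
var(x) + var(y)  # should be about the same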
Say X|Y=y is a conditional r.v. with density (pdf) \(f_{X|Y=y}\). Then the conditional expectation of X|Y=y is defined by
\[ \begin{aligned} &E[g(X)|Y=y]=\sum_x g(x)f_{X|Y=y}(x|y) \text{ if X discrete} \\ &E[g(X)|Y=y]=\int_{- \infty}^\infty g(x)f_{X|Y=y}(x|y) dx \text{ if X continuous} \\ \end{aligned} \]
Let E[X|Y] denote the function of the random variable Y whose value at Y=y is given by E[X|Y=y]. Note then Z=E[X|Y] is itself a random variable.
An urn contains 2 white and 3 black balls. We pick two balls from the urn. Let X denote the number of white balls chosen. An additional ball is drawn from the remaining three. Let Y equal 1 if the ball is white and 0 otherwise.
For example
\(f(0,0) = P(X=0,Y=0) = 3/5*2/4*1/3 = 1/10\).
The complete density is given by:
 | x=0 | x=1 | x=2 |
---|---|---|---|
y=0 | 0.1 | 0.4 | 0.1 |
y=1 | 0.2 | 0.2 | 0.0 |
The marginal distribution of X is

x | P(X=x) |
---|---|
0 | 0.3 |
1 | 0.6 |
2 | 0.1 |

and the marginal distribution of Y is

y | P(Y=y) |
---|---|
0 | 0.6 |
1 | 0.4 |
The conditional distribution of X|Y=0 is
x | P(X=x|Y=0) |
---|---|
0 | 1/6 |
1 | 2/3 |
2 | 1/6 |
and so \(E[X|Y=0] = 0*1/6+1*2/3+2*1/6 = 1.0\).
The conditional distribution of X|Y=1 is
x | P(X=x|Y=1) |
---|---|
0 | 1/2 |
1 | 1/2 |
2 | 0 |
and so \(E[X|Y=1] = 0*1/2+1*1/2+2*0 = 1/2\).
Finally, the r.v. Z = E[X|Y] has density

z | P(Z=z) |
---|---|
1 | 3/5 |
1/2 | 2/5 |

With this we can find \(E[Z] = E[E[X|Y]] = 1*3/5+1/2*2/5 = 4/5\).
How about using simulation to do these calculations? Here is the program urn1:
urn1 <- function (n = 2, m = 3, draws = 2, B = 10000) {
  # n white and m black balls; pick 'draws' balls, then one more; B runs
  u <- c(rep("w", n), rep("b", m))
  x <- rep(0, B)
  y <- x
  for (i in 1:B) {
    z <- sample(u, draws + 1)  # draw draws+1 balls without replacement
    y[i] <- ifelse(z[draws + 1] == "w", 1, 0)  # Y: is the extra ball white?
    for (j in 1:draws)
      x[i] <- x[i] + ifelse(z[j] == "w", 1, 0)  # X: whites among the first draws
  }
print("Joint pdf:")
print(round(table(y, x)/B, 3))
print("pdf of X:")
print(round(table(x)/B, 3))
print("pdf of Y:")
print(round(table(y)/B, 3))
print("pdf of X|Y=0:")
x0 <- table(x[y == 0])/length(y[y == 0])
print(round(x0, 3))
print("E[X|Y=0]:")
print(sum(c(0:draws) * x0))
print("pdf of X|Y=1:")
x1 <- table(x[y == 1])/length(y[y == 1])
print(round(x1, 3))
print("E[X|Y=1]:")
print(sum(c(0:1) * x1))
}
urn1()
## [1] "Joint pdf:"
## x
## y 0 1 2
## 0 0.098 0.401 0.103
## 1 0.197 0.202 0.000
## [1] "pdf of X:"
## x
## 0 1 2
## 0.294 0.603 0.103
## [1] "pdf of Y:"
## y
## 0 1
## 0.601 0.399
## [1] "pdf of X|Y=0:"
##
## 0 1 2
## 0.163 0.666 0.171
## [1] "E[X|Y=0]:"
## [1] 1.008
## [1] "pdf of X|Y=1:"
##
## 0 1
## 0.493 0.507
## [1] "E[X|Y=1]:"
## [1] 0.5068
Consider again the example from before: we have continuous rv’s X and Y with joint density \(f(x,y)=8xy, 0 \le x<y \le 1\). We have found \(f_Y(y) = 4y^3, 0<y<1\), and \(f_{X|Y=y}(x|y) = 2x/y^2, 0 \le x \le y\). So
\(E[X|Y=y] = \int_0^y x \frac{2x}{y^2} dx = \frac2{y^2} \cdot \frac{y^3}3 = \frac{2y}3\)
Throughout this calculation we treated y as a constant. Now, though, we can change our point of view and consider \(E[X|Y=y] = 2y/3\) as a function of y:
\(g(y)=E[X|Y=y]=2y/3\)
What are the values of y? Well, they are the observations we might get from the r.v. Y, so we can also write
\(g(Y)=E[X|Y]=2Y/3\)
But Y is a random variable, and then so is 2Y/3, and we see that we can define a r.v. Z=g(Y)=E[X|Y].
Recall that the expression \(f_{X|Y}\) does not make sense. Now we see that on the other hand the expression E[X|Y] makes perfectly good sense!
There is a very useful formula for the expectation of conditional r.v.s:
\[E[E[X|Y]] = E[X]\]
In the urn example above we found E[Z] = E[E[X|Y]] = 4/5, and indeed
\(E[X] = 0*3/10 + 1*3/5 + 2*1/10 = 4/5\).
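We can also check the formula by simulation in the continuous example above, where E[X|Y] = 2Y/3 and E[X] = 8/15 (a sketch, sampling as before):

B <- 1e5
y <- runif(B)^(1/4)      # Y has cdf y^4
x <- y * sqrt(runif(B))  # X|Y=y has cdf (x/y)^2
mean(2 * y / 3)  # E[E[X|Y]], close to 8/15 ≈ 0.533
mean(x)          # E[X], also close to 8/15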
There is a simple explanation for this seemingly complicated formula: E[X|Y=y] is the average of X over the cases where Y=y, and averaging these conditional means according to the distribution of Y recovers the overall mean of X.
Here is a corresponding formula for the variance:
\[V(X) = E[V(X|Y)] + V[E(X|Y)]\]
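In the 8xy example a short computation gives \(E[X^2|Y=y]=y^2/2\), so \(V(X|Y=y) = y^2/2 - (2y/3)^2 = y^2/18\), and we can check the formula by simulation (a sketch):

B <- 1e5
y <- runif(B)^(1/4)
x <- y * sqrt(runif(B))
var(x)                           # V(X), close to 11/225 ≈ 0.049
mean(y^2 / 18) + var(2 * y / 3)  # E[V(X|Y)] + V[E(X|Y)], about the same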
Let’s say we have a continuous bivariate random vector with the joint pdf \(f(x,y) = c(x+2y)\) if \(0<x<2\) and \(0<y<1\), 0 otherwise.
Find c:
\(1 = \int_0^2 \int_0^1 c(x+2y) \, dy dx = c \int_0^2 (x+1) dx = 4c\), so \(c=1/4\).
Find the marginal distribution of X:
\(f_X(x) = \int_0^1 \frac14 (x+2y) dy = \frac{x+1}4, 0<x<2\)
Find the marginal distribution of Y:
\(f_Y(y) = \int_0^2 \frac14 (x+2y) dx = \frac{1+2y}2, 0<y<1\)
Find the conditional pdf of Y|X=x:
\(f_{Y|X=x}(y|x) = \frac{f(x,y)}{f_X(x)} = \frac{x+2y}{x+1}, 0<y<1\)
Note: this is a proper pdf for any fixed value of x, since \(\int_0^1 \frac{x+2y}{x+1} dy = \frac{x+1}{x+1} = 1\).
Find E[Y|X=x]:
\(E[Y|X=x] = \int_0^1 y \frac{x+2y}{x+1} dy = \frac{x/2+2/3}{x+1} = \frac{3x+4}{6(x+1)}\)
Let Z=E[Y|X]. Find E[Z]:
\(E[Z] = E[E[Y|X]] = E[Y] = \int_0^1 y \frac{1+2y}2 dy = \frac12 \left(\frac12+\frac23\right) = \frac7{12}\)
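Finally, a numeric sanity check for c and E[Z], a sketch using nested calls to R's integrate:

f <- function(x, y) (x + 2 * y) / 4
# integrate g(x,y) over 0<x<2, 0<y<1
int2 <- function(g)
  integrate(function(x) sapply(x, function(xx)
    integrate(function(y) g(xx, y), 0, 1)$value), 0, 2)$value
int2(f)                           # total mass: should be 1, so c = 1/4
int2(function(x, y) y * f(x, y))  # E[Y] = E[Z]: should be 7/12 ≈ 0.583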