
Solution to Homework 2

Problem 1

Prove the theorem from class: Let \(X \sim \chi^2(n)\) and \(Y \sim \chi^2(m)\) be independent, and let \(Z=(X/n)/(Y/m)\). Then \(Z \sim F(n,m)\).

\[ \begin{aligned} &F_Z(z) =P(Z<z) =P((X/n)/(Y/m)<z) = \\ &P(X<\frac{nz}{m}Y) = \\ &\int_{-\infty}^\infty P(X<\frac{nz}{m}Y\vert Y=y)f_Y(y)dy = \\ &\int_{-\infty}^\infty P(X<\frac{nz}{m}y)f_Y(y)dy = \\ &\int_{-\infty}^\infty F_X(\frac{nz}{m}y)f_Y(y)dy \end{aligned} \] so

\[ \begin{aligned} &f_Z(z) =\frac{d}{dz}F_Z(z)= \\ &\frac{d}{dz} \int_{-\infty}^\infty F_X(\frac{nz}{m}y)f_Y(y)dy = \\ &\int_{-\infty}^\infty \frac{d}{dz} F_X(\frac{nz}{m}y)f_Y(y)dy = \\ &\int_{-\infty}^\infty f_X(\frac{n}{m}yz)\frac{n}{m}yf_Y(y)dy = \\ &\int_{0}^\infty \frac1{\Gamma(n/2)2^{n/2}}(\frac{n}{m}yz)^{n/2-1}e^{-(n/m)yz/2}\frac{n}{m}y\frac1{\Gamma(m/2)2^{m/2}}y^{m/2-1}e^{-y/2}dy=\\ &\frac1{\Gamma(n/2)2^{n/2}}(\frac{n}{m}z)^{n/2-1}\frac{n}{m}\frac1{\Gamma(m/2)2^{m/2}}\int_{0}^\infty y^{(n+m)/2-1}e^{-y/[2/(1+nz/m)]}dy=\\ &\frac{\Gamma((n+m)/2)}{\Gamma(n/2)\Gamma(m/2)}z^{n/2-1}(\frac{n}{m})^{n/2}\left(\frac1{1+nz/m}\right)^{(n+m)/2}\cdot\\ &\int_{0}^\infty \frac1{\Gamma((n+m)/2)\left(\frac2{1+nz/m}\right)^{(n+m)/2}}y^{(n+m)/2-1}e^{-y/[2/(1+nz/m)]}dy=\\ &\frac{\Gamma((n+m)/2)}{\Gamma(n/2)\Gamma(m/2)}z^{n/2-1}(\frac{n}{m})^{n/2}\left(\frac1{1+nz/m}\right)^{(n+m)/2} \end{aligned} \]

for \(z>0\). The range of integration becomes \((0,\infty)\) once the densities are written out, because \(f_Y(y)=0\) for \(y\le 0\). The last equality holds because the integrand is the density of a Gamma random variable with shape \((n+m)/2\) and scale \(2/(1+nz/m)\), and so integrates to 1. The final expression is the density of an \(F(n,m)\) distribution.
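As a quick numerical sanity check of the result, the sketch below codes the final expression as a function and compares it with R's built-in `df`. The helper name `f_derived` and the values n = 5, m = 8 are illustrative assumptions, not part of the homework.

```r
# Density of Z = (X/n)/(Y/m) as derived above (hypothetical helper name)
f_derived <- function(z, n, m) {
  gamma((n + m)/2) / (gamma(n/2) * gamma(m/2)) *
    z^(n/2 - 1) * (n/m)^(n/2) * (1/(1 + n*z/m))^((n + m)/2)
}

n <- 5; m <- 8                 # arbitrary degrees of freedom for the check
z <- seq(0.1, 5, by = 0.1)

# Largest absolute difference to R's F density; should be at rounding level
max_diff <- max(abs(f_derived(z, n, m) - df(z, n, m)))
print(max_diff)
```

Agreement on a grid is not a proof, but it would catch a misplaced exponent or constant in the derivation.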

Problem 2

Say \(X_1,...,X_n\) are independent and each has a geometric distribution with success probability p. Let \(M=\max\{X_1,...,X_n\}\). Find the density of M.

\[ \begin{aligned} &P(X_i=k) =pq^{k-1},\quad k=1,2,..,\quad q=1-p \\ &F_M(m) = P(M\le m) =\\ &P(\max\{X_1,..,X_n\}\le m)= \\ &P(X_1\le m,..,X_n\le m) = \\ &P(X_1\le m)\cdots P(X_n\le m) = \\ &P(X_1\le m)^n =\\ &\left(\sum_{k=1}^m pq^{k -1} \right)^n = \\ &\left(p\sum_{k=0}^{m-1} q^k \right)^n = \\ &\left(p\frac{1-q^m}{1-q} \right)^n = \\ &(1-q^m)^n \\ &f_M(1) = F_M(1)=(1-q)^n = p^n\\ &f_M(m) = F_M(m)-F_M(m-1) = \\ &(1-q^m)^n-(1-q^{m-1})^n,\quad m=2,3,.. \end{aligned} \]
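The formula can be checked numerically. The sketch below evaluates the pmf of M for the illustrative values p = 0.3 and n = 4 (both are assumptions chosen only for this check) and compares the resulting cdf with one built from R's `pgeom`. R parameterizes the geometric distribution by the number of failures before the first success, so \(P(X_i\le m)\) corresponds to `pgeom(m - 1, p)`.

```r
p <- 0.3; q <- 1 - p; n <- 4   # illustrative values, not from the problem
m <- 1:200                     # long enough that the remaining tail is negligible

# pmf of the maximum: (1 - q^m)^n - (1 - q^(m-1))^n
f_M <- (1 - q^m)^n - (1 - q^(m - 1))^n

# cdf of M from R's geometric cdf (shifted by 1: R counts failures, we count trials)
F_M_R <- pgeom(m - 1, p)^n

total    <- sum(f_M)                        # should be very close to 1
max_diff <- max(abs(cumsum(f_M) - F_M_R))   # should be at rounding level
print(c(total, max_diff))
```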

Problem 3

Show that if \(X\sim t(n)\), then \(X^2\sim F(1, n)\).

First recall that

\[f_X(x\vert n) = \frac{\Gamma((n+1)/2)}{\Gamma(n/2)}\frac1{\sqrt{\pi n}}\frac1{(1+x^2/n)^{(n+1)/2}}\]

and if \(Y\sim F(1,n)\), then

\[ \begin{aligned} &f_Y(x) =\frac{\Gamma((1+n)/2)}{\Gamma(1/2)\Gamma(n/2)}(\frac{1}{n})^{1/2}\frac{x^{1/2-1}}{(1+x/n)^{(1+n)/2}}=\\ &\frac{\Gamma((1+n)/2)}{\Gamma(n/2)}\frac{1}{\sqrt{\pi nx}}\frac{1}{(1+x/n)^{(1+n)/2}} \end{aligned} \]

because \(\Gamma(1/2)=\sqrt{\pi}\). Now

\[ \begin{aligned} &F_{X^2}(x) =P(X^2<x) =P(-\sqrt x<X<\sqrt x)=\\ &P(X<\sqrt x)-P(X<-\sqrt x) =F_{X}(\sqrt x)-F_{X}(-\sqrt x)\\ &\text{ }\\ &f_{X^2}(x) = \frac{d}{dx} F_{X^2}(x)=\\ &\frac{d}{dx}\left[ F_{X}(\sqrt x)-F_{X}(-\sqrt x) \right]= \\ &f_{X}(\sqrt x)\frac1{2\sqrt x}-f_{X}(-\sqrt x)\left(-\frac1{2\sqrt x}\right) =\\ &f_{X}(\sqrt x)\frac1{\sqrt x} =\\ &\frac{\Gamma((n+1)/2)}{\Gamma(n/2)}\frac1{\sqrt{\pi n}}\frac1{(1+(\sqrt x)^2/n)^{(n+1)/2}}\frac1{\sqrt x} = \\ &\frac{\Gamma((n+1)/2)}{\Gamma(n/2)}\frac1{\sqrt{\pi nx}}\frac{1}{(1+ x/n)^{(n+1)/2}} \end{aligned} \]

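The two derivative terms combine into \(f_{X}(\sqrt x)/\sqrt x\) because the density of a t distribution is symmetric about 0, so \(f_X(-\sqrt x)=f_X(\sqrt x)\). The result is exactly the \(F(1,n)\) density \(f_Y(x)\) given above, so \(X^2\sim F(1,n)\).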

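Here is another, shorter proof. Let \(Z\sim N(0,1)\) and \(T\sim \chi^2(n)\) be independent, so that \(X=Z/\sqrt{T/n}\sim t(n)\) by the definition of the t distribution. Because \(Z^2\sim\chi^2(1)\) is independent of \(T\), the theorem of Problem 1 gives \(Y=(Z^2/1)/(T/n)\sim F(1,n)\), and therefore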

\[ \begin{aligned} &F_{X^2}(x) =P(X^2<x) = P\left(\left[\frac{Z}{\sqrt{T/n}}\right]^2<x\right) = \\ &P\left(\frac{Z^2}{T/n}<x\right) = P\left(\frac{Z^2/1}{T/n}<x\right) =\\ &P(Y<x)=F_Y(x) \end{aligned} \]
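The identity from the first proof, \(f_{X^2}(x)=f_X(\sqrt x)/\sqrt x\), can also be checked numerically against R's F density. This is only a sketch: the value n = 7 and the grid of x values are arbitrary choices made for the check.

```r
n <- 7                         # arbitrary degrees of freedom for the check
x <- seq(0.05, 10, by = 0.05)

# Density of X^2 from the first proof: f_X(sqrt(x)) / sqrt(x)
f_sq <- dt(sqrt(x), df = n) / sqrt(x)

# Largest absolute difference to the F(1, n) density
max_diff <- max(abs(f_sq - df(x, df1 = 1, df2 = n)))
print(max_diff)
```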

Problem 4

Below is the data from a Binomial distribution Bin(10, p), already in the form of a table:

x	Freq
0	11
1	38
2	60
3	47
4	26
5	10
6	6
7	2

Draw the log-likelihood curve as a function of p.

Let \(x_i\), \(i=1,..,n\), be the original observations and let \(y_k=\sum_{i=1}^n I_{k}(x_i)\) be the number of times k was observed (so \(y_8=y_9=y_{10}=0\) for this data). Then

\[ \begin{aligned} &L(p) =\prod_{i=1}^n f(x_i|p) = \prod_{i=1}^n {10\choose{x_i}} p^{x_i}(1-p)^{10-x_i} =\\ &\prod_{k=0}^{10} \left[{10\choose{k}}p^{k}(1-p)^{10-k}\right]^{y_k} = \\ &\prod_{k=0}^{10} {10\choose{k}}^{y_k} p^{k{y_k}}(1-p)^{(10-k){y_k}}\\ &l(p) =\log L(p) = \\ &\sum_{k=0}^{10} y_k \log {10\choose{k}} +(\sum_{k=0}^{10} ky_k)\log p +(\sum_{k=0}^{10} (10-k)y_k)\log (1-p) \end{aligned} \]

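# Observed values k and their frequencies y (8, 9 and 10 were never observed)
k <- 0:7
y <- c(11, 38, 60, 47, 26, 10,  6,  2)
n <- sum(y)  # number of observations
# curve() evaluates the log-likelihood on a grid of values x, which plays the role of p;
# the line width is set with lwd (size is not a graphical parameter of curve)
curve(sum(y*log(choose(10, k))) +sum(k*y)*log(x)+sum((10-k)*y)*log(1-x), 0.1, 0.4,
      ylab="log-likelihood", xlab="p", lwd=2, col="blue")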
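The problem only asks for the curve, but it can be useful to see where the curve peaks. Setting the derivative of \(l(p)\) to zero gives \(\hat p=\sum_k ky_k/(10\sum_k y_k)\), which for this table is 503/2000 = 0.2515. The sketch below compares that value with a numerical maximizer; the function name `loglik` and the use of `optimize` over the plotted range are choices made for illustration.

```r
k <- 0:7
y <- c(11, 38, 60, 47, 26, 10, 6, 2)

# Log-likelihood as a function of p (hypothetical helper name)
loglik <- function(p) {
  sum(y*log(choose(10, k))) + sum(k*y)*log(p) + sum((10 - k)*y)*log(1 - p)
}

# Closed-form maximizer: total successes divided by total trials
p_hat <- sum(k*y) / (10*sum(y))

# Numerical maximizer over the plotted range, for comparison
p_num <- optimize(loglik, interval = c(0.1, 0.4), maximum = TRUE)$maximum
print(c(p_hat, p_num))
```

Adding `abline(v = p_hat, lty = 2)` after the `curve` call would mark this value on the plot.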