Say we have a random vector \(\pmb{X}= (X_1,..,X_p)'\). Let \(\sigma_{ij}=cov(X_i, X_j)\), then
\[ \pmb{\Sigma} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & ... & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & ... & \sigma_{2p} \\ \vdots & \vdots & &\vdots\\ \sigma_{p1} & \sigma_{p2} & ... & \sigma_{pp} \\ \end{pmatrix} \]
is called the variance-covariance matrix, or simply the covariance matrix. By definition the covariance matrix is symmetric. If the \(X_i\)'s are linearly independent in the sense that no nontrivial linear combination of them is almost surely constant, that is, there do not exist a vector \(\pmb{a}\ne\pmb{0}\) and a constant \(c\) such that \(P(\pmb{a}'\pmb{X}=c)=1\), then \(\pmb{\Sigma}\) is positive definite; otherwise it is positive semi-definite.
Note that
\[var(X_i)=\sigma_{ii}=:\sigma_i^2\]
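As a quick numerical illustration, here is a minimal numpy sketch that estimates a covariance matrix from simulated data; the mean vector and covariance matrix used to generate the data are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# simulate n observations of a p = 3 dimensional random vector X
n = 10_000
true_cov = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
X = rng.multivariate_normal(mean=[0, 1, 2], cov=true_cov, size=n)

# sample covariance matrix; rowvar=False because each row is one observation
Sigma_hat = np.cov(X, rowvar=False)
print(np.round(Sigma_hat, 2))               # close to true_cov
print(np.allclose(Sigma_hat, Sigma_hat.T))  # symmetric by construction
```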
Let \(\pmb{Z}\) be a matrix of random variables, that is
\[ \pmb{Z} = \begin{pmatrix} Z_{11} & Z_{12} & ... & Z_{1p} \\ Z_{21} & Z_{22} & ... & Z_{2p} \\ \vdots & \vdots & &\vdots\\ Z_{p1} & Z_{p2} & ... & Z_{pp} \\ \end{pmatrix} \]
then its mean is defined by
\[ E[\pmb{Z}] = \begin{pmatrix} E[Z_{11}] & E[Z_{12}] & ... & E[Z_{1p}] \\ E[Z_{21}] & E[Z_{22}] & ... & E[Z_{2p}] \\ \vdots & \vdots & &\vdots\\ E[Z_{p1}] & E[Z_{p2}] & ... & E[Z_{pp}] \\ \end{pmatrix} \]
Let \(\pmb{X}= \begin{pmatrix} X_1 & ... & X_p \end{pmatrix}'\) and let \(\pmb{\mu}= \begin{pmatrix} E[X_1] & ... & E[X_p] \end{pmatrix}'\), then
\[\pmb{\Sigma} = E[ (\pmb{X}-\pmb{\mu})(\pmb{X}-\pmb{\mu})'] = E[\pmb{X}\pmb{X}']-\pmb{\mu}\pmb{\mu}'\]
proof
\[ \begin{aligned} &\pmb{\Sigma} = E\left[ (\pmb{X}-\pmb{\mu})(\pmb{X}-\pmb{\mu})'\right] = \\ &E\left[ \pmb{XX'}-\pmb{\mu X'} - \pmb{X\mu'}+\pmb{\mu\mu}'\right] = \\ &E[\pmb{XX'}]-E[\pmb{\mu X'}] - E[\pmb{X\mu'}]+\pmb{\mu\mu}' = \\ &E[\pmb{XX'}]-\pmb{\mu}E[\pmb{X'}] - E[\pmb{X}]\pmb{\mu}'+\pmb{\mu\mu}' = \\ &E[\pmb{XX'}]-\pmb{\mu}\pmb{\mu}' -\pmb{\mu}\pmb{\mu}'+\pmb{\mu\mu}' = \\ &E[\pmb{XX'}]-\pmb{\mu}\pmb{\mu}' \end{aligned} \]
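The identity is easy to check numerically; here is a minimal sketch that uses sample moments in place of the expectations (the mean vector and covariance matrix are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 1.0], [1.0, 2.0]])
X = rng.multivariate_normal(mu, Sigma, size=100_000)

n = len(X)
mu_hat = X.mean(axis=0)    # sample estimate of mu
Exx_hat = X.T @ X / n      # sample estimate of E[XX']

# the identity: Sigma = E[XX'] - mu mu'
print(np.round(Exx_hat - np.outer(mu_hat, mu_hat), 2))  # ~ [[2, 1], [1, 2]]
```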
\(\vert \pmb{\Sigma} \vert\) is called the generalized variance.
\((\pmb{X}-\pmb{\mu})'\pmb{\Sigma}^{-1}(\pmb{X}-\pmb{\mu})\) is called the standardized distance or Mahalanobis distance.
Note that if \(\pmb{Z}=\pmb{\Sigma}^{-1/2}(\pmb{X}-\pmb{\mu})\), then \(E[Z_i]=0\) and \(cov(Z_i,Z_j)=\delta_{ij}\), where \(\delta_{ij}\) is the Kronecker delta, and the Mahalanobis distance is \(\pmb{Z}'\pmb{Z}\).
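A sketch of this standardization, assuming \(\pmb{\Sigma}^{-1/2}\) is taken to be the symmetric inverse square root computed from the eigendecomposition (the numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 1.0], [1.0, 2.0]])
X = rng.multivariate_normal(mu, Sigma, size=100_000)

# symmetric inverse square root of Sigma from its eigendecomposition
vals, vecs = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T

Z = (X - mu) @ Sigma_inv_sqrt                # rows are Sigma^{-1/2}(x - mu)
print(np.round(Z.mean(axis=0), 2))           # ~ (0, 0)
print(np.round(np.cov(Z, rowvar=False), 2))  # ~ identity matrix

# the Mahalanobis distance of each observation is then Z'Z
d2 = np.sum(Z ** 2, axis=1)
```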
The correlation matrix is defined by
\[ \pmb{P}_\rho = \begin{pmatrix} 1 & \rho_{12} & ... & \rho_{1p} \\ \rho_{21} & 1 & ... & \rho_{2p} \\ \vdots & \vdots & &\vdots\\ \rho_{p1} & \rho_{p2} & ... & 1 \\ \end{pmatrix} \]
where \(\rho_{ij}=\sigma_{ij}/\sqrt{\sigma_{ii}\sigma_{jj}}=\sigma_{ij}/(\sigma_i\sigma_j)\).
If we define
\[\pmb{D}_\sigma = [diag(\pmb{\Sigma})]^{1/2}=diag(\sigma_1,..,\sigma_p)\]
then
\[\pmb{\Sigma} = \pmb{D}_\sigma\pmb{P}_\rho\pmb{D}_\sigma\]
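A short numpy check of this decomposition; the covariance matrix here is a hypothetical example:

```python
import numpy as np

Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 9.0, 3.0],
                  [0.0, 3.0, 4.0]])

sd = np.sqrt(np.diag(Sigma))                   # (sigma_1, ..., sigma_p)
D = np.diag(sd)                                # D_sigma
P = np.diag(1 / sd) @ Sigma @ np.diag(1 / sd)  # correlation matrix P_rho

print(np.round(P, 3))
print(np.allclose(D @ P @ D, Sigma))           # Sigma = D_sigma P_rho D_sigma
```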
Suppose a random vector \(\pmb{V}\) is partitioned as follows:
\[ \pmb{V} = \begin{pmatrix} \pmb{X}\\ \pmb{Y} \end{pmatrix}= \begin{pmatrix} X_1 \\ \vdots \\ X_p \\ Y_1 \\ \vdots \\ Y_q \end{pmatrix} \]
then
\[\pmb{\mu} = E[\pmb{V}] = \begin{pmatrix}\pmb{\mu}_x\\\pmb{\mu}_y\end{pmatrix}\]
and
\[\pmb{\Sigma} = cov[\pmb{V}] = \begin{pmatrix}\pmb{\Sigma}_{xx}&\pmb{\Sigma}_{xy}\\\pmb{\Sigma}_{yx}&\pmb{\Sigma}_{yy}\end{pmatrix}\]
where
\[\pmb{\Sigma}_{yx}=\pmb{\Sigma}_{xy}'\]
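In code the blocks are just submatrices; a sketch, assuming \(p=2\) and \(q=1\) with a hypothetical covariance matrix:

```python
import numpy as np

# covariance matrix of V = (X_1, X_2, Y_1)', so p = 2 and q = 1
Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 9.0, 3.0],
                  [0.0, 3.0, 4.0]])
p = 2

Sigma_xx = Sigma[:p, :p]
Sigma_xy = Sigma[:p, p:]
Sigma_yx = Sigma[p:, :p]
Sigma_yy = Sigma[p:, p:]

print(np.allclose(Sigma_yx, Sigma_xy.T))  # Sigma_yx = Sigma_xy'
```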
Let \(\pmb{X}=\begin{pmatrix} X_1 & ... & X_p \end{pmatrix}'\) be a random vector and \(\pmb{a}=\begin{pmatrix} a_1 & ... & a_p \end{pmatrix}'\) a vector of scalars. Let \(Z=\pmb{a}'\pmb{X}\). Then
\(\mu_z=E[Z]=E[\pmb{a}'\pmb{X}]=\pmb{a}'\pmb{\mu}\)
proof follows from the linearity of expectations.
Suppose \(\pmb{Y}\) is a random vector, \(\pmb{X}\) a random matrix, \(\pmb{a}\) and \(\pmb{b}\) vectors of constants and \(\pmb{A}\) and \(\pmb{B}\) matrices of constants, with dimensions such that the products below are defined. Then
\[E[\pmb{a}'\pmb{Y}]=\pmb{a}'E[\pmb{Y}]\]
\[E[\pmb{AY}]=\pmb{A}E[\pmb{Y}]\]
\[E[\pmb{a}'\pmb{X}\pmb{b}]=\pmb{a}'E[\pmb{X}]\pmb{b}\]
\[E[\pmb{AXB}]=\pmb{A}E[\pmb{X}]\pmb{B}\]
proof follows from linearity of expectations
Let \(\pmb{Y}\) be a random vector with covariance matrix \(\pmb{\Sigma}\) and \(\pmb{a}\) a vector of constants, then
\[var(\pmb{a}'\pmb{Y})=\pmb{a}'\pmb{\Sigma}\pmb{a}\]
proof
\[ \begin{aligned} &var(\pmb{a}'\pmb{Y}) = \\ &E\left[ (\pmb{a}'\pmb{Y}-\pmb{a}'\pmb{\mu})^2 \right] = \\ &E\left[ (\pmb{a}'(\pmb{Y}-\pmb{\mu}))^2 \right] = \\ &E\left[ \pmb{a}'(\pmb{Y}-\pmb{\mu})\pmb{a}'(\pmb{Y}-\pmb{\mu}) \right] = \\ &E\left[ \pmb{a}'(\pmb{Y}-\pmb{\mu})(\pmb{Y}-\pmb{\mu})'\pmb{a} \right] = \\ &\pmb{a}'\pmb{\Sigma}\pmb{a} \end{aligned} \]
where the next-to-last equality uses that \(\pmb{a}'(\pmb{Y}-\pmb{\mu})\) is a scalar and therefore equal to its transpose \((\pmb{Y}-\pmb{\mu})'\pmb{a}\).
For example, say
\[ \pmb{\Sigma} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \] and \(\pmb{a} = \begin{pmatrix} 1 \\ 1\end{pmatrix}\), then
\[ \begin{aligned} &\pmb{a'\Sigma a} = \begin{pmatrix} 1 & 1\end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1\end{pmatrix}=\\ & \begin{pmatrix} 1 & 1\end{pmatrix} \begin{pmatrix} 3 \\3 \end{pmatrix} = 6 \\ \end{aligned} \] but also
\[ \begin{aligned} &\pmb{a'Y} = \begin{pmatrix} 1 & 1\end{pmatrix} \begin{pmatrix} Y_1\\Y_2\end{pmatrix} = Y_1+Y_2\\ &var(Y_1+Y_2) =var(Y_1)+var(Y_2)+2cov(Y_1,Y_2)=2+2+2\times 1=6 \\ \end{aligned} \]
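The same check in numpy, both via the quadratic form and by simulation; the zero mean vector below is an arbitrary choice, since the variance does not depend on it:

```python
import numpy as np

rng = np.random.default_rng(3)
Sigma = np.array([[2.0, 1.0], [1.0, 2.0]])
a = np.array([1.0, 1.0])

print(a @ Sigma @ a)  # the quadratic form a' Sigma a = 6

# simulation check: a'Y = Y_1 + Y_2, whose variance should be ~ 6
Y = rng.multivariate_normal([0.0, 0.0], Sigma, size=200_000)
print(np.round(np.var(Y @ a), 1))
```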
More generally, for vectors of constants \(\pmb{a}\) and \(\pmb{b}\),
\[cov(\pmb{a}'\pmb{Y}, \pmb{b}'\pmb{Y})=\pmb{a}'\pmb{\Sigma}\pmb{b}\]
proof omitted
Let \(\pmb{z=Ay}\) and \(\pmb{w=By}\), where \(\pmb{A}\) is a \(k\times p\) matrix of constants, \(\pmb{B}\) is an \(m\times p\) matrix of constants and \(\pmb{y}\) is a \(p\times 1\) random vector with covariance matrix \(\pmb{\Sigma}\). Then
\[cov(\pmb{Ay})=\pmb{A\Sigma A'}\]
\[cov(\pmb{z, w})=\pmb{A\Sigma B'}\] proof follows from the corollary above
For example, say
\[ \pmb{\Sigma} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \pmb{A} = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}, \pmb{B} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \] then
\[ \begin{aligned} &\pmb{A\Sigma B'} = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}=\\ &\begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1 & 3 \end{pmatrix}= \begin{pmatrix} -1 & 3 \\ -1 & 6 \end{pmatrix} \end{aligned} \] but also
\[ \begin{aligned} &\pmb{AY} = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} Y_1\\Y_2\end{pmatrix} = \begin{pmatrix} Y_2\\Y_1+2Y_2\end{pmatrix} \\ &\pmb{BY} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} Y_1\\Y_2\end{pmatrix} = \begin{pmatrix} Y_1-Y_2\\-Y_1+2Y_2\end{pmatrix} \\ \end{aligned} \] and so
\[ \begin{aligned} &cov(Y_2,Y_1-Y_2) =cov(Y_1,Y_2)-var(Y_2)=1-2=-1 \\ &cov(Y_2,-Y_1+2Y_2) =-cov(Y_1,Y_2)+2var(Y_2)=-1+2\times 2=3 \\ &cov(Y_1+2Y_2,Y_1-Y_2) =\\ &var(Y_1)+2cov(Y_1,Y_2)-cov(Y_1,Y_2)-2var(Y_2)=2+2-1-2\times 2=-1 \\ &cov(Y_1+2Y_2,-Y_1+2Y_2) = \\ &-var(Y_1)+2cov(Y_1,Y_2)-2cov(Y_1,Y_2)+4var(Y_2)=-2+2-2+8=6 \\ \end{aligned} \]
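And the same computation in numpy, using the matrices from the example:

```python
import numpy as np

Sigma = np.array([[2, 1], [1, 2]])
A = np.array([[0, 1], [1, 2]])
B = np.array([[1, -1], [-1, 2]])

print(A @ Sigma @ B.T)  # [[-1  3] [-1  6]], matching the hand computation
print(A @ Sigma @ A.T)  # cov(Ay) = A Sigma A'
```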