Continuous-time Markov Chains

Definition (5.4.1)

Say \(\{X(t), t \ge 0\}\) is a continuous-time stochastic process taking values on the nonnegative integers. Then \(X(t),t\ge 0\) is a Markov chain if for all \(s,t \ge 0\), and nonnegative integers \(i,j, x(u), 0 \le u<s\) we have

\[P\left(X(t+s)=j|X(s)=i, X(u)=x(u);0\le u<s\right)=P\left(X(t+s)=j|X(s)=i\right)\]

The main result for such chains is

Theorem (5.4.2)

Let \(\{X(t),t \ge 0\}\) be a continuous-time Markov chain with \(X(0)=i\). Let \(T_i\) be the time the process stays in state i. Then \(T_i\) has an exponential distribution.

proof

\[ \begin{aligned} &P \left(T_i>s+t|T_i>s,X(0)=i \right) = \\ &P \left(X(u)=i,0<u<s+t|X(u)=i,0<u<s,X(0)=i \right) = \\ &P \left(X(u)=i,0<u<t|X(0)=i \right) = \\ &P(T_i>t) \end{aligned} \]

But this means that \(T_i\) is memoryless! Of course \(T_i\) is also non-negative and continuous, and therefore by theorem (2.2.3) \(T_i\) has to have an exponential distribution.


With this we have the following characterization of a continuous-time Markov chain:

  1. the amount of time spent in state i is an exponential distribution with mean \(v_i\).

  2. when the process leaves state i it next enters state j with some probability, say \(P_{ij}\).

So a continuous-time Markov chain is a process that moves from state to state in accordance with a discrete-space Markov chain, but also spends an exponentially distributed amount of time in each state.

Notice that this means in a continuous-time Markov chain we always have \(P_{kk}=0\), so the diagonal of the transition matrix is 0.


Let’s consider a finite-statespace continuous-time Markov chain, that is \(X(t)\in \{0,..,N\}\). Let

\[P_{ij} (t) = P(X(t)=j|X(0)=i)\]

then the the Markov property asserts that \(\{X(t),t \ge 0\}\) satisfies

  1. \(p_{ij}(t)\ge 0\)
  2. \(\sum_{j=0}^{N}p_{ij}(t)=1\)
  3. \(p_{ik}(s+t)=\sum_{j=0}^{N}p_{ij}(s)p_{jk}(t)\)
  4. \(\lim_{h\downarrow 0} p_{ij}(h) = I(i=j)\)

where c) follows from the Chapman-Kolmogorov equations.

d is not strictly a consequence of the Markov property but is usually a sensible additional condition.

Let P(t)=(pij) denote the matrix of transition probabilities at time t, so P is a matrix whose entries are functions of t.

Now c can be written as

\[P(s+t)=P(s)P(t)\text{ for all } t,s \ge 0\]

and d as

\[\lim_{h\rightarrow 0} P(h)=I\]

this implies that \(P(t)\) is (right)-continuous at time 0, meaning each entry is continuous at \(t=0\). Now

\[ \begin{aligned} &\lim_{h\downarrow 0} P(t+h) = \\ &\lim_{h\downarrow 0} P(t)P(h) = \\ &P(t)\lim_{h\downarrow 0} P(h) = P(t)\\ \end{aligned} \] let \(0<h<t\), then

\[P(t) =P(t-h+h)=P(t-h)P(h)\] by d if \(h\) is small \(P(h)\approx I\), so \(P(h)^{-1}\) exists and \(P(h)^{-1}\approx I\), therefore \(P(t-h)=P(t)P(h)^{-1}\) and

\[\lim_{h\downarrow 0} P(t-h)=P(t)\]

Therefore \(P(t)\) is continuous for all \(t \ge 0\). Actually, we have even more: let’s define

\[ \begin{aligned} &q_i := \lim_{h\downarrow 0}\frac{1-P_{ii}(h)}{h}<\infty\\ &q_{ij} := \lim_{h\downarrow 0}\frac{P_{ij}(h)}{h}<\infty \end{aligned} \] now \(\sum_{j=0}^N P_{ij}(h)=1\) and so \(\sum_{j=0,j\ne i}^N P_{ij}(h)=1-P_{ii}(h)\) and therefore

\[q_i=\sum_{j=0,j\ne i}^N q_{ij}\] which shows that \(P(t)\) is even differentiable.

The rates \(q_i\) and \(q_{ij}\) give us a second way to describe a Markov chain, called the infinitesimal description:

\[ \begin{aligned} &P(X(t+h)=j|X(t)=i) = q_{ij}h+o(h)\\ &P(X(t+h)=i|X(t)=i) = 1-q_{i}h+o(h) \end{aligned} \]

Define

\[ \pmb{A} = \begin{pmatrix} -q_0 & q_{01} & ... & q_{0N}\\ q_{10} & -q_1 & ... & q_{1N}\\ \vdots & \vdots & ... &\vdots\\ q_{N0} & ... & ...& -q_N \end{pmatrix} \] then

\[ \begin{aligned} &A=\lim_{h\downarrow 0} \frac{P(h)-I}{h} = P'(0)\\ &\frac{P(t+h)-P(t)}{h} = \frac{P(t)P(h)-P(t)}{h}=P(t)\frac{P(h)-I}{h}\\ &P'(t) = \lim_{h\downarrow 0} \frac{P(t+h)-P(t)}{h} = AP(t)\\ \end{aligned} \] This is differential equation whose solution is the exponential function, and so

Theorem (5.4.3)

\[P(t)=\exp\{At\}=I+\sum_{n=1}^\infty\frac{A^nt^n}{n!}\]

proof calculation above

Example (5.4.4)

Two-state Chain

say \(\{X(t), t \ge 0\}\) is a Markov chain with \(X(t) \in \{0,1\}\) and \[ \pmb{A} = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix} \] We need \(A^n\), so

\[ \begin{aligned} &\begin{vmatrix} -a-\lambda & a \\ b & -b-\lambda \end{vmatrix} = \\ &(a+\lambda)(b+\lambda)-ab = \\ &a\lambda+b\lambda+\lambda^2=\\ &(a+b+\lambda)\lambda = 0\\ \end{aligned} \] so the eigenvalues are \(\lambda_1=0\) and \(\lambda_2=a+b\). But with an eigenvalue \(\lambda_1=0\) \(A\) is not diagonalizable. Howeve note that

\[ \pmb{A}^2 = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix} \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}=\\ \begin{pmatrix} a^+ab & -a^2-ab \\ -ab-b^2 & b^2+ab \end{pmatrix}=\\ -(a+b) \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}=-(a+b)A \] and so \(A^{n}=(-1)^{n-1}(a+b)^{n-1}A\) and so

\[ \begin{aligned} &P(t)=I+\sum_{n=1}^\infty\frac{A^nt^n}{n!} = \\ &I+\sum_{n=1}^\infty\frac{(-1)^{n-1}(a+b)^{n-1}At^n}{n!} = \\ &I-\frac1{a+b}\sum_{n=1}^\infty\frac{(-1)^{n}(a+b)^{n}t^n}{n!}A = \\ &I-\frac1{a+b}\left[\sum_{n=0}^\infty\frac{(-1)^{n}(a+b)^{n}t^n}{n!}-1\right]A = \\ &I-\frac1{a+b}\left[\exp\{-(a+b)t\}-1\right]A = \\ &I+\frac1{a+b}A-A\frac{e^{-(a+b)t}}{a+b} \end{aligned} \]

It is easy to find the stationary distribution of a continuous-time discrete-space Markov chain in terms of the infinitesimal matrix. If all states communicate, that is if \(P_{ij}(t)>0\) for all \(i,j\) and some \(t>0\), then

\[\lim_{t\rightarrow \infty} P_{ij} (t) = \pi_j >0\]

exists, and

\[\lim_{t\rightarrow \infty} P'(t) = \lim_{t\rightarrow \infty} AP(t)=A\pi\]

but

\[\lim_{t\rightarrow \infty} P'(t) =0\] otherwise the chain would never leave \(i\), and so we have \(\pi A=0\) or

\[\pi_j q_j = \sum_{i \ne j} \pi_i q_{ij}\]

\(j=0,..,N\).

Example (5.4.5)

Redundancy

A company has a computer for its website. If the computer is down they can’t sell anything, so they have a backup, which takes over if the first computer is down. The operating computer fails after an exponentially distributed time (with rate \(\mu\)). Repair times are also exponentially distributed (with rate \(\lambda\)). Let’s assume that \(\mu\) is fixed but we have a choice of \(\lambda\) (by hiring more technicians). We want to make sure that in the long run at most 1% of the time both computers are down. How should we choose \(\lambda\)?

Let \(X(t)\) be the number of computers in operating condition at time t, so \(X(t) \in\{ 0, 1, 2\}\). Then \(X(t)\) is a Markov chain with infinitesimal matrix

\[ A= \begin{bmatrix} -\lambda & \lambda & 0 \\ \mu & -(\lambda+\mu) & \lambda \\ 0 & \mu & -\mu \\ \end{bmatrix} \]

What is the average “total downtime”, that is the time when neither computer is working? The system of equations for the stationary distribution is

\[ \begin{aligned} &\lambda\pi_0=\mu\pi_1 \\ &(\lambda+\mu)\pi_1 =\mu\pi_1+\lambda\pi_2 \\ &\mu \pi_2 =\lambda \pi_1 \\ &\pi_0+\pi_1+\pi_2=1\\ &\\ &\pi_1=\frac{\lambda}{\mu}\pi_0\text{ ; }\pi_2=\frac{\lambda}{\mu}\pi_1=(\frac{\lambda}{\mu})^2\pi_0\\ &1=\pi_0+\frac{\lambda}{\mu}\pi_0+(\frac{\lambda}{\mu})^2\pi_0= \left(1+\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2 \right)\pi_0\\ &\pi_i=\frac{(\frac{\lambda}{\mu})^i}{1+\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2}\\ &\lim_{t\rightarrow\infty}P(X(t)>0)=1-\pi_0=\frac{\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2}{1+\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2} \end{aligned} \] and we see that the probability only depends on the ratio \(\lambda/\mu\). Setting \(x=\lambda/\mu\), we find

\[ \begin{aligned} &p = \frac{x+x^2}{1+x+x^2}\\ &(1+x+x^2)p = x+x^2\\ &(1-p)x^2+(1-p)x-p = 0\\ &x_{1,2}= \left(1-p\pm\sqrt{(1-p)^2-4(1-p)(-p)} \right)/(2(1-p)) =\\ &\frac12 \left(-1\pm\sqrt{\frac{1+3p}{1-p}} \right) \end{aligned} \] We have $p=0.01, so \(x=-1.01\) or \(x=0.01\), and therefore

\[\lambda=0.01\mu\]

Continuous-time Markov Chains with Infinite Statespace

Example (5.4.6)

Pure Birth process

Many web pages have a counter that keeps track of the number of people who have visited the site. We can model such a counter as a Markov Chain called a “Pure Birth” process. At time 0 there have been 0 visitors. Say at time t there have been \(X(t)=n\). The counter stays at n for time T that has an exponential distribution with rate \(\lambda\).

Example (5.4.7)

Birth and Death Processes

Consider a system whose state at any time is the number of “people” in the system. Suppose if there are n people in the system then

  1. new arrivals enter the system at an exponential rate \(\lambda_n\) (“births”)

  2. people leave the system at and exponential rate \(\mu_n\) (“deaths”)

  3. births and deaths occur independently of each other

Thus a birth and death process is a Markov chain with state-space {0,1,..} and

  1. \(v_0 = \lambda_0\)
  2. \(v_i = \lambda_i + \mu_i\)
  3. P01 = 1
  4. \(P_{i,i+1} = \lambda_i /(\lambda_i + \mu_i )\)
  5. \(P_{i,i-1} = \mu_i /(\lambda_i + \mu_i )\)

where 4) is because we go from i to i+1 if there is a birth before a death. Let \(X \sim Exp(\lambda), Y \sim Exp(\mu)\) and \(X \perp Y\). Now

\[ \begin{aligned} &P(X<Y) = \\ &\int_0^\infty P \left(X<Y|Y=y \right)f_Y(y)dy = \\ &\int_0^\infty P \left(X<y \right)f_Y(y)dy = \\ &\int_0^\infty \left(1-e^{-\lambda y}\right)\mu e^{-\mu y}dy = \\ &\int_0^\infty\mu e^{-\mu y}dy-\frac{\mu}{\lambda+\mu}\int_0^\infty(\lambda+\mu) e^{-(\lambda+\mu) y}dy=\\ &1-\frac{\mu}{\lambda+\mu}=\frac{\lambda}{\lambda+\mu} \end{aligned} \]

Example (5.4.8)

A simple epidemic model

Consider a population of m individuals that at time 0 consists of 1 “infected” and m-1 “susceptible” (individuals that might get infected, maybe because they have not been immunized. Once infected an individual remains so forever and we suppose that in any time interval h any given infected person will cause, with probability \(\alpha h+o(h)\) any given susceptible to become infected. If we let \(X(t)\) denote the number of infected people in the population at time t, \(X(t),t\ge 0\) is a pure birth process with

\[\lambda_n =(m-n)n \alpha, n=1,..,m-1\]

because if there are n infected people the m-n uninfected ones get infected at a rate of \(n \alpha\).

Let \(T_i\) be the time to go from i to i+1 infected, and let T be the time until the total population is infected. Now \(T_i\sim Exp((m-n)n\alpha)\) and so

\[ \begin{aligned} &T = \sum_{i=1}^{m-1} T_i\\ &E[T] = E \left[ \sum_{i=1}^{m-1} T_i\right]=\\ &\sum_{i=1}^{m-1} E[T_i] = \sum_{i=1}^{m-1} \frac1{(m-i)i\alpha} =\\ &\frac1{m\alpha} \sum_{i=1}^{m-1}\left(\frac1{m-i}+\frac1{i} \right)=\\ &\frac2{m\alpha} \sum_{i=1}^{m-1}\frac1i=\\ &\frac2{m\alpha} \sum_{i=1}^{m-1}\frac1i(i+1-i)\approx\\ &\frac2{m\alpha} \int_1^{m-1}\frac1xdx=\\ &\frac2{m\alpha} \log x|_1^{m-1}=\frac{2\log(m-1)}{m\alpha} \end{aligned} \]