continuous.markov.chains

Continuous-time Markov Chains

Definition (5.4.1)

Say $\{X(t), t \ge 0\}$ is a continuous-time stochastic process taking values on the nonnegative integers. Then $X(t),t\ge 0$ is a Markov chain if for all $s,t \ge 0$, and nonnegative integers $i,j, x(u), 0 \le u<s$ we have

\[P\left(X(t+s)=j|X(s)=i, X(u)=x(u);0\le u<s\right)=P\left(X(t+s)=j|X(s)=i\right)\]

The main result for such chains is

Theorem (5.4.2)

Let $\{X(t),t \ge 0\}$ be a continuous-time Markov chain with $X(0)=i$. Let $T_i$ be the time the process stays in state i. Then $T_i$ has an exponential distribution.

proof

\[ \begin{aligned} &P \left(T_i>s+t|T_i>s,X(0)=i \right) = \\ &P \left(X(u)=i,0<u<s+t|X(u)=i,0<u<s,X(0)=i \right) = \\ &P \left(X(u)=i,0<u<t|X(0)=i \right) = \\ &P(T_i>t) \end{aligned} \]

But this means that $T_i$ is memoryless! Of course $T_i$ is also non-negative and continuous, and therefore by theorem (2.2.3) $T_i$ has to have an exponential distribution.

With this we have the following characterization of a continuous-time Markov chain:

the amount of time spent in state i is an exponential distribution with mean $v_i$.
when the process leaves state i it next enters state j with some probability, say $P_{ij}$.

So a continuous-time Markov chain is a process that moves from state to state in accordance with a discrete-space Markov chain, but also spends an exponentially distributed amount of time in each state.

Notice that this means in a continuous-time Markov chain we always have $P_{kk}=0$, so the diagonal of the transition matrix is 0.

Let’s consider a finite-statespace continuous-time Markov chain, that is $X(t)\in \{0,..,N\}$. Let

\[P_{ij} (t) = P(X(t)=j|X(0)=i)\]

then the the Markov property asserts that $\{X(t),t \ge 0\}$ satisfies

$p_{ij}(t)\ge 0$
$\sum_{j=0}^{N}p_{ij}(t)=1$
$p_{ik}(s+t)=\sum_{j=0}^{N}p_{ij}(s)p_{jk}(t)$
$\lim_{h\downarrow 0} p_{ij}(h) = I(i=j)$

where c) follows from the Chapman-Kolmogorov equations.

d is not strictly a consequence of the Markov property but is usually a sensible additional condition.

Let P(t)=(p_ij) denote the matrix of transition probabilities at time t, so P is a matrix whose entries are functions of t.

Now c can be written as

\[P(s+t)=P(s)P(t)\text{ for all } t,s \ge 0\]

and d as

\[\lim_{h\rightarrow 0} P(h)=I\]

this implies that $P(t)$ is (right)-continuous at time 0, meaning each entry is continuous at $t=0$. Now

\[ \begin{aligned} &\lim_{h\downarrow 0} P(t+h) = \\ &\lim_{h\downarrow 0} P(t)P(h) = \\ &P(t)\lim_{h\downarrow 0} P(h) = P(t)\\ \end{aligned} \] let $0<h<t$, then

\[P(t) =P(t-h+h)=P(t-h)P(h)\] by d if $h$ is small $P(h)\approx I$, so $P(h)^{-1}$ exists and $P(h)^{-1}\approx I$, therefore $P(t-h)=P(t)P(h)^{-1}$ and

\[\lim_{h\downarrow 0} P(t-h)=P(t)\]

Therefore $P(t)$ is continuous for all $t \ge 0$. Actually, we have even more: let’s define

\[ \begin{aligned} &q_i := \lim_{h\downarrow 0}\frac{1-P_{ii}(h)}{h}<\infty\\ &q_{ij} := \lim_{h\downarrow 0}\frac{P_{ij}(h)}{h}<\infty \end{aligned} \] now $\sum_{j=0}^N P_{ij}(h)=1$ and so $\sum_{j=0,j\ne i}^N P_{ij}(h)=1-P_{ii}(h)$ and therefore

\[q_i=\sum_{j=0,j\ne i}^N q_{ij}\] which shows that $P(t)$ is even differentiable.

The rates $q_i$ and $q_{ij}$ give us a second way to describe a Markov chain, called the infinitesimal description:

\[ \begin{aligned} &P(X(t+h)=j|X(t)=i) = q_{ij}h+o(h)\\ &P(X(t+h)=i|X(t)=i) = 1-q_{i}h+o(h) \end{aligned} \]

Define

\[ \pmb{A} = \begin{pmatrix} -q_0 & q_{01} & ... & q_{0N}\\ q_{10} & -q_1 & ... & q_{1N}\\ \vdots & \vdots & ... &\vdots\\ q_{N0} & ... & ...& -q_N \end{pmatrix} \] then

\[ \begin{aligned} &A=\lim_{h\downarrow 0} \frac{P(h)-I}{h} = P'(0)\\ &\frac{P(t+h)-P(t)}{h} = \frac{P(t)P(h)-P(t)}{h}=P(t)\frac{P(h)-I}{h}\\ &P'(t) = \lim_{h\downarrow 0} \frac{P(t+h)-P(t)}{h} = AP(t)\\ \end{aligned} \] This is differential equation whose solution is the exponential function, and so

Theorem (5.4.3)

\[P(t)=\exp\{At\}=I+\sum_{n=1}^\infty\frac{A^nt^n}{n!}\]

proof calculation above

Example (5.4.4)

Two-state Chain

say $\{X(t), t \ge 0\}$ is a Markov chain with $X(t) \in \{0,1\}$ and \[ \pmb{A} = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix} \] We need $A^n$, so

\[ \begin{aligned} &\begin{vmatrix} -a-\lambda & a \\ b & -b-\lambda \end{vmatrix} = \\ &(a+\lambda)(b+\lambda)-ab = \\ &a\lambda+b\lambda+\lambda^2=\\ &(a+b+\lambda)\lambda = 0\\ \end{aligned} \] so the eigenvalues are $\lambda_1=0$ and $\lambda_2=a+b$. But with an eigenvalue $\lambda_1=0$ $A$ is not diagonalizable. Howeve note that

\[ \pmb{A}^2 = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix} \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}=\\ \begin{pmatrix} a^+ab & -a^2-ab \\ -ab-b^2 & b^2+ab \end{pmatrix}=\\ -(a+b) \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}=-(a+b)A \] and so $A^{n}=(-1)^{n-1}(a+b)^{n-1}A$ and so

\[ \begin{aligned} &P(t)=I+\sum_{n=1}^\infty\frac{A^nt^n}{n!} = \\ &I+\sum_{n=1}^\infty\frac{(-1)^{n-1}(a+b)^{n-1}At^n}{n!} = \\ &I-\frac1{a+b}\sum_{n=1}^\infty\frac{(-1)^{n}(a+b)^{n}t^n}{n!}A = \\ &I-\frac1{a+b}\left[\sum_{n=0}^\infty\frac{(-1)^{n}(a+b)^{n}t^n}{n!}-1\right]A = \\ &I-\frac1{a+b}\left[\exp\{-(a+b)t\}-1\right]A = \\ &I+\frac1{a+b}A-A\frac{e^{-(a+b)t}}{a+b} \end{aligned} \] —

It is easy to find the stationary distribution of a continuous-time discrete-space Markov chain in terms of the infinitesimal matrix. If all states communicate, that is if $P_{ij}(t)>0$ for all $i,j$ and some $t>0$, then

\[\lim_{t\rightarrow \infty} P_{ij} (t) = \pi_j >0\]

exists, and

\[\lim_{t\rightarrow \infty} P'(t) = \lim_{t\rightarrow \infty} AP(t)=A\pi\]

but

\[\lim_{t\rightarrow \infty} P'(t) =0\] otherwise the chain would never leave $i$, and so we have $\pi A=0$ or

\[\pi_j q_j = \sum_{i \ne j} \pi_i q_{ij}\]

$j=0,..,N$.

Example (5.4.5)

Redundancy

A company has a computer for its website. If the computer is down they can’t sell anything, so they have a backup, which takes over if the first computer is down. The operating computer fails after an exponentially distributed time (with rate $\mu$). Repair times are also exponentially distributed (with rate $\lambda$). Let’s assume that $\mu$ is fixed but we have a choice of $\lambda$ (by hiring more technicians). We want to make sure that in the long run at most 1% of the time both computers are down. How should we choose $\lambda$?

Let $X(t)$ be the number of computers in operating condition at time t, so $X(t) \in\{ 0, 1, 2\}$. Then $X(t)$ is a Markov chain with infinitesimal matrix

\[ A= \begin{bmatrix} -\lambda & \lambda & 0 \\ \mu & -(\lambda+\mu) & \lambda \\ 0 & \mu & -\mu \\ \end{bmatrix} \]

What is the average “total downtime”, that is the time when neither computer is working? The system of equations for the stationary distribution is

\[ \begin{aligned} &\lambda\pi_0=\mu\pi_1 \\ &(\lambda+\mu)\pi_1 =\mu\pi_1+\lambda\pi_2 \\ &\mu \pi_2 =\lambda \pi_1 \\ &\pi_0+\pi_1+\pi_2=1\\ &\\ &\pi_1=\frac{\lambda}{\mu}\pi_0\text{ ; }\pi_2=\frac{\lambda}{\mu}\pi_1=(\frac{\lambda}{\mu})^2\pi_0\\ &1=\pi_0+\frac{\lambda}{\mu}\pi_0+(\frac{\lambda}{\mu})^2\pi_0= \left(1+\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2 \right)\pi_0\\ &\pi_i=\frac{(\frac{\lambda}{\mu})^i}{1+\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2}\\ &\lim_{t\rightarrow\infty}P(X(t)>0)=1-\pi_0=\frac{\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2}{1+\frac{\lambda}{\mu}+(\frac{\lambda}{\mu})^2} \end{aligned} \] and we see that the probability only depends on the ratio $\lambda/\mu$. Setting $x=\lambda/\mu$, we find

\[ \begin{aligned} &p = \frac{x+x^2}{1+x+x^2}\\ &(1+x+x^2)p = x+x^2\\ &(1-p)x^2+(1-p)x-p = 0\\ &x_{1,2}= \left(1-p\pm\sqrt{(1-p)^2-4(1-p)(-p)} \right)/(2(1-p)) =\\ &\frac12 \left(-1\pm\sqrt{\frac{1+3p}{1-p}} \right) \end{aligned} \] We have $p=0.01, so $x=-1.01$ or $x=0.01$, and therefore

\[\lambda=0.01\mu\]

Continuous-time Markov Chains with Infinite Statespace

Example (5.4.6)

Pure Birth process

Many web pages have a counter that keeps track of the number of people who have visited the site. We can model such a counter as a Markov Chain called a “Pure Birth” process. At time 0 there have been 0 visitors. Say at time t there have been $X(t)=n$. The counter stays at n for time T that has an exponential distribution with rate $\lambda$.

Example (5.4.7)

Birth and Death Processes

Consider a system whose state at any time is the number of “people” in the system. Suppose if there are n people in the system then

new arrivals enter the system at an exponential rate $\lambda_n$ (“births”)
people leave the system at and exponential rate $\mu_n$ (“deaths”)
births and deaths occur independently of each other

Thus a birth and death process is a Markov chain with state-space {0,1,..} and

$v_0 = \lambda_0$
$v_i = \lambda_i + \mu_i$
P₀₁ = 1
$P_{i,i+1} = \lambda_i /(\lambda_i + \mu_i )$
$P_{i,i-1} = \mu_i /(\lambda_i + \mu_i )$

where 4) is because we go from i to i+1 if there is a birth before a death. Let $X \sim Exp(\lambda), Y \sim Exp(\mu)$ and $X \perp Y$. Now

\[ \begin{aligned} &P(X<Y) = \\ &\int_0^\infty P \left(X<Y|Y=y \right)f_Y(y)dy = \\ &\int_0^\infty P \left(X<y \right)f_Y(y)dy = \\ &\int_0^\infty \left(1-e^{-\lambda y}\right)\mu e^{-\mu y}dy = \\ &\int_0^\infty\mu e^{-\mu y}dy-\frac{\mu}{\lambda+\mu}\int_0^\infty(\lambda+\mu) e^{-(\lambda+\mu) y}dy=\\ &1-\frac{\mu}{\lambda+\mu}=\frac{\lambda}{\lambda+\mu} \end{aligned} \]

Example (5.4.8)

A simple epidemic model

Consider a population of m individuals that at time 0 consists of 1 “infected” and m-1 “susceptible” (individuals that might get infected, maybe because they have not been immunized. Once infected an individual remains so forever and we suppose that in any time interval h any given infected person will cause, with probability $\alpha h+o(h)$ any given susceptible to become infected. If we let $X(t)$ denote the number of infected people in the population at time t, $X(t),t\ge 0$ is a pure birth process with

\[\lambda_n =(m-n)n \alpha, n=1,..,m-1\]

because if there are n infected people the m-n uninfected ones get infected at a rate of $n \alpha$.

Let $T_i$ be the time to go from i to i+1 infected, and let T be the time until the total population is infected. Now $T_i\sim Exp((m-n)n\alpha)$ and so

\[ \begin{aligned} &T = \sum_{i=1}^{m-1} T_i\\ &E[T] = E \left[ \sum_{i=1}^{m-1} T_i\right]=\\ &\sum_{i=1}^{m-1} E[T_i] = \sum_{i=1}^{m-1} \frac1{(m-i)i\alpha} =\\ &\frac1{m\alpha} \sum_{i=1}^{m-1}\left(\frac1{m-i}+\frac1{i} \right)=\\ &\frac2{m\alpha} \sum_{i=1}^{m-1}\frac1i=\\ &\frac2{m\alpha} \sum_{i=1}^{m-1}\frac1i(i+1-i)\approx\\ &\frac2{m\alpha} \int_1^{m-1}\frac1xdx=\\ &\frac2{m\alpha} \log x|_1^{m-1}=\frac{2\log(m-1)}{m\alpha} \end{aligned} \]