Standard Methods

Random Numbers

Everything starts with generating X1, X2, .. iid U[0,1]. These are simply called random numbers. Here are some ways to get them:

• random number tables
• numbers taken from things like the exact (computer) time
• quantum random number generators
• …

Currently none of these is both practically useful and affordable.

Pseudo-Random Numbers

These are numbers that look random, smell random …

Of course a computer cannot do anything truly random, so all we can do is generate X1, X2, .. that appear to be iid U[0,1], so-called pseudo-random numbers.
Luckily, some people are really good at that!

Example

A linear congruential generator works as follows: start with a seed X0 and calculate

Xn+1 = (a·Xn + c) mod m

where a, c and m are chosen such that

• c and m are relatively prime
• a-1 is divisible by all prime factors of m
• a-1 is a multiple of 4 if m is a multiple of 4

A well known PRNG given in Numerical Recipes in C uses a = 1664525, c = 1013904223 and m = 2^32.
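Here is a minimal sketch of this generator in R (the function name lcg is mine):

lcg <- function(n, seed = 0) {
  a <- 1664525; cc <- 1013904223; m <- 2^32
  u <- numeric(n)
  x <- seed
  for (i in 1:n) {
    x <- (a*x + cc) %% m    # stays exact in double precision (a*x + cc < 2^53)
    u[i] <- x/m             # scale to [0, 1)
  }
  u
}
lcg(5)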

Some computer programs (like R) have them already built in; most general-purpose computer languages (like C) do not. There are many excellent ones available on the Internet.

Some issues to be aware of:

All pseudo random number generators are cyclic, that is there is an N such that XN+1=X1, XN+2=X2 and so on. For any decent method we have N in the billions. For the one above N = m = 2^32.

All pseudo random number generators have a seed, usually an integer. If you want to generate the same sequence again you can do so by specifying this seed.
In R this is done with the command set.seed(SEED), where SEED is an integer.
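For example (the seed 1234 is arbitrary):

set.seed(1234)
runif(3)
set.seed(1234)
runif(3)   # the same three numbers again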

There are often subtle differences between compilers so don’t expect the same program to generate the same sequence on different computers.

Built in Routines

As a statistics program R has routines built in to generate most standard r.v.'s. They usually start with r followed by the name of the distribution, for example rnorm.
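For example:

rnorm(5)                           # 5 standard normals
runif(5, -1, 1)                    # 5 uniforms on [-1, 1]
rbinom(5, size=10, prob=0.5)       # 5 binomials B(10, 0.5)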

Generating Discrete Random Variables - The Inverse Transform Method

Say we want to generate a random variable X from a discrete distribution with density f(xj) = P(X=xj) = pj, j=1,2,…
Here is a simple algorithm to do this:

Step 1: generate U~U[0,1], set j=1, p=p1
Step 2: if U<p, set X=xj, done
Step 3: set j=j+1, p=p+pj, goto Step 2

Why this works:

Say, for example, that p1 < U < p1+p2. The algorithm does not stop at j=1 (because U ≥ p1) but it does stop at j=2, so we set X=x2. Now

P(X=x2) = P(p1 < U ≤ p1+p2) = p2

and in general

P(X=xj) = P(p1+..+pj-1 < U ≤ p1+..+pj) = pj

Here is a very simple routine that implements this algorithm:

gendisc <- function (n, x, p) {
  y = rep(0, n)
  m = length(x)
  cdf = cumsum(p)       # cumulative probabilities p1, p1+p2, ..
  for (i in 1:n) {
    U = runif(1)
    for (j in 1:m) {    # find the first j with U < p1+..+pj
      if (U < cdf[j]) {
          y[i] = x[j]
          break
      }
    }
  }
  y
}

Example

Let's generate data from the following distribution:

x     1     2     3     4     5     6     7     8     9    10
p  0.01  0.02  0.10  0.04  0.30  0.22  0.10  0.01  0.01  0.19
x=1:10  
p=c(1,2,10,4,30,22,10,1,1,19)/100  
table(gendisc(10000, x, p))/10000
## 
##      1      2      3      4      5      6      7      8      9 
## 0.0112 0.0215 0.1009 0.0397 0.2903 0.2281 0.0947 0.0083 0.0097 
##     10 
## 0.1956

If the rv has an infinite sample space we first need to find a “maximal” value:

Example

Generate data from a Geom(0.1), that is P(X=x) = 0.1·0.9^(x-1), x=1,2,..

Let's say we want to generate 10000 observations. Then we need an x such that P(max{Xi; i=1,..,10000} > x) < ε (say ε = 10^-5). Now

P(max{Xi} > x) ≤ n·P(X1 > x) = n·(1-p)^x < ε

so we need x > log(ε/n)/log(1-p), and with ε = 10^-5, n = 10000 and p = 0.1 we find x = 197.
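In R this bound can be computed directly:

ceiling(log(1e-5/1e4)/log(0.9))
## [1] 197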

So

x <- gendisc(10000,1:197, 0.1*0.9^(0:196))
head(table(x))/10000
## x
##      1      2      3      4      5      6 
## 0.1006 0.0872 0.0791 0.0720 0.0639 0.0634
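As a check, here are the true probabilities p(1-p)^(x-1) for x=1,..,6 (note that R's dgeom counts the number of failures, starting at 0):

round(dgeom(0:5, 0.1), 4)
## [1] 0.1000 0.0900 0.0810 0.0729 0.0656 0.0590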

The Accept-Reject Algorithm

Suppose we have a method for generating a random variable having density {qj, j=1,2,..} and we want to generate a r.v. from a distribution with density {pj, j=1,2,..}. We can do this by first simulating a r.v. Y from {qj} and then "accepting" this simulated value with probability proportional to pY/qY.

Specifically, let c be a constant such that pj/qj ≤ c for all j such that pj > 0. Then

Step 1: generate Y from density {qj}
Step 2: generate U~U[0,1]
Step 3: If U < pY/(cqY), set X = Y and stop. Otherwise go back to 1

Example

Say we want to generate a r.v. X with values x in {1,3,5,7} and probabilities p=c(0.1,0.5,0.1,0.3). First we need a r.v. Y which is easy to generate and takes 4 values (not necessarily the same as those of X!). We can use the r.v. that chooses a number from 1 to 4 at random, using the sample(1:4,1) command. This has density q=(1/4,1/4,1/4,1/4), so

p/q=(0.4,2,0.4,1.2)

and if we set c=2 (=max{p/q}) we have

pj/qj ≤ c for all j.

With this the accept-reject algorithm for this problem is implemented in accrej1:

accrej1 <- 
function (n)
{
    x=c(1,3,5,7)
    p=c(0.1,0.5,0.1,0.3)
    z <- rep(0, n)
    m <- length(x)
    q <- rep(1/4, 4)
    for (i in 1:n) {
        for (j in 1:100) {                   # at most 100 tries per observation
            U <- runif(1)
            Y <- sample(1:4, 1)              # candidate index, drawn from q
            if (U < (p[Y]/(2 * q[Y]))) {     # accept with probability p[Y]/(c*q[Y]), c=2
                z[i] <- x[Y]
                break
            }
        }
    }
    z
}

Why this works: if our "candidate" r.v. Y picks a value y which has a high probability under p, it will often be accepted and we get many of these. If on the other hand Y picks a value which has a low probability under p, it will rarely be accepted and we get only a few of those. The method is illustrated in accrej.ill:

accrej.ill <- function (n) 
{
    x <- c(1, 3, 5, 7)
    p <- c(0.1, 0.5, 0.1, 0.3)
    q <- rep(1/4, 4)
    X <- rep(0, n)
    plot(c(0, 8), c(0, 1), type = "n", xlab = "x", ylab = "P/(cq)")
    segments(x - 0.2, 2 * p, x + 0.2, 2 * p, lwd = 3)
    for (i in 1:n) {
        for (j in 1:100) {
            Y <- sample(1:4, 1)
            U <- runif(1)
            if (U < p[Y]/(2 * q[Y])) {
                X[i] <- x[Y]
                points(x[Y], U, pch = "x")
                break
            }
            else {
                points(x[Y], U, pch = "o")
            }
        }
    }
    X
}
accrej.ill(10)

##  [1] 3 3 5 3 3 3 3 7 3 7

Theorem
The accept-reject algorithm generates a r.v. X such that P(X=xj) = pj.
In addition, the number of iterations of the algorithm needed until X is found is a geometric r.v. with mean c.

proof

For each iteration

P(accept and Y = xj) = P(Y = xj)·P(U < pj/(c·qj)) = qj·pj/(c·qj) = pj/c

and summing over j gives P(accept) = 1/c. Now each iteration is a Bernoulli trial with success probability 1/c, and successive trials are independent. Therefore the number of trials needed until the first success is a geometric r.v. with mean c. Also

P(X = xj) = P(Y = xj | accept) = (pj/c)/(1/c) = pj

Example:

Say we want to generate r.v.'s X with P(X=x) = c·x·p^x, x=1,2,.. First we need the normalizing constant c. Because the probabilities have to add up to 1 and sum{x·p^x; x=1,2,..} = p/(1-p)^2, we get c = (1-p)^2/p, so

P(X=x) = (1-p)^2·x·p^(x-1), x=1,2,..

Now what can we use as the reference distribution Y? We need a distribution with values on {1,2,..}, and an obvious choice is Y~Geom(r), that is q_x = P(Y=x) = r·(1-r)^(x-1), x=1,2,.. Then

p_x/q_x = [(1-p)^2/r]·x·[p/(1-r)]^(x-1)

How to choose r? Write s = 1-r. Clearly we need p < s < 1, otherwise p_x/q_x → ∞. How about s = (p+1)/2? Then r = 1-(p+1)/2 and

p_x/q_x = [(1-p)^2/(1-(p+1)/2)]·x·[2p/(p+1)]^(x-1)

What is max{p_x/q_x; x=1,2,..}? Let's use R to find out:

p=0.5; q=1-p; x=1:20; plot(x, q^2/(1-(p+1)/2)*x*(2*p/(p+1))^(x-1))

and repeating this with different values of p shows that there is a unique maximum. We can therefore find c by calculating p_x/q_x successively until

p_x/q_x > p_(x+1)/q_(x+1)

Find c as described above, set r = 1-(p+1)/2 and then

generate Y~Geom(r)
generate U~U[0,1]
if U < [(1-p)^2/(r·c)]·Y·[2p/(p+1)]^(Y-1) set X=Y, otherwise try again

accrej.geom <- function (p)
{
    x=1:20
    plot(x,(1-p)^2/(1-(p+1)/2)*x*(2*p/(p+1))^(x-1)) 
    # find c = max p_x/q_x by increasing x until the ratio starts to decrease
    x=1
    a1=(1-p)^2/(1-(p+1)/2)*x*(2*p/(p+1))^(x-1)
    repeat {
        x=x+1
        a2=(1-p)^2/(1-(p+1)/2)*x*(2*p/(p+1))^(x-1)
        if(a1>a2) {
            cc=a1
            break
        }
        a1=a2
    }        
    print(c(x-1,a1))
    n=10000
    x <- rep(0, n)
    for (i in 1:n) {
        repeat {
            U <- runif(1)
            # rgeom counts failures, so Y takes values 0,1,2,..;
            # Y=0 has acceptance probability 0 and is always rejected
            Y <- rgeom(1,1-(p+1)/2)
            if (U < (1-p)^2/(1-(p+1)/2)/cc*Y*(2*p/(p+1))^(Y-1)) {
                x[i] <- Y
                break
            }
        }
    }
    # compare relative frequencies (row 1) with true probabilities (row 2)
    z=table(x)/n
    x=as.numeric(names(z))
    A=matrix(0,2,length(x))
    dimnames(A)[[2]]=x
    A[1,]=z
    A[2,]=(1-p)^2*x*p^(x-1)
    round(A,4)
}
accrej.geom(0.3)

## [1] 1.0 1.4
##          1      2      3      4      5      6      7     8
## [1,] 0.485 0.2964 0.1317 0.0550 0.0206 0.0078 0.0024 8e-04
## [2,] 0.490 0.2940 0.1323 0.0529 0.0198 0.0071 0.0025 9e-04
##          9    10
## [1,] 2e-04 1e-04
## [2,] 3e-04 1e-04

Continuous Distributions

This is very similar to (actually the same as) the method for discrete r.v.'s. Assume we want to generate a r.v. X with density f. We have a way to generate a r.v. Y with density g. Let c be a constant such that f(x)/g(x) ≤ c for all x. Then the accept-reject algorithm is as follows:
Step 1: generate Y from pdf g
Step 2: generate U~U[0,1]
Step 3: If U < f(Y)/(cg(Y)), set X = Y and stop. Otherwise go back to 1

We have the same theorem as for the discrete case:
Theorem
The accept-reject algorithm generates a r.v. X with density f.
In addition, the number of iterations of the algorithm needed until X is found is a geometric r.v. with mean c.

proof same as above.

Note: we do have to be careful because g(x) can of course be 0. This is fine as long as f(x) is 0 as well, but not if f(x) > 0. Basically we need Y to live on the same set as X (we say X and Y have the same support).

Example

Generate a r.v. X with density f(x) = 6x(1-x), 0<x<1.
Here we can use Y~U[0,1], with g(x) = 1. Let's find max{f(x)/g(x); 0<x<1}. Taking the derivative we find d/dx [6x(1-x)] = 6-12x = 0, so x=1/2. This is a maximum (the second derivative is -12 < 0), so max{f(x)/g(x); 0<x<1} = f(1/2) = 3/2 = c.
Note: f(x)/(c·g(x)) = 6x(1-x)/(3/2·1) = 4x(1-x)
The algorithm is implemented in accrej3:

accrej3 <- function (n, findc = F, Show = F) 
{
    if (findc) {
        x <- seq(0, 1, length = 100)
        plot(x, 6 * x * (1 - x), type = "l")
        return(max(6 * x * (1 - x)))
    }
    X <- rep(0, n)
    for (i in 1:n) {
        for (j in 1:100) {
            Y <- runif(1)
            if (runif(1) <= 4 * Y * (1 - Y)) {
                X[i] <- Y
                break
            }
        }
    }
    if (Show) {
        hist(X, probability = T)
        x <- seq(0, 1, length = 100)
        lines(x, 6 * x * (1 - x))
    }
    X
}
hist(accrej3(1e4), 50, freq=FALSE,
     main="", xlab="", ylab="")
curve(6*x*(1-x), 0, 1, lwd=2, col="blue", add=TRUE)

Why the accept-reject algorithm works is illustrated for this example in accrej1.ill:

accrej1.ill <- function (n) 
{
    f <- function(x) {
        6 * x * (1 - x)
    }
    g <- function(x) {
        1
    }
    c <- 3/2
    x <- seq(0, 1, length = 100)
    plot(x, f(x)/(c * g(x)), xlab = "x", ylab = "f/(cg)", type = "l", 
        xlim = c(0, 1), ylim = c(0, 1))
    for(i in 1:n) {
      Y <- runif(1)
      U <- runif(1)
      if (U <= f(Y)/(c * g(Y))) {
        points(Y, U, pch = "X", col = 3)
      }
      else {
        points(Y, U, pch = "0", col = 2)
      }
    }
}
accrej1.ill(20)

Example

Generate a r.v. X ~ Gamma(3/2, 1), that is with density f(x) = x^(1/2)·e^(-x)/Γ(3/2), x>0.

Now f(x)>0 for x>0, so we need a Y that “lives” on (0, ∞) and that we already know how to generate, for example an exponential r.v.

What value of λ should we use? Note that EX = 3/2 and, if Y is exponential with mean λ (so g(x) = (1/λ)·e^(-x/λ)), EY = λ; so maybe λ=3/2 is a good idea.
With this we need to find c. Again we try to find max{f(x)/g(x)}. We have

f(x)/g(x) = [x^(1/2)·e^(-x)/Γ(3/2)] / [(2/3)·e^(-2x/3)] = (3/(2Γ(3/2)))·x^(1/2)·e^(-x/3)

and setting the derivative to 0 gives x = 3/2, so (using Γ(3/2) = √π/2)

c = (3/(2Γ(3/2)))·(3/2)^(1/2)·e^(-1/2) = 3√3/√(2πe) ≈ 1.257

Above we picked λ=3/2 because this matches up the means of X and Y, a reasonable choice. But is it an optimal choice? Optimal here means a choice of λ that minimizes c. If we repeat the above calculation with a general λ > 1 we find that f(x)/g(x) is maximized at x = λ/(2(λ-1)), so

c(λ) = (2/√π)·λ^(3/2)/√(2(λ-1))·e^(-1/2)

Setting dc/dλ = 0 gives λ = 3/2. This is indeed a minimum (c(λ) → ∞ as λ → 1 or as λ → ∞), and so our choice was in fact optimal.
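Here is a minimal sketch of this accept-reject step in R (the function name gamma32 is mine); with λ=3/2 the acceptance probability f(Y)/(c·g(Y)) simplifies to √(2Y/3)·e^(1/2 - Y/3):

gamma32 <- function(n) {
  x <- rep(0, n)
  for (i in 1:n) {
    repeat {
      Y <- rexp(1, rate = 2/3)                # exponential with mean 3/2
      U <- runif(1)
      if (U < sqrt(2*Y/3)*exp(1/2 - Y/3)) {   # f(Y)/(c g(Y))
        x[i] <- Y
        break
      }
    }
  }
  x
}
hist(gamma32(1e4), 50, freq=FALSE, main="", xlab="")
curve(dgamma(x, 3/2, 1), 0, 8, lwd=2, col="blue", add=TRUE)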

Generating Random Vectors

The Accept-Reject algorithm can be used here as well:

Example

Generate a random vector (X,Y) from the joint density f(x,y) = 6x for 0≤x<y≤1, 0 otherwise.

(X,Y) takes values in [0,1]x[0,1], so our proposal rv has to do the same. Let’s use (U1,U2) uniform on [0,1]x[0,1]. Then

c = max{f(x,y)/g(x,y)} =max{6x;0<x,y<1} =6, so f(x,y)/[cg(x,y)]=x if x<y, 0 otherwise

and so the algorithm is

Step 1: generate (U1,U2) uniform on [0,1]x[0,1]
Step 2: generate U~U[0,1]
Step 3: If U < U1 and U1 < U2, set (X,Y) = (U1,U2) and stop. Otherwise go back to 1

How can we verify that this actually generates the correct data? Here is one check: recall the marginals

fX(x) = 6x(1-x), 0<x<1

fY(y) = 3y^2, 0<y<1

accrej.vec generates the data, draws histograms of X and Y and adds these curves:

accrej.vec <- function (which=1, n=1000)
{
    par(mfrow=c(2,2))
    x <- matrix(0, n, 2)
    if(which==1) {
      # accept-reject: (U[2],U[3]) is the candidate, U[1] the acceptance uniform
      for (i in 1:n) {
        repeat {
            U <- runif(3)
            if (U[1] < U[2] & U[2] < U[3]) {
                x[i,] <- U[2:3]
                break
            }
        }
      }
    }  
    else {
          # direct generation: Y ~ Beta(3,1), then X = Y*V with V ~ Beta(2,1)
          x[,2]=rbeta(n,3,1)
          x[,1]=x[,2]*rbeta(n,2,1)
    }      
    plot(x)
    z=seq(0,1,length=200)
    hist(x[,1],breaks=100,freq=F)
    lines(z,6*z*(1-z),lwd=2)      # marginal of X
    hist(x[,2],breaks=100,freq=F)
    lines(z,3*z^2,lwd=2)          # marginal of Y
}
accrej.vec()

Generating Stochastic Processes

There is not much new here. In general stochastic processes have to be generated “one step at a time”, because often Xn+1 depends on Xn.

Example

Consider a Markov chain {Xn, n≥0} with state space S = {0,1,2} and transition matrix (with q = 1-p)

        0  1  2
    0 ( 0  1  0 )
    1 ( q  0  p )
    2 ( 0  1  0 )

Use simulation to verify that the stationary distribution is πT = (q/2, 1/2, p/2).

mc1 <- function (p, B=10000) 
{
      finalstate=rep(0,B)
      # stop each run at a random time between 21 and 30 (the chain is periodic)
      n=sample(c(21:30),size=B,replace=T)
      for(i in 1:B) {
          x=0                       # start in state 0
          for(j in 2:n[i]) {
              # from state 1 go to 0 or 2; from states 0 and 2 go back to 1
              if(x==1) x=sample(c(0,2),size=1,prob=c(1-p,p))
              else x=1
          }    
          finalstate[i]=x
      }    
      print(finalstate[1:10])
      print("pi - Simulation")
      print(table(finalstate)/B)
      print("pi - True")      
      c(1-p,1,p)/2
}
mc1(0.2)
##  [1] 0 2 0 1 1 0 0 0 1 1
## [1] "pi - Simulation"
## finalstate
##      0      1      2 
## 0.3983 0.5038 0.0979 
## [1] "pi - True"
## [1] 0.4 0.5 0.1

Note: because the chain is periodic we need to check the final state at a random time: starting at 0 the chain is in state 1 at every even time and in state 0 or 2 at every odd time, so stopping at a fixed time would never show the stationary distribution.

Example

Consider the following process: it starts at 0 at time 0, stays there for an exponential time with rate 1, then jumps to 1, stays there for an exponential time with rate 1, then jumps to 2, and so on. In homework 9 we showed that

P0n(t) = P(X(t) = n | X(0) = 0) = t^n·e^(-t)/n!

Write a simulation that verifies this.

Generating a realization of this process is easy:

x=cumsum(rexp(n,1))

Now we need to check where the process is at time t.

length(x[x<t])

will tell us

mc2 checks all “reasonable” values of n for a fixed t.

mc2 <- function (t, B=10000) 
{
      n=round(2*t)                 # check the values j = 1, .., 2t
      counter=rep(0,n)
      for(i in 1:B) {
          # jump times; 3t of them are enough since we only tally counts up to 2t
          x=cumsum(rexp(3*round(t),1))
          for(j in 1:n) {
              if(length(x[x<t])==j) counter[j]=counter[j]+1
          }    
      }    
      # relative frequencies next to the true probabilities t^j e^(-t)/j!
      cbind(counter/B,t^c(1:n)*exp(-t)/factorial(c(1:n)))
}
mc2(2)
##        [,1]       [,2]
## [1,] 0.2793 0.27067057
## [2,] 0.2726 0.27067057
## [3,] 0.1762 0.18044704
## [4,] 0.0877 0.09022352