A gentle introduction to MCMC

A markov chain monte carlo process can be described as person exploring an unknown landscape in complete darkness, trying on the hand to stay on as high altitudes as possible, and on the other hand to look for places with even higher altitude by walking around. The person effectively has no memory, a step is evaluated only against the current position. If the new position is higher, the step is taken, if it is not, then it is most often not taken, but occassionally the step is taken even if it leads to a lower altitude. By accepting worse positions the person gains the possibility to find even higher places beyond the valley. If worse positions were never accepted, the first local maxima found would be the end of the exploration.

Markov chain monte carlo processes can be used to find optimal parameter estimates in regression analysis. In that case, the log-likelihood of the current parameter estimates, given the observed data, corresponds to the altitude of the current position. (For each parameter to estimate, the explored space needs one dimension, and then there is one dimension for the log-likelihood, so a random walk to explore two parameters would result in a three dimensional space).

Below is the source code and an animation that visualises a random walk, where x and y are changed randomly (the change in x and change in y for each step is drawn from a normal distribution with mean 0 and standard deviation 0.5), while z is evaluated ("observed") at each position.

x <- y <- seq(from = -10, to = 10, by = 0.45)
z <- outer(x, y, function(x, y) { sin(sqrt(x^2+ y^2))/sqrt(x^2+ y^2) } )
res <- persp(x, y, z, zlim = c(-0.4, 1.5), phi = 30, theta = 90)
my.x <- -1.45456472411752 # runif(n = 1, min = -10, max = 10)
my.y <- 7.12612085510045 # runif(n = 1, min = -10, max = 10)
my.z <- sin(sqrt(my.x^2+ my.y^2))/sqrt(my.x^2+ my.y^2)
n.iter <- 5000
my.chain <- matrix(NA, ncol = n.iter, nrow = 3)
my.delta.x <- rnorm(n = n.iter, mean = 0, sd = 0.5)
my.delta.y <- rnorm(n = n.iter, mean = 0, sd = 0.5)
proposed.x <- proposed.y <- 0
for(i in 1:n.iter){
    proposed.x <- my.x + my.delta.x[i]
    proposed.y <- my.y + my.delta.y[i]
    proposed.z <- sin(sqrt(proposed.x^2+ proposed.y^2))/
                      sqrt(proposed.x^2+ proposed.y^2)
    if(min(proposed.z, my.z) < 0){
        offset <- abs(min(proposed.z, my.z)) + 0.2
    } else {
        offset <- 0.2
    }
    if((proposed.x < max(x) & proposed.x > min(x) & proposed.y < max(y)
        & proposed.y > min(y))
     & (proposed.z > my.z | sample(x = c(TRUE, FALSE), size = 1,
              prob = c((offset + proposed.z)/(offset + my.z),
                       (offset + my.z)/(offset + proposed.z))))){
        my.x <- proposed.x
        my.y <- proposed.y
        my.z <- proposed.z
        my.chain[, i] <- c(my.x, my.y, my.z)
    } else {
        my.chain[, i] <- c(my.x, my.y, NA)
    }
}
moves <- my.chain[, which(is.na(my.chain[3, ]) == FALSE)]
image.counter <- 0
worm.positions <- matrix(NA, nrow=5, ncol=2)
known.z <- matrix(NA, ncol=ncol(z), nrow=nrow(z))
update.known.z <- function(known.z, this.grid.cell){
    known.z[c(this.grid.cell[['x0']], this.grid.cell[['x1']]),
            c(this.grid.cell[['y0']], this.grid.cell[['y1']])] <-
          z[c(this.grid.cell[['x0']], this.grid.cell[['x1']]),
            c(this.grid.cell[['y0']], this.grid.cell[['y1']])]
    return(known.z)
}
for(i in 1:2500){
    n.steps <- 10 # each move is divided into ten steps (to get smooth motion in the video).
    local.x <- seq(from = moves[1, i], to = moves[1, i+1], length.out = n.steps)
    local.y <- seq(from = moves[2, i], to = moves[2, i+1], length.out = n.steps)
    local.z <- sin(sqrt(local.x^2+ local.y^2))/sqrt(local.x^2+ local.y^2)
    for(j in 1:n.steps){
        image.counter <- image.counter + 1
        par(new=TRUE)
        png(filename=paste(image.counter, "png", sep = "."), width=768, height=768)
        this.grid.cell <- c(y0=tail(which(y <= local.y[j]), 1),
                            y1=head(which(y >= local.y[j]), 1),
                            x0=tail(which(x <= local.x[j]), 1),
                            x1=head(which(x >= local.x[j]), 1))
        known.z <- update.known.z(known.z, this.grid.cell)
        new.res <- persp(x, y, known.z, zlim = c(-0.4, 1.5), phi = 30, theta = 90,
                                                            box=FALSE, axes=FALSE)
        worm.positions[2:5, ] <- worm.positions[1:4, ] # shift
        worm.positions[1, ] <- unlist(trans3d(x=local.x[j],
                                      y=local.y[j], z=local.z[j], pmat = new.res))
        points(x=worm.positions[, 1], y=worm.positions[, 2],
               col = rgb(red = 1, green = 0, blue = 0, alpha = 0.5), pch = 20)
        dev.off()
    }
}

The image-files can be put together into a movie with the following command:

avconv -framerate 25 -i %d.png -c:v libx264 mcmc.mov

Here is an example:

comments powered by Disqus

Back to the index

Blog roll

R-bloggers, Debian Weekly

Last modified: oktober 12, 2017