Metropolis-Hastings Algorithm

A Markov Chain Monte Carlo (MCMC) algorithm used to generate samples from a target distribution \(\pi\) on some state space \(\mathcal{X}\). At each step \(i\) of the algorithm, we generate a sample (proposal) from some proposal distribution, i.e. \(x_i \sim \pi_p(x_i|x_{i-1})\), also called the kernel/transition kernel, which satisfies the property that the support of the target distribution is contained it its support, i.e. \(\text{Support}(\pi) \subseteq \cup_{x} \text{Support}(\pi_p(\cdot |x))\). This proposed value, which proposes to change the state from \(x_{i-1}\) to \(x_i\), is then accepted or rejected with a certain probability. \(A = \min(1, \frac{\tilde{\pi}(x_i)\pi_p(x_{i-1}|x_{i})}{\tilde{\pi}(x_{i-1})\pi_p(x_{i}|x_{i-1})})\), and \(\pi(x)= \frac{\tilde{\pi}(x)}{Z_\pi}\).

To get an intuition for this probability, first assume that the proposal distribution is symmetric, then we have \(A = \min(1, \frac{\tilde{\pi}(x_i)}{\tilde{\pi(x_{i-1})}})\), meaning the new state \(x_i\) is accepted if the odds of being in the proposed value are higher, but also, if the opposite holds, there is a non-zero probability of accepting the new state \(x_i\) even if \(\tilde{\pi}(x_i)<\tilde{\pi}(x_{i-1})\). Intuitively it is easy to see that the chain thus will spend most of its time in states with higher probability mass.

When \(\pi_p\) is not symmetric, the general form of \(A\) includes the Hastings correction which corrects for the fact that the proposal favours certain states by definition of being not symmetric.

The algorithm is summarized below.

Initialize X0 for i=1,2,3… X=Xi Xprop = Q(Xprop|Xi).sample A = min(1, P(Xprop)Q(Xi|Xprop)/P(Xi)Q(Xprop|Xi)) U = Uniform[0,1].sample if U<A: Xi+1 = Xprop else: Xi+1 = Xi

Thoughts

Need to finish "details" on detailed balance.
Read MacKay Section 29.32 again.