Gibbs Sampler

An algorithm to sample from some joint distribution \(\pi\) over \(p\) variables by iteratively sampling from each variable conditioned on all other variables, i.e. from \(\pi(X_i|X_{-i})\) where \(X_{-i} = \{X\}_{i=1}^p ∕ X_i\). \(\pi(X_i|X_{-i})\) is called the full conditional distribution for variable \(X_i\).

The Gibbs sampler is a special case of the Metropolis-Hastings Algorithm with the proposal distribution being \(\pi_p(x'|x)=\pi(x'_i|x_{-i})\mathbb{I}(x'_{-i} = x_{-i})\). This gives a 100% acceptance probability, since \[ \alpha = \frac{\pi(x')\pi_p(x|x')}{\pi(x)\pi_p(x'|x)} = \frac{\pi(x'_i|x'_{-i})\pi(x'_{-i})\pi(x_i|x'_{-i})}{\pi(x_i|x_{-i})\pi(x_{-i})\pi(x'_i|x_{-i})}=\frac{\pi(x'_i|x'_{-i})\pi(x'_{-i})\pi(x_i|x'_{-i})}{\pi(x_i|x'_{-i})\pi(x'_{-i})\pi(x'_i|x'_{-i})}=1\]

If sampling from the full conditionals are difficult, we can use the Metropolis-Hastings Algorithm - this is called Metropolis within Gibbs.

Analytically marginalizing out unknown quantities can give better results* - this is called the collapsed Gibbs sampler.

Thoughts

Does the proposal define a probability distribution for a fixed \(x\)?
* Expand a bit more here - is it better in terms of sample efficiency and/or lower variance and if so why.