# Importance Sampling

Let $$X$$ be a random variable with density $$\pi$$, which we call the target distribution, and let $$\varphi$$ be a function, called the target function. Importance sampling is used to estimate expectations of the form:

$\mathbb{E}[\varphi(x)] = \int \varphi(x)\pi(x) \: dx$

In general it is assumed that generating samples from $$\pi$$ is hard, so let $$\pi_p$$ be a proposal distribution that is easy to sample from and whose support covers that of $$\pi$$, that is, $$\pi_p(x) > 0$$ wherever $$\pi(x) > 0$$. One idea to compute the expectation above is to rewrite the integral as:

$\mathbb{E}[\varphi(x)] = \int \varphi(x)\frac{\pi(x)}{\pi_p(x)}\pi_p(x) \: dx = \int \varphi(x)w(x)\pi_p(x) \: dx$

The right-hand side is an expectation of $$\varphi(x)w(x)$$ under $$\pi_p$$. Since we can sample from $$\pi_p$$, we can compute a Monte Carlo estimate of it, the importance sampling estimator: $$\mathbb{E}[\varphi(x)] \approx \frac{1}{N}\sum_{n=1}^N w(x_n)\varphi(x_n)$$ where $$x_n$$ are sampled from $$\pi_p$$ and $$w(x_n) = \frac{\pi(x_n)}{\pi_p(x_n)}$$ are called the importance weights.
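As a minimal sketch of the importance sampling estimator, the example below estimates $$\mathbb{E}[\varphi(x)]$$ for $$\varphi(x) = x^2$$ under a standard normal target, using a wider normal proposal. The choice of target, proposal, and sample size, as well as the helper names `normal_pdf` and `importance_sampling`, are assumptions made for illustration only; the true value in this setup is 1.

```python
import numpy as np


def normal_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def importance_sampling(phi, target_pdf, proposal_pdf, proposal_sampler, n):
    """Estimate E_pi[phi(x)] as (1/N) * sum_n w(x_n) * phi(x_n)."""
    x = proposal_sampler(n)                 # x_n sampled from the proposal pi_p
    w = target_pdf(x) / proposal_pdf(x)     # importance weights pi(x_n) / pi_p(x_n)
    return np.mean(w * phi(x))


# Illustrative setup (an assumption, not part of the note):
# target pi = N(0, 1), proposal pi_p = N(0, 2^2), phi(x) = x^2.
rng = np.random.default_rng(0)
is_estimate = importance_sampling(
    phi=lambda x: x ** 2,
    target_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
    proposal_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
    proposal_sampler=lambda n: rng.normal(0.0, 2.0, size=n),
    n=200_000,
)
print(is_estimate)  # should be close to the true value E[x^2] = 1
```

The proposal here is deliberately wider than the target, so the weights stay bounded; a proposal with lighter tails than the target would typically give a much noisier estimate.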

Above, we assumed that we can evaluate the target density $$\pi$$ exactly. In practice, however, we may only be able to evaluate an unnormalized density $$\tilde{\pi}$$ such that $$\pi(x) = \frac{\tilde{\pi}(x)}{Z_\pi} = \frac{\tilde{\pi}(x)}{\int \tilde{\pi}(x)\: dx}$$. By substitution, we have:

$\mathbb{E}[\varphi(x)] = \int \varphi(x)\pi(x) \: dx = \int \frac{\varphi(x)\tilde{\pi}(x)}{\int \tilde{\pi}(x)\: dx} \: dx = \frac{\int \varphi(x)\tilde{\pi}(x)\:dx}{\int \tilde{\pi}(x)\: dx} = \frac{\int \varphi(x)\frac{\tilde{\pi}(x)}{\pi_p(x)}\pi_p(x) \:dx}{\int \frac{\tilde{\pi}(x)}{\pi_p(x)}\pi_p(x)\: dx}$

Replacing both integrals with Monte Carlo averages over the same samples gives the self-normalized importance sampling estimator: $$\mathbb{E}[\varphi(x)] \approx \frac{\frac{1}{N}\sum_{n=1}^N w(x_n)\varphi(x_n)}{\frac{1}{N}\sum_{n=1}^N w(x_n)}$$ where $$x_n$$ are sampled from $$\pi_p$$ and $$w(x_n) = \frac{\tilde{\pi}(x_n)}{\pi_p(x_n)}$$ are called the unnormalized weights. Unlike the direct version, the self-normalized estimator is biased but consistent, i.e. it tends to the true value as $$N\rightarrow \infty$$. Note that the denominator of the self-normalized estimator approximates the normalizing constant, $$Z_\pi = \int \tilde{\pi}(x) \: dx \approx \frac{1}{N}\sum_{n=1}^N w(x_n)$$, as a byproduct.
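The self-normalized estimator can be sketched in the same way. The example below assumes an unnormalized target $$\tilde{\pi}(x) = \exp(-x^2/2)$$, whose normalizing constant is $$Z_\pi = \sqrt{2\pi} \approx 2.5066$$, together with a normal proposal with standard deviation 2 and $$\varphi(x) = x^2$$. These choices and the function name `self_normalized_is` are illustrative assumptions, not part of the derivation above.

```python
import numpy as np


def self_normalized_is(phi, unnorm_target, proposal_pdf, proposal_sampler, n):
    """Return (estimate of E_pi[phi(x)], estimate of Z_pi).

    Uses unnormalized weights w(x_n) = pi_tilde(x_n) / pi_p(x_n).
    """
    x = proposal_sampler(n)                   # x_n sampled from the proposal pi_p
    w = unnorm_target(x) / proposal_pdf(x)    # unnormalized weights
    z_hat = np.mean(w)                        # (1/N) sum_n w(x_n), approximates Z_pi
    expectation = np.mean(w * phi(x)) / z_hat
    return expectation, z_hat


# Illustrative setup (an assumption): pi_tilde(x) = exp(-x^2 / 2),
# i.e. a standard normal without its 1/sqrt(2*pi) factor.
rng = np.random.default_rng(0)
proposal_std = 2.0


def proposal_pdf(x):
    """Density of the (normalized) proposal N(0, proposal_std^2)."""
    return np.exp(-0.5 * (x / proposal_std) ** 2) / (proposal_std * np.sqrt(2.0 * np.pi))


snis_estimate, z_estimate = self_normalized_is(
    phi=lambda x: x ** 2,
    unnorm_target=lambda x: np.exp(-0.5 * x ** 2),
    proposal_pdf=proposal_pdf,
    proposal_sampler=lambda n: rng.normal(0.0, proposal_std, size=n),
    n=200_000,
)
print(snis_estimate)  # should be close to E[x^2] = 1
print(z_estimate)     # should be close to sqrt(2*pi), about 2.5066
```

Note that the byproduct estimate of $$Z_\pi$$ relies on the proposal density itself being normalized; the ratio estimate of the expectation does not, since any constant factor in the weights cancels between numerator and denominator.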

## Thoughts

• Put more details on the proposal distribution.

Created: 2022-03-13 Sun 21:45
