Optimal Scaling of Discrete Approximations to Langevin Diffusions Gareth O. Roberts and Jeffrey S. Rosenthal University of Cambridge Research Report #95-11 ABSTRACT This paper contains theoretical results related to the practical implementation of certain Metropolis-Hastings algorithms (Metropolis et al., 1953; Hastings, 1970; Smith and Roberts, 1993) as used to explore probability distributions, for example in Bayesian statistics. Specifically, we consider issues related to discrete approximations to Langevin diffusions, as proposed in Grenander and Miller (1994), Phillips and Smith (1994), and Roberts and Tweedie (1995). In recent work of Roberts, Gelman and Gilks (1994), the problem of optimal scaling of proposal variances for random-walk Metropolis algorithms was considered. It was proved that, for Gaussian proposals and certain target distributions, the asymptotic acceptance probability should be tuned to be approximately $0.234$ for optimal performance of the algorithm. Furthermore, it was shown that the proposal variance should scale as $O(n^{-1})$ as the dimension $n \to \infty$. The paper thus provided a useful heuristic for running Metropolis algorithms efficiently. However, this result does not apply to more general Hastings algorithms. It is clear that if the proposal density makes use of the structure of the target density, then a higher acceptance probability is likely to be preferred. In this paper we propose a similar study for a class of algorithms given by discrete approximations to Langevin diffusions. A Langevin diffusion for a multivariate probability density function $\pi$ (with respect to Lebesgue measure) is the unique (up to a speed factor) diffusion which is reversible with respect to $\pi$. It makes use of the gradient of $\pi$ to move more often in directions in which $\pi$ is increasing. Thus, a discrete approximation to a Langevin diffusion should have an optimal acceptance probability which is larger than the $0.234$ figure for random-walk proposals. Our main results may be summarized as follows. For discrete approximations to Langevin diffusions for certain target distributions $\pi$, the optimal asymptotic acceptance probability in high dimensions is approximately $0.574$. Furthermore, the proposal variance should scale as $n^{-1/3}$. Therefore, Langevin algorithms are considerably more efficient than random-walk based Metropolis methods; the optimal proposal variances and acceptance probabilities are both substantially larger. However, care must be taken in the interpretation of these results, since Langevin algorithms can have unfortunate properties such as sub-geometric rate of convergence; see for example Roberts and Tweedie (1995). We note also that, while $0.574$ is the optimal acceptance probability, the speed of the algorithm remains relatively high for acceptance probabilities between, say, $0.4$ and $0.8$. A graph of the speed of the algorithm, as a function of the acceptance probability, is given in Figure 1.