×

MCMC methods for functions: modifying old algorithms to make them faster. (English) Zbl 1331.62132

Summary: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of \(N\) points, are \(\mathbb{R}^{N}\)-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.

MSC:

62F15 Bayesian inference
65C05 Monte Carlo methods
65C40 Numerical analysis or methods applied to Markov chains
60G60 Random fields
60J22 Computational methods in Markov chains
62G05 Nonparametric estimation
62G07 Density estimation

Software:

GMRFLib
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Adams, R. P., Murray, I. and Mackay, D. J. C. (2009). The Gaussian process density sampler. In Advances in Neural Information Processing Systems 21.
[2] Adler, R. J. (2010). The Geometry of Random Fields . SIAM, Philadeliphia, PA. · Zbl 1182.60017
[3] Bennett, A. F. (2002). Inverse Modeling of the Ocean and Atmosphere . Cambridge Univ. Press, Cambridge. · Zbl 1043.86003
[4] Beskos, A., Roberts, G., Stuart, A. and Voss, J. (2008). MCMC methods for diffusion bridges. Stoch. Dyn. 8 319-350. · Zbl 1159.65007
[5] Beskos, A., Pinski, F. J., Sanz-Serna, J. M. and Stuart, A. M. (2011). Hybrid Monte Carlo on Hilbert spaces. Stochastic Process. Appl. 121 2201-2230. · Zbl 1235.60091
[6] Beskos, A., Pillai, N. S., Roberts, G. O., Sanz-Serna, J. M. and Stuart, A. M. (2013). Optimal tuning of hybrid Monte Carlo. Bernoulli . To appear. Available at . · Zbl 1287.60090
[7] Cotter, C. J. (2008). The variational particle-mesh method for matching curves. J. Phys. A 41 344003, 18. · Zbl 1153.37010
[8] Cotter, S. L. (2010). Applications of MCMC methods on function spaces. Ph.D. thesis, Univ. Warwick.
[9] Cotter, C. J., Cotter, S. L. and Vialard, F. X. (2013). Bayesian data assimilation in shape registration. Inverse Problems 29 045011. · Zbl 1273.65013
[10] Cotter, S. L., Dashti, M. and Stuart, A. M. (2012). Variational data assimilation using targetted random walks. Internat. J. Numer. Methods Fluids 68 403-421. · Zbl 1426.76629
[11] Cotter, S. L., Dashti, M., Robinson, J. C. and Stuart, A. M. (2009). Bayesian inverse problems for functions and applications to fluid mechanics. Inverse Problems 25 115008, 43. · Zbl 1228.35269
[12] Da Prato, G. and Zabczyk, J. (1992). Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications 44 . Cambridge Univ. Press, Cambridge. · Zbl 0761.60052
[13] Dashti, M., Harris, S. and Stuart, A. (2012). Besov priors for Bayesian inverse problems. Inverse Probl. Imaging 6 183-200. · Zbl 1243.62032
[14] Diaconis, P. (1988). Bayesian numerical analysis. In Statistical Decision Theory and Related Topics , IV , Vol. 1 ( West Lafayette , Ind. , 1986) 163-175. Springer, New York.
[15] Duane, S., Kennedy, A. D., Pendleton, B. and Roweth, D. (1987). Hybrid Monte Carlo. Phys. Lett. B 195 216-222.
[16] Girolami, M. and Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 73 123-214.
[17] Glaunes, J., Trouvé, A. and Younes, L. (2004). Diffeomorphic matching of distributions: A new approach for unlabelled point-sets and sub-manifolds matching. In Computer Vision and Pattern Recognition , 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on 2 712-718. IEEE.
[18] Hairer, M., Stuart, A. M. and Voss, J. (2007). Analysis of SPDEs arising in path sampling. II. The nonlinear case. Ann. Appl. Probab. 17 1657-1706. · Zbl 1140.60329
[19] Hairer, M., Stuart, A. and Voß, J. (2009). Sampling conditioned diffusions. In Trends in Stochastic Analysis. London Mathematical Society Lecture Note Series 353 159-185. Cambridge Univ. Press, Cambridge. · Zbl 1203.60100
[20] Hairer, M., Stuart, A. and Voss, J. (2011). Signal processing problems on function space: Bayesian formulation, stochastic PDEs and effective MCMC methods. In The Oxford Handbook of Nonlinear Filtering (D. Crisan and B. Rozovsky, eds.) 833-873. Oxford Univ. Press, Oxford. · Zbl 1262.94012
[21] Hairer, M., Stuart, A. M. and Vollmer, S. (2013). Spectral gaps for a Metropolis-Hastings algorithm in infinite dimensions. Available at . · Zbl 1307.65002
[22] Hairer, M., Stuart, A. M., Voss, J. and Wiberg, P. (2005). Analysis of SPDEs arising in path sampling. I. The Gaussian case. Commun. Math. Sci. 3 587-603. · Zbl 1138.60326
[23] Hills, S. E. and Smith, A. F. M. (1992). Parameterization issues in Bayesian inference. In Bayesian Statistics , 4 ( PeñíScola , 1991) 227-246. Oxford Univ. Press, New York.
[24] Hjort, N. L., Holmes, C., Müller, P. and Walker, S. G., eds. (2010). Bayesian Nonparametrics. Cambridge Series in Statistical and Probabilistic Mathematics 28 . Cambridge Univ. Press, Cambridge. · Zbl 1192.62080
[25] Iserles, A. (2004). A First Course in the Numerical Analysis of Differential Equations . Cambridge Univ. Press, Cambridge. · Zbl 1171.65060
[26] Kalnay, E. (2003). Atmospheric Modeling , Data Assimilation and Predictability . Cambridge Univ. Press, Cambridge.
[27] Lemm, J. C. (2003). Bayesian Field Theory . Johns Hopkins Univ. Press, Baltimore, MD. · Zbl 1033.62001
[28] Liu, J. S. (2001). Monte Carlo Strategies in Scientific Computing . Springer, New York. · Zbl 0991.65001
[29] Mattingly, J. C., Pillai, N. S. and Stuart, A. M. (2012). Diffusion limits of the random walk Metropolis algorithm in high dimensions. Ann. Appl. Probab. 22 881-930. · Zbl 1254.60081
[30] McLaughlin, D. and Townley, L. R. (1996). A reassessment of the groundwater inverse problem. Water Res. Res. 32 1131-1161.
[31] Miller, M. T. and Younes, L. (2001). Group actions, homeomorphisms, and matching: A general framework. Int. J. Comput. Vis. 41 61-84. · Zbl 1012.68714
[32] Neal, R. M. (1996). Bayesian Learning for Neural Networks . Springer, New York. · Zbl 0888.62021
[33] Neal, R. M. (1998). Regression and classification using Gaussian process priors. Available at .
[34] Neal, R. M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo 113-162. CRC Press, Boca Raton, FL. · Zbl 1229.65018
[35] O’Hagan, A., Kennedy, M. C. and Oakley, J. E. (1999). Uncertainty analysis and other inference tools for complex computer codes. In Bayesian Statistics , 6 ( Alcoceber , 1998) 503-524. Oxford Univ. Press, New York. · Zbl 1175.62028
[36] Pillai, N. S., Stuart, A. M. and Thiéry, A. H. (2012). Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions. Ann. Appl. Probab. 22 2320-2356. · Zbl 1272.60053
[37] Pillai, N. S., Stuart, A. M. and Thiery, A. H. (2012). On the random walk Metropolis algorithm for Gaussian random field priors and gradient flow. Available at . · Zbl 1272.60053
[38] Richtmyer, D. and Morton, K. W. (1967). Difference Methods for Initial Value Problems . Wiley, New York. · Zbl 0155.47502
[39] Robert, C. P. and Casella, G. (1999). Monte Carlo Statistical Methods . Springer, New York. · Zbl 0935.62005
[40] Roberts, G. O., Gelman, A. and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110-120. · Zbl 0876.60015
[41] Roberts, G. O. and Rosenthal, J. S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255-268. · Zbl 0913.60060
[42] Roberts, G. O. and Rosenthal, J. S. (2001). Optimal scaling for various Metropolis-Hastings algorithms. Statist. Sci. 16 351-367. · Zbl 1127.65305
[43] Roberts, G. O. and Stramer, O. (2001). On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm. Biometrika 88 603-621. · Zbl 0985.62066
[44] Roberts, G. O. and Tweedie, R. L. (1996). Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2 341-363. · Zbl 0870.60027
[45] Rue, H. and Held, L. (2005). Gaussian Markov Random Fields : Theory and Applications. Monographs on Statistics and Applied Probability 104 . Chapman & Hall/CRC, Boca Raton, FL. · Zbl 1093.60003
[46] Smith, A. F. M. and Roberts, G. O. (1993). Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B Stat. Methodol. 55 3-23. · Zbl 0779.62030
[47] Sokal, A. D. (1989). Monte Carlo methods in statistical mechanics: Foundations and new algorithms, Univ. Lausanne, Bâtiment des sciences de physique Troisième Cycle de la physique en Suisse romande. · Zbl 0890.65006
[48] Stein, M. L. (1999). Interpolation of Spatial Data : Some Theory for Kriging . Springer, New York. · Zbl 0924.62100
[49] Stuart, A. M. (2010). Inverse problems: A Bayesian perspective. Acta Numer. 19 451-559. · Zbl 1242.65142
[50] Stuart, A. M., Voss, J. and Wiberg, P. (2004). Fast communication conditional path sampling of SDEs and the Langevin MCMC method. Commun. Math. Sci. 2 685-697. · Zbl 1082.65004
[51] Tierney, L. (1998). A note on Metropolis-Hastings kernels for general state spaces. Ann. Appl. Probab. 8 1-9. · Zbl 0935.60053
[52] Vaillant, M. and Glaunes, J. (2005). Surface matching via currents. In Information Processing in Medical Imaging 381-392. Springer, Berlin.
[53] van der Meulen, F., Schauer, M. and van Zanten, H. (2013). Reversible jump MCMC for nonparametric drift estimation for diffusion processes. Comput. Statist. Data Anal. · Zbl 1471.62197
[54] Zhao, L. H. (2000). Bayesian aspects of some nonparametric problems. Ann. Statist. 28 532-552. · Zbl 1010.62025
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.