Hyperparameter estimation in Bayesian MAP estimation: parameterizations and consistency. (English) Zbl 1441.62084

Summary: The Bayesian formulation of inverse problems is attractive for three primary reasons: it provides a clear modelling framework; it allows for principled learning of hyperparameters; and it can provide uncertainty quantification. The posterior distribution may in principle be sampled by means of MCMC or SMC methods, but for many problems this is computationally infeasible. In this situation maximum a posteriori (MAP) estimators are often sought. Whilst these are relatively cheap to compute and have an attractive variational formulation, a key drawback is their lack of invariance under change of parameterization. It is nonetheless important to study MAP estimators, because they provide a link with classical optimization approaches to inverse problems, and the Bayesian connection may be used to improve upon those approaches. The lack of invariance under change of parameterization is a particularly significant issue when hierarchical priors are employed to learn hyperparameters. In this paper we study the effect of the choice of parameterization on MAP estimators when a conditionally Gaussian hierarchical prior distribution is employed. Specifically, we consider the centred parameterization, the natural parameterization in which the unknown state is solved for directly, and the noncentred parameterization, which works with a whitened Gaussian as the unknown state variable and arises naturally when considering dimension-robust MCMC algorithms; MAP estimation is well defined in the nonparametric setting only for the noncentred parameterization. However, we show that MAP estimates based on the noncentred parameterization are not consistent as estimators of hyperparameters; conversely, we show that limits of finite-dimensional centred MAP estimators are consistent as the dimension tends to infinity.
We also consider empirical Bayesian hyperparameter estimation, show consistency of these estimates, and demonstrate that they are more robust with respect to noise than centred MAP estimates. An underpinning concept throughout is that hyperparameters may only be recovered up to measure equivalence, a well-known phenomenon in the context of the Ornstein-Uhlenbeck process. The applicability of the results is demonstrated concretely with the study of hierarchical Whittle-Matérn and ARD priors.
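The distinction between the centred and noncentred parameterizations can be illustrated in a finite-dimensional setting. The sketch below is not taken from the paper: it assumes a toy covariance C(theta) = theta * I for a scalar hyperparameter theta, purely to show that the centred form draws the state u directly from N(0, C(theta)), while the noncentred form draws a whitened variable xi ~ N(0, I) that is independent of theta and recovers u = C(theta)^{1/2} xi deterministically.

```python
import numpy as np

def sample_centred(theta, n, rng):
    """Centred parameterization: draw the state directly,
    u | theta ~ N(0, C(theta)), here with toy covariance C(theta) = theta * I."""
    C = theta * np.eye(n)
    L = np.linalg.cholesky(C)          # C(theta)^{1/2}
    return L @ rng.standard_normal(n)

def sample_noncentred(theta, n, rng):
    """Noncentred parameterization: the latent variable is whitened,
    xi ~ N(0, I) independent of theta, and u = C(theta)^{1/2} xi."""
    xi = rng.standard_normal(n)        # white noise; the prior on xi does not depend on theta
    C = theta * np.eye(n)
    L = np.linalg.cholesky(C)
    return L @ xi

# With identical random draws, both parameterizations induce the same
# prior law on u; they differ in which variable is treated as the unknown.
u_c = sample_centred(2.0, 5, np.random.default_rng(0))
u_nc = sample_noncentred(2.0, 5, np.random.default_rng(0))
assert np.allclose(u_c, u_nc)
```

The reparameterization changes nothing about the prior distribution of u, but it changes the density whose maximizer defines the MAP estimator, which is the source of the non-invariance studied in the paper.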

MSC:

62G05 Nonparametric estimation
62C10 Bayesian problems; characterization of Bayes procedures
62G20 Asymptotic properties of nonparametric inference
45Q05 Inverse problems for integral equations

Software:

PMTK
Full Text: DOI arXiv
