Estimating Bayes factors via thermodynamic integration and population MCMC. (English) Zbl 1453.62055

Summary: A Bayesian approach to model comparison based on the integrated or marginal likelihood is considered, and applications to linear regression models and nonlinear ordinary differential equation (ODE) models are used as the setting in which to elucidate and further develop existing statistical methodology. The focus is on two methods of marginal likelihood estimation. First, a statistical failure of the widely employed Posterior Harmonic Mean estimator is highlighted. It is demonstrated that there is a systematic bias capable of significantly skewing Bayes factor estimates, which has not previously been highlighted in the literature. Second, a detailed study of the recently proposed Thermodynamic Integral estimator is presented, which characterises the error associated with its discrete form. An experimental study using analytically tractable linear regression models highlights substantial differences with recently published results regarding optimal discretisation. Finally, with the insights gained, it is demonstrated how Population MCMC and thermodynamic integration methods may be elegantly combined to estimate Bayes factors accurately enough to discriminate between nonlinear models based on systems of ODEs, which has important application in describing the behaviour of complex processes arising in a wide variety of research areas, such as Systems Biology, Computational Ecology and Chemical Engineering.
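The discretisation error of the thermodynamic integral mentioned in the summary can be illustrated on a toy problem. The sketch below is not the authors' code. It assumes a Gaussian model with known variance and a Gaussian prior on the mean, so that the power posterior \(p_t(\theta) \propto L(\theta)^t p(\theta)\) and the expectation \(E_t[\log L]\) are available in closed form and no MCMC is needed. Any remaining error therefore comes only from the trapezoidal discretisation of \(\log p(y) = \int_0^1 E_t[\log L]\,dt\). The function names, the data, and the power-law temperature schedule \(t_i = (i/N)^5\) are assumptions made for illustration.

```python
import math


def power_posterior(t, y, sigma2, m0, s02):
    """Mean and variance of p_t(theta), proportional to L(theta)^t p(theta).

    Toy model (an assumption of this sketch): y_i ~ N(theta, sigma2) with
    prior theta ~ N(m0, s02), so the power posterior is Gaussian.
    """
    n = len(y)
    precision = 1.0 / s02 + t * n / sigma2
    mean = (m0 / s02 + t * sum(y) / sigma2) / precision
    return mean, 1.0 / precision


def expected_log_lik(t, y, sigma2, m0, s02):
    """E[log L(theta)] under the power posterior at temperature t."""
    n = len(y)
    mean, var = power_posterior(t, y, sigma2, m0, s02)
    # E[(y_i - theta)^2] = (y_i - mean)^2 + var for Gaussian theta.
    sq = sum((yi - mean) ** 2 for yi in y) + n * var
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sq / (2 * sigma2)


def thermodynamic_estimate(y, sigma2, m0, s02, n_temps=20, power=5):
    """Trapezoidal estimate of log p(y) on the schedule t_i = (i/N)^power."""
    ts = [(i / n_temps) ** power for i in range(n_temps + 1)]
    es = [expected_log_lik(t, y, sigma2, m0, s02) for t in ts]
    return sum(
        (ts[i + 1] - ts[i]) * 0.5 * (es[i] + es[i + 1])
        for i in range(n_temps)
    )


def exact_log_marginal(y, sigma2, m0, s02):
    """Closed-form log p(y): y ~ N(m0 * 1, sigma2 * I + s02 * 1 1^T)."""
    n = len(y)
    d = [yi - m0 for yi in y]
    log_det = (n - 1) * math.log(sigma2) + math.log(sigma2 + n * s02)
    quad = (sum(di * di for di in d)
            - s02 * sum(d) ** 2 / (sigma2 + n * s02)) / sigma2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


if __name__ == "__main__":
    y = [1.2, 0.7, 2.1, 1.5, 0.9, 1.8, 1.1, 1.6, 0.4, 1.3]  # made-up data
    exact = exact_log_marginal(y, 1.0, 0.0, 1.0)
    for n_temps in (5, 20, 100):
        est = thermodynamic_estimate(y, 1.0, 0.0, 1.0, n_temps=n_temps)
        print(n_temps, est, est - exact)
    # Log Bayes factor between two hypothetical priors (mean 0 versus mean 1).
    log_bf = (thermodynamic_estimate(y, 1.0, 1.0, 1.0)
              - thermodynamic_estimate(y, 1.0, 0.0, 1.0))
    print("log Bayes factor:", log_bf)
```

In a realistic setting such as the ODE models of the summary, `expected_log_lik` would be replaced by a Monte Carlo average over samples drawn at each temperature, for example from the tempered chains of a population MCMC run, and sampling error would add to the discretisation error shown here.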

MSC:

62-08 Computational methods for problems pertaining to statistics
62F15 Bayesian inference
65C40 Numerical analysis or methods applied to Markov chains

Software:

BioBayes

References:

[1] Del Moral, P.; Doucet, A.; Jasra, A., Sequential Monte Carlo samplers, Journal of the Royal Statistical Society B, 68, 3, 411-436 (2006) · Zbl 1105.62034
[2] Del Moral, P.; Doucet, A.; Jasra, A., Sequential Monte Carlo for Bayesian computation, Bayesian Statistics (2007), Oxford University Press, 1-34
[3] El Adlouni, S.; Favre, C.; Bobee, B., Comparison of methodologies to assess the convergence of Markov Chain Monte Carlo methods, Computational Statistics and Data Analysis, 50, 10, 2685-2701 (2006) · Zbl 1445.62005
[4] Friel, N.; Pettitt, A., Marginal likelihood estimation via power posteriors, Journal of the Royal Statistical Society: Series B, 70, 3, 589-607 (2008) · Zbl 05563360
[5] Gamerman, D., Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference (2002), Chapman and Hall/CRC
[6] Gelman, A.; Meng, X., Simulating normalizing constants: From importance sampling to bridge sampling to path sampling, Statistical Science, 13, 2, 163-185 (1998) · Zbl 0966.65004
[7] Gelman, A.; Rubin, D. B., Inference from iterative simulation using multiple sequences, Statistical Science, 7, 457-472 (1992) · Zbl 1386.65060
[8] Golightly, A.; Wilkinson, D., Bayesian inference for nonlinear multivariate diffusion models observed with error, Computational Statistics and Data Analysis, 52, 3, 1674-1693 (2007) · Zbl 1452.62603
[9] Goodwin, B., Oscillatory behavior in enzymatic control processes, Advances in Enzyme Regulation, 3, 425-438 (1965)
[10] Iba, Y., Population Monte Carlo algorithms, Transactions of the Japanese Society of Artificial Intelligence, 16, 279-286 (2000)
[11] Jasra, A.; Stephens, D.; Holmes, C., On population-based simulation for static inference, Statistics and Computing, 17, 263-279 (2007)
[12] Kass, R.; Raftery, A., Bayes factors, Journal of the American Statistical Association, 90, 430, 773-795 (1995) · Zbl 0846.62028
[13] Lartillot, N.; Philippe, H., Computing Bayes factors using thermodynamic integration, Systematic Biology, 55, 2, 195-207 (2006)
[14] Laskey, K.; Myers, J., Population Markov Chain Monte Carlo, Machine Learning, 50, 175-196 (2003) · Zbl 1028.68167
[15] Liang, F.; Wong, W., Real-parameter evolutionary Monte Carlo with applications to Bayesian mixture models, Journal of the American Statistical Association, 96, 454, 653-666 (2001) · Zbl 1017.62022
[16] Locke, J.; Millar, A.; Turner, M., Modelling genetic networks with noisy and varied experimental data: The circadian clock in Arabidopsis thaliana, Journal of Theoretical Biology, 234, 383-393 (2005) · Zbl 1445.92014
[17] McCulloch, R. E.; Rossi, P. E., Bayes factors for nonlinear hypotheses and likelihood distributions, Technical Report 101, Statistics Research Center, University of Chicago, Graduate School of Business (1991) · Zbl 0850.62284
[18] Neal, R., Annealed importance sampling, Statistics and Computing, 11, 125-139 (2001)
[19] Newton, M.; Raftery, A., Approximate Bayesian inference with the weighted likelihood bootstrap, Journal of the Royal Statistical Society: Series B, 56, 1, 3-48 (1994) · Zbl 0788.62026
[20] Raftery, A.; Newton, M.; Satagopan, J.; Krivitsky, P., Estimating the integrated likelihood via posterior simulation using the harmonic mean identity, Bayesian Statistics, 8, 1-45 (2007) · Zbl 1252.62038
[21] Robert, C.; Casella, G., Monte Carlo Statistical Methods (2004), Springer · Zbl 1096.62003
[22] Vyshemirsky, V.; Girolami, M. A., Bayesian ranking of biochemical system models, Bioinformatics, 24, 6, 833-839 (2008)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.