
Sequential design with mutual information for computer experiments (MICE): emulation of a tsunami model. (English) Zbl 1349.62364

Summary: Computer simulators can be computationally intensive to run over the large number of input values required for optimization and various uncertainty quantification tasks. The standard paradigm for the design and analysis of computer experiments is to model the computer simulator with a Gaussian random field: a Gaussian process model is trained on input-output data obtained from simulation runs at various input values. Following this approach, we propose a sequential design algorithm, MICE (mutual information for computer experiments), that adaptively selects the input values at which to run the computer simulator in order to maximize the expected information gain (mutual information) over the input space. The superior computational efficiency of the MICE algorithm compared with other algorithms is demonstrated on test functions and on a tsunami simulator, with overall gains of up to 20% for the tsunami simulator.
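To make the idea of a mutual-information-driven sequential design concrete, the following is a minimal sketch, not the MICE algorithm as published. It assumes a unit-variance Gaussian process with a squared-exponential correlation, fixed hyperparameters, and a finite candidate grid, and it greedily picks the candidate whose predictive variance given the chosen points is largest relative to its predictive variance given all other unchosen candidates. The function names, the length scale, and the nugget value are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def kernel(a, b, length_scale=0.3):
    """Squared-exponential correlation between rows of a and rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def predictive_variance(x, pts, nugget=1e-4):
    """Variance (including nugget) of a unit-variance GP at x given pts.

    The nugget is an assumed regularizer; it keeps the variance strictly
    positive so that the ratio below is well defined.
    """
    if len(pts) == 0:
        return 1.0 + nugget
    K = kernel(pts, pts) + nugget * np.eye(len(pts))
    k = kernel(pts, x[None, :])
    return 1.0 + nugget - (k.T @ np.linalg.solve(K, k)).item()

def greedy_mi_design(candidates, n_runs):
    """Greedily select n_runs candidate indices by a variance-ratio score."""
    chosen = []
    remaining = list(range(len(candidates)))
    for _ in range(n_runs):
        best, best_score = None, -np.inf
        for i in remaining:
            others = np.array([j for j in remaining if j != i], dtype=int)
            picked = np.array(chosen, dtype=int)
            # High uncertainty given the current design ...
            num = predictive_variance(candidates[i], candidates[picked])
            # ... and low uncertainty given the rest of the input space.
            den = predictive_variance(candidates[i], candidates[others])
            score = num / den
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Example: choose 5 design points from a 15-point grid on [0, 1].
grid = np.linspace(0.0, 1.0, 15)[:, None]
design = greedy_mi_design(grid, 5)
```

With fixed hyperparameters the predictive variance does not depend on the simulator output, so this sketch never calls a simulator. In an adaptive setting one would presumably run the simulator at each selected input and refit the Gaussian process before choosing the next point.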

MSC:

62L05 Sequential statistical design
62K99 Design of statistical experiments
62M20 Inference from stochastic processes and prediction
86A05 Hydrology, hydrography, oceanography
86A15 Seismology (including tsunami modeling), earthquakes
65Y20 Complexity and performance of numerical algorithms
