zbMATH — the first resource for mathematics

Optimal design for correlated processes with input-dependent noise. (English) Zbl 06975448
Summary: Optimal design for parameter estimation in Gaussian process regression models with input-dependent noise is examined. The motivation stems from the area of computer experiments, where computationally demanding simulators are approximated using Gaussian process emulators to act as statistical surrogates. In the case of stochastic simulators, which produce a random output for a given set of model inputs, repeated evaluations are useful, supporting the use of replicate observations in the experimental design. The findings are also applicable to the wider context of experimental design for Gaussian process regression and kriging. Designs are proposed with the aim of minimising the variance of the Gaussian process parameter estimates. A heteroscedastic Gaussian process model is presented which allows for an experimental design technique based on an extension of Fisher information to heteroscedastic models. It is empirically shown that the error of the approximation of the parameter variance by the inverse of the Fisher information is reduced as the number of replicated points is increased. Through a series of simulation experiments on both synthetic data and a systems biology stochastic simulator, optimal designs with replicate observations are shown to outperform space-filling designs both with and without replicate observations. Guidance is provided on best practice for optimal experimental design for stochastic response models.
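The design criterion described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It uses the standard Fisher information for the covariance parameters of a zero-mean Gaussian model, I_ij = 1/2 tr(K^{-1} dK/dθ_i K^{-1} dK/dθ_j), and scores a design by the log-determinant of that matrix. The squared-exponential kernel, the log-linear noise variance exp(b0 + b1 x), the parameter values, and all function names are assumptions made for illustration only. Replicates are represented by repeating an input in the design vector, and derivatives of K are taken by central finite differences to keep the sketch short.

```python
import numpy as np


def covariance(theta, x):
    """Covariance of all observations at inputs x (replicates = repeated entries).

    theta = (log process variance, log length-scale, b0, b1); the kernel and
    the noise model exp(b0 + b1 * x) are illustrative assumptions.
    """
    log_sf2, log_ell, b0, b1 = theta
    sq_dist = (x[:, None] - x[None, :]) ** 2
    k_f = np.exp(log_sf2) * np.exp(-0.5 * sq_dist / np.exp(log_ell) ** 2)
    noise_var = np.exp(b0 + b1 * x)  # input-dependent noise variance
    return k_f + np.diag(noise_var)


def fisher_information(theta, x, h=1e-5):
    """Fisher information of the covariance parameters at design x."""
    theta = np.asarray(theta, dtype=float)
    k_inv = np.linalg.inv(covariance(theta, x))
    scaled_derivs = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = h
        # Central finite-difference approximation of dK/dtheta_i.
        dk = (covariance(theta + step, x) - covariance(theta - step, x)) / (2 * h)
        scaled_derivs.append(k_inv @ dk)
    p = theta.size
    fim = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            fim[i, j] = 0.5 * np.trace(scaled_derivs[i] @ scaled_derivs[j])
    return 0.5 * (fim + fim.T)  # remove round-off asymmetry


def log_det_criterion(theta, x):
    """Log-determinant of the Fisher information (larger is better)."""
    sign, logdet = np.linalg.slogdet(fisher_information(theta, x))
    return logdet if sign > 0 else -np.inf


# Hypothetical parameter values and candidate designs with 10 runs each.
theta = np.array([0.0, np.log(0.3), -2.0, 1.0])
sites = np.linspace(0.0, 1.0, 5)
space_filling = np.linspace(0.0, 1.0, 10)  # 10 distinct inputs
replicated = np.repeat(sites, 2)           # 5 inputs, 2 replicates each

print("space-filling:", log_det_criterion(theta, space_filling))
print("replicated:   ", log_det_criterion(theta, replicated))
```

Which design scores higher depends on the assumed parameter values, so the comparison is only a template for the kind of experiment the summary reports, not a reproduction of its findings.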

MSC:
62 Statistics