Bayesian nonstationary Gaussian process models via treed process convolutions. (English) Zbl 1474.60102

Summary: The Gaussian process is a common model in a wide variety of applications, such as environmental modeling, computer experiments, and geology. Two major challenges often arise: First, assuming that the process of interest is stationary over the entire domain often proves to be untenable. Second, the traditional Gaussian process model formulation is computationally inefficient for large datasets. In this paper, we propose a new Gaussian process model to tackle these problems based on the convolution of a smoothing kernel with a partitioned latent process. Nonstationarity can be modeled by allowing a separate latent process for each partition, which approximates a regional clustering structure. Partitioning follows a binary tree generating process similar to that of Classification and Regression Trees. A Bayesian approach is used to estimate the partitioning structure and model parameters simultaneously. Our motivating dataset consists of 11918 precipitation anomalies. Results show that our model has promising prediction performance and is computationally efficient for large datasets.
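The construction described in the summary can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the Gaussian kernel, the single split at 0.5 (a depth-1 binary tree), the per-partition latent standard deviations, and all names (`gaussian_kernel`, `treed_convolution`, `sds`, `bandwidth`) are assumptions made for illustration only. The partition is fixed here, whereas the paper estimates it together with the model parameters.

```python
import numpy as np

def gaussian_kernel(d, bandwidth):
    """Unnormalized Gaussian smoothing kernel (an assumed choice)."""
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def treed_convolution(sites, knots, split, sds, bandwidth, rng):
    """Simulate z(s) = sum_j k(s - u_j) x_j with a partitioned latent process.

    Hypothetical setup: a depth-1 tree splits the domain at `split`, and
    each leaf has its own latent standard deviation sds[leaf].
    """
    # Assign every knot u_j to the left (0) or right (1) leaf of the tree.
    leaf = (knots >= split).astype(int)
    # Independent latent draws whose scale depends on the knot's leaf.
    latent = rng.normal(0.0, sds[leaf])
    # Kernel matrix: one row per prediction site, one column per knot.
    K = gaussian_kernel(sites[:, None] - knots[None, :], bandwidth)
    return K @ latent, K, leaf

rng = np.random.default_rng(0)
sites = np.linspace(0.0, 1.0, 200)   # locations where the process is evaluated
knots = np.linspace(0.0, 1.0, 20)    # locations of the latent process
sds = np.array([0.2, 2.0])           # assumed latent sd in each partition

z, K, leaf = treed_convolution(sites, knots, 0.5, sds, 0.05, rng)

# Implied covariance of z: K diag(sd^2) K^T. It is nonstationary because
# the latent variance changes across the split.
cov = K @ np.diag(sds[leaf] ** 2) @ K.T
```

Because the latent process lives on only 20 knots, the covariance is built from a 200 x 20 kernel matrix rather than a dense 200 x 200 covariance function, which is the kind of low-rank structure that makes convolution models attractive for large datasets.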

MSC:

60G15 Gaussian processes
60G60 Random fields
62M30 Inference from spatial processes
62M20 Inference from stochastic processes and prediction
62F15 Bayesian inference

References:

[1] Analytics R, Weston S (2015a) doParallel: Foreach parallel adaptor for the “parallel” package. http://CRAN.R-project.org/package=doParallel, R package version 1.0.10
[2] Analytics R, Weston S (2015b) foreach: Provides Foreach looping construct for R. http://CRAN.R-project.org/package=foreach, R package version 1.4.3
[3] Banerjee S, Gelfand AE, Finley AO, Sang H (2008) Gaussian predictive process models for large spatial data sets. J R Stat Soc Ser B 70(4):825-848 · Zbl 05563371 · doi:10.1111/j.1467-9868.2008.00663.x
[4] Bornn L, Shaddick G, Zidek J (2012) Modelling nonstationary processes through dimension expansion. J Am Stat Assoc 107(497):281-289 · Zbl 1261.62085 · doi:10.1080/01621459.2011.646919
[5] Breiman L, Friedman JH, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth, Belmont · Zbl 0541.62042
[6] Brenning A (2001) Geostatistics without stationarity assumptions within geographical information systems. Freiberg Online Geosci 6:1-108
[7] Chipman HA, George EI, McCulloch RE (1998) Bayesian CART model search. J Am Stat Assoc 93(443):935-948 · doi:10.1080/01621459.1998.10473750
[8] Cressie N, Johannesson G (2008) Fixed rank kriging for very large spatial data sets. J R Stat Soc Ser B 70(Part 1):209-226 · Zbl 05563351 · doi:10.1111/j.1467-9868.2007.00633.x
[9] Damian D, Sampson P, Guttorp P (2001) Bayesian estimation of semi-parametric non-stationary spatial covariance structure. Environmetrics 12:161-178 · doi:10.1002/1099-095X(200103)12:2<161::AID-ENV452>3.0.CO;2-G
[10] Finley AO, Banerjee S, Carlin BP (2007) spBayes: an R package for univariate and multivariate hierarchical point-referenced spatial models. J Stat Softw 19(4):1-24. http://www.jstatsoft.org/article/view/v019i04
[11] Finley AO, Sang H, Banerjee S, Gelfand AE (2009) Improving the performance of predictive process modeling for large datasets. Comput Stat Data Anal 53:2873-2884 · Zbl 1453.62090 · doi:10.1016/j.csda.2008.09.008
[12] Fuentes M, Smith RL (2001) A new class of nonstationary spatial models. Technical reports on North Carolina State University, Department of Statistics, Raleigh, NC
[13] Fuentes M, Kelly R, Kittel T, Nychka D (1998) Spatial prediction of climate fields for ecological models. Technical reports on National Center for Atmospheric Research, Boulder CO
[14] Furrer R (2006) KriSp: an R package for covariance tapered kriging of large datasets using sparse matrix techniques. In: Technical reports on MCS 06-06, Colorado School of Mines, Golden, USA, http://user.math.uzh.ch/furrer/software/KriSp/, version 0.4, 2006-10-26
[15] Gaujoux R (2014) doRNG: generic reproducible parallel backend for “foreach” loops. http://CRAN.R-project.org/package=doRNG, R package version 1.6
[16] Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 12:609-628 · Zbl 0573.62030 · doi:10.1109/34.56204
[17] Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 102:359-378 · Zbl 1284.62093 · doi:10.1198/016214506000001437
[18] Gramacy RB (2007) tgp: an R package for Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian process models. J Stat Softw 19(9):1-46. http://www.jstatsoft.org/v19/i09/
[19] Gramacy RB, Apley DW (2015) Local Gaussian process approximation for large computer experiments. J Comput Graph Stat 24(2):561-578 · doi:10.1080/10618600.2014.914442
[20] Gramacy RB, Lee HK (2008) Bayesian treed Gaussian process models with an application to computer modeling. J Am Stat Assoc 103(483):1119-1130 · Zbl 1205.62218 · doi:10.1198/016214508000000689
[21] Green PJ (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82(4):711-732 · Zbl 0861.62023 · doi:10.1093/biomet/82.4.711
[22] Higdon D (1998) A process-convolution approach to modeling temperatures in the North Atlantic Ocean. J Environ Ecol Stat 5(2):173-190 · doi:10.1023/A:1009666805688
[23] Higdon D (2002) Space and space-time modeling using process convolutions. In: Anderson C, Barnett V, Chatwin P, El-Shaarawi A (eds), London, pp 37-54 · Zbl 1255.86016 · doi:10.1007/978-1-4471-0657-9_2
[24] Higdon D (2006) A primer on space-time modeling from a Bayesian perspective. In: Finkenstadt B, Held L, Isham V (eds), Boca Raton, pp 217-279 · Zbl 1121.62081 · doi:10.1201/9781420011050.ch6
[25] Higdon D, Swall J, Kern J (1999) Non-stationary spatial modeling. Bayesian Stat 6:761-768 · Zbl 0982.62079
[26] Johns CJ, Nychka D, Kittel TG, Daly C (2003) Infilling sparse records of spatial fields. J Am Stat Assoc 98:796-806 · doi:10.1198/016214503000000729
[27] Katzfuss M (2013) Bayesian nonstationary spatial modeling for very large datasets. Environmetrics 24(3):189-200 · Zbl 1525.62153 · doi:10.1002/env.2200
[28] Kim HM, Mallick BK, Holmes CC (2005) Analyzing nonstationary spatial data using piecewise Gaussian processes. J Am Stat Assoc 100:653-668 · Zbl 1117.62368 · doi:10.1198/016214504000002014
[29] Konomi BA, Sang H, Mallick BK (2014) Adaptive Bayesian nonstationary modeling for large spatial datasets using covariance approximations. J Comput Graph Stat 23(3):802-829 · doi:10.1080/10618600.2013.812872
[30] Lee HKH, Higdon D, Calder CA, Holloman CH (2005) Efficient models for correlated data via convolutions of intrinsic processes. Stat Model 5(1):53-74 · Zbl 1071.62089 · doi:10.1191/1471082X05st085oa
[31] Lemos RT, Sansó B (2009) Spatio-temporal model for mean, anomaly and trend fields of North Atlantic sea surface temperature. J Am Stat Assoc 104(485):5-18 · Zbl 1248.62172 · doi:10.1198/jasa.2009.0018
[32] Liang WWJ (2012) Bayesian nonstationary Gaussian process models via treed process convolutions. Ph.D. Thesis, Department of AMS, UCSC, Santa Cruz, 95064
[33] Montagna S (2013) On Bayesian analyses of functional regression, correlated functional data and non-homogeneous computer models. Ph.D. Thesis, Duke University, Durham, NC 27708
[34] Naish-Guzman A, Holden S (2007) The generalized FITC approximation. In: Advances in neural information processing systems, pp 1057-1064
[35] Paciorek C, Schervish MJ (2006) Spatial modelling using a new class of nonstationary covariance functions. Environmetrics 17:483-506 · doi:10.1002/env.785
[36] Sampson P, Guttorp P (1992) Nonparametric estimation of nonstationary spatial covariance structure. J Am Stat Assoc 87:108-119 · doi:10.1080/01621459.1992.10475181
[37] Sang H, Huang JZ (2012) A full scale approximation of covariance functions for large spatial data sets. J R Stat Soc Ser B 74(22):111-132 · Zbl 1411.62274 · doi:10.1111/j.1467-9868.2011.01007.x
[38] Schmidt A, O’Hagan A (2003) Bayesian inference for non-stationary spatial covariance structure via spatial deformations. J R Stat Soc Ser B 65:743-758 · Zbl 1063.62034 · doi:10.1111/1467-9868.00413
[39] Snelson E, Ghahramani Z (2005) Sparse Gaussian processes using pseudo-inputs. In: Advances in neural information processing systems, 18
[40] Taddy MA, Gramacy RB, Polson NG (2011) Dynamic trees for learning and design. J Am Stat Assoc 106(493):109-123 · Zbl 1396.62158 · doi:10.1198/jasa.2011.ap09769
[41] van Dyk DA, Park T (2008) Partially collapsed Gibbs samplers: theory and methods. J Am Stat Assoc 103(482):790-796 · Zbl 1471.62198 · doi:10.1198/016214508000000409
[42] Yang H, Liu F, Ji C, Dunson D (2014) Adaptive sampling for Bayesian geospatial models. Stat Comput 24:1101-1110 · Zbl 1332.62367 · doi:10.1007/s11222-013-9422-4
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.