Embedded ensemble propagation for improving performance, portability, and scalability of uncertainty quantification on emerging computational architectures. (English) Zbl 1365.65017

Summary: Quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar, and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability, and scalability for the approach applied to the simulation of partial differential equations on a variety of multicore and manycore architectures, including up to 16,384 cores on a Cray XK7 (Titan).
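The template-based idea in the summary can be illustrated with a minimal sketch. It is not the Trilinos/Stokhos implementation: the `Ensemble` class and the `csr_matvec` function are hypothetical names introduced here for illustration only. The sketch assumes that all samples in an ensemble share the same sparse matrix structure, so a kernel written once against a generic scalar type reads the row pointers and column indices once per ensemble rather than once per sample, and the inner loops over samples are contiguous and easy to vectorize.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ensemble scalar (illustration only): stores N sample values
// and applies every arithmetic operation to all of them in one pass.
template <typename T, std::size_t N>
class Ensemble {
 public:
  Ensemble() : v_{} {}             // all samples zero-initialized
  Ensemble(T x) { v_.fill(x); }    // broadcast a constant to every sample

  T& operator[](std::size_t i) { assert(i < N); return v_[i]; }
  const T& operator[](std::size_t i) const { assert(i < N); return v_[i]; }

  Ensemble& operator+=(const Ensemble& b) {
    for (std::size_t i = 0; i < N; ++i) v_[i] += b.v_[i];
    return *this;
  }
  friend Ensemble operator*(const Ensemble& a, const Ensemble& b) {
    Ensemble r;
    for (std::size_t i = 0; i < N; ++i) r.v_[i] = a.v_[i] * b.v_[i];
    return r;
  }

 private:
  std::array<T, N> v_;  // sample values stored contiguously
};

// CSR sparse matrix-vector product written once against a generic Scalar.
// With Scalar = double it handles one sample; with Scalar = Ensemble<double, N>
// it handles N samples while traversing row_ptr and col_idx only once.
template <typename Scalar>
std::vector<Scalar> csr_matvec(const std::vector<std::size_t>& row_ptr,
                               const std::vector<std::size_t>& col_idx,
                               const std::vector<Scalar>& values,
                               const std::vector<Scalar>& x) {
  const std::size_t n_rows = row_ptr.empty() ? 0 : row_ptr.size() - 1;
  std::vector<Scalar> y(n_rows, Scalar(0.0));
  for (std::size_t row = 0; row < n_rows; ++row) {
    for (std::size_t k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
      y[row] += values[k] * x[col_idx[k]];  // index loaded once per ensemble
    }
  }
  return y;
}
```

Calling `csr_matvec<Ensemble<double, 4>>` with ensemble-valued `values` and `x` would propagate four samples through the same kernel that `csr_matvec<double>` applies to a single sample. The ensemble size is a compile-time constant here so that the per-sample loops have a fixed trip count; that is a design choice of this sketch, not a statement about the paper's implementation.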

MSC:

65C30 Numerical solutions to stochastic differential and integral equations
60H15 Stochastic partial differential equations (aspects of stochastic analysis)
60H35 Computational methods for stochastic equations (aspects of stochastic analysis)
35R60 PDEs with randomness, stochastic partial differential equations
65Y05 Parallel numerical computation
