Preconditioned GMRES solver on multiple-GPU architecture. (English) Zbl 1359.65051

Summary: In this paper, we analyze the preconditioned GMRES algorithm in detail and decompose it into components for implementation on a multiple-GPU architecture. The vector updates, dot products, and sparse matrix-vector multiplication (SpMV) are implemented in parallel. In addition, a dedicated communication mechanism is designed for SpMV. The preconditioner is constructed on the host (CPU) and applied on the devices (GPUs). A series of numerical experiments shows that the GPU-based GMRES solver is effective and achieves favorable parallel performance.
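
The decomposition named in the summary (vector updates, dot products, SpMV, and a preconditioner application) can be illustrated with a minimal serial sketch. It is not the paper's multi-GPU implementation: it runs on the CPU in plain Python, uses right-preconditioned restarted GMRES with Givens rotations, and the function names (`spmv`, `dot`, `axpy`, `gmres`, `apply_prec`) and the CSR argument layout are hypothetical choices made for this example. The summary does not say which preconditioner the paper uses, so the caller supplies one as a function.

```python
import math

def spmv(indptr, indices, data, x):
    """y = A x for A stored in CSR form (row pointers, column indices, values)."""
    return [sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
            for i in range(len(indptr) - 1)]

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def axpy(alpha, x, y):
    """Vector update: returns alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]

def gmres(indptr, indices, data, b, apply_prec, restart=20, tol=1e-10, max_restarts=50):
    """Right-preconditioned GMRES(restart); apply_prec(v) should approximate M^{-1} v."""
    n = len(b)
    x = [0.0] * n
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for _ in range(max_restarts):
        r = axpy(-1.0, spmv(indptr, indices, data, x), b)   # r = b - A x
        beta = math.sqrt(dot(r, r))
        if beta / b_norm < tol:
            break
        V = [[ri / beta for ri in r]]   # Krylov basis vectors
        H = []                          # rotated Hessenberg columns (upper triangular)
        cs, sn = [], []                 # Givens rotation coefficients
        g = [beta]                      # rotated right-hand side of the least-squares problem
        for j in range(restart):
            w = spmv(indptr, indices, data, apply_prec(V[j]))
            h = []
            for i in range(j + 1):      # modified Gram-Schmidt orthogonalization
                hij = dot(w, V[i])
                h.append(hij)
                w = axpy(-hij, V[i], w)
            h_next = math.sqrt(dot(w, w))
            for i in range(j):          # apply the earlier rotations to the new column
                t = cs[i] * h[i] + sn[i] * h[i + 1]
                h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
                h[i] = t
            d = math.hypot(h[j], h_next)        # zero only if A M^{-1} is singular
            c, s = h[j] / d, h_next / d
            cs.append(c)
            sn.append(s)
            h[j] = d                    # new rotation eliminates h_next
            g.append(-s * g[j])
            g[j] = c * g[j]
            H.append(h)
            if abs(g[j + 1]) / b_norm < tol or h_next == 0.0:
                break
            V.append([wi / h_next for wi in w])
        k = len(H)
        y = [0.0] * k
        for i in range(k - 1, -1, -1):  # back substitution on the triangular system
            y[i] = (g[i] - sum(H[m][i] * y[m] for m in range(i + 1, k))) / H[i][i]
        u = [0.0] * n
        for m in range(k):
            u = axpy(y[m], V[m], u)
        x = axpy(1.0, apply_prec(u), x)  # x = x + M^{-1} (V y)
    return x

# Usage: a 3x3 nonsymmetric system with a Jacobi (diagonal) preconditioner.
# The Jacobi choice is an assumption made only to keep the example small.
indptr, indices = [0, 2, 5, 7], [0, 1, 0, 1, 2, 1, 2]
data = [4.0, 1.0, 2.0, 5.0, 1.0, 1.0, 3.0]
diag = [4.0, 5.0, 3.0]
b = [6.0, 15.0, 11.0]
x = gmres(indptr, indices, data, b, lambda v: [vi / di for vi, di in zip(v, diag)])
```

In this sketch every floating-point operation of the iteration passes through `spmv`, `dot`, `axpy`, or `apply_prec`, which is why parallelizing those kernels, as the summary describes, covers the bulk of the solver's work.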

MSC:

65F10 Iterative numerical methods for linear systems
65Y05 Parallel numerical computation
65F08 Preconditioners for iterative methods
65F50 Computational methods for sparse matrices
65Y10 Numerical algorithms for specific classes of architectures

Software:

SparseMatrix; CUDA

References:

[1] Chen, Z.; Huan, G.; Ma, Y., Computational Methods for Multiphase Flows in Porous Media, Computational Science and Engineering Series, vol. 2 (2006), SIAM: Philadelphia · Zbl 1092.76001
[2] Saad, Y., Iterative Methods for Sparse Linear Systems (2003), SIAM · Zbl 1002.65042
[3] Barrett, R.; Berry, M.; Chan, T. F.; Demmel, J.; Donato, J.; Dongarra, J.; Eijkhout, V.; Pozo, R.; Romine, C.; Van der Vorst, H., Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods (1994), SIAM
[5] Van der Vorst, H. A., Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13, 2, 631-644 (1992) · Zbl 0761.65023
[6] Li, R.; Saad, Y., GPU-accelerated Preconditioned Iterative Linear Solvers, Technical Report umsi-2010-112 (2010), Minnesota Supercomputer Institute, University of Minnesota: Minneapolis, MN
[13] Chen, Z.; Zhang, Y., Development, analysis and numerical tests of a compositional reservoir simulator, Int. J. Numer. Anal. Model., 4, 86-100 (2008) · Zbl 1242.76319
[17] Zhang, P.; Gao, Y., Matrix multiplication on high-density multi-GPU architectures: Theoretical and experimental investigations, Lecture Notes in Comput. Sci., 9137, 17-30 (2015)
[18] Chen, Z.; Liu, H.; Yang, B., Accelerating preconditioned iterative linear solvers on GPU, Int. J. Numer. Anal. Model. Ser. B, 5, 1-2, 136-146 (2014) · Zbl 1463.65051
[21] Haase, G.; Liebmann, M.; Douglas, C. C.; Plank, G., A parallel algebraic multigrid solver on graphics processing units, High Perform. Comput. Appl., 38-47 (2010)
[22] Bolz, J.; Farmer, I.; Grinspun, E.; Schröder, P., Sparse matrix solvers on the GPU: Conjugate gradients and multigrid, ACM Trans. Graph., 22, 3, 917-924 (2003)
[23] Buatois, L.; Caumon, G.; Lévy, B., Concurrent number cruncher: An efficient sparse linear solver on the GPU, High Perform. Comput. Commun., 4782, 358-371 (2007)
[24] Göddeke, D.; Strzodka, R.; Mohd-Yusof, J.; McCormick, P.; Wobker, H.; Becker, C.; Turek, S., Using GPUs to improve multigrid solver performance on a cluster, Int. J. Comput. Sci. Eng., 4, 1, 36-55 (2008)
[25] Brannick, J.; Chen, Y.; Hu, X.; Zikatanov, L., Parallel unsmoothed aggregation algebraic multigrid algorithms on GPUs, (Springer Proceedings in Mathematics and Statistics, vol. 45 (2013)), 81-102 · Zbl 1275.65084
[26] Wang, L.; Hu, X.; Cohen, J.; Xu, J., A parallel auxiliary grid algebraic multigrid method for graphic processing unit, SIAM J. Sci. Comput., 35, 3, 263-283 (2013)
[29] Liu, H.; Yu, S.; Chen, Z.; Hsieh, B.; Shao, L., Sparse matrix-vector multiplication on NVIDIA GPU, Int. J. Numer. Anal. Model. Ser. B, 3, 2, 185-191 (2012) · Zbl 1260.65039
[30] Cai, X.-C.; Sarkis, M., A restricted additive Schwarz preconditioner for general sparse linear systems, SIAM J. Sci. Comput., 21, 792-797 (1999) · Zbl 0944.65031
[31] Karypis, G.; Kumar, V., A fast and high quality multilevel scheme for partitioning irregular graphs, SIAM J. Sci. Comput., 20, 1, 359-392 (1999) · Zbl 0915.68129
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.