Accelerated finite element elastodynamic simulations using the GPU. (English) Zbl 1349.74324
Summary: An approach is developed to perform explicit time domain finite element simulations of elastodynamic problems on the graphics processing unit (GPU), using Nvidia’s CUDA. Of critical importance for this problem is the arrangement of nodes in memory, which allows data to be loaded efficiently and minimises communication between the independently executed blocks of threads. The initial stage of memory arrangement is partitioning the mesh; both a well-established ‘greedy’ partitioner and a new, more efficient ‘aligned’ partitioner are investigated. A method is then developed to efficiently arrange the memory within each partition. The software is applied to three models from the fields of non-destructive testing, vibrations and geophysics, demonstrating a memory bandwidth very close to the card’s maximum, reflecting the bandwidth-limited nature of the algorithm. Comparison with Abaqus, a widely used commercial CPU equivalent, validated the accuracy of the results and demonstrated a speed improvement of around two orders of magnitude. A software package, Pogo, incorporating these developments, is released open source, downloadable from http://www.pogo-fea.com/ to benefit the community.
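Illustrative note (not part of the original summary): the sketch below shows why an explicit time domain scheme maps well onto independent GPU threads and why it tends to be limited by memory bandwidth. It assumes the generic central-difference update with a lumped (diagonal) mass matrix on a one-dimensional chain of equal springs; the summary does not state which scheme Pogo uses, and the function name `step_explicit` and all of its parameters are hypothetical. It is plain Python, not CUDA, and is not taken from Pogo.

```python
def step_explicit(u, u_prev, k, m, dt, f):
    """Advance nodal displacements by one central-difference time step.

    Assumed model (hypothetical, for illustration only): a 1D chain of
    nodes joined by identical springs of stiffness k, lumped nodal
    masses m[i], free ends, and external nodal forces f[i].
    """
    n = len(u)
    u_next = [0.0] * n
    for i in range(n):
        # Internal force on node i comes only from its direct neighbours,
        # so every node can be updated independently (one thread per node
        # in a GPU setting), reading a handful of neighbouring values.
        left = k * (u[i - 1] - u[i]) if i > 0 else 0.0
        right = k * (u[i + 1] - u[i]) if i < n - 1 else 0.0
        accel = (f[i] + left + right) / m[i]
        # Central difference: u_next = 2 u - u_prev + dt^2 * a
        u_next[i] = 2.0 * u[i] - u_prev[i] + dt * dt * accel
    return u_next


if __name__ == "__main__":
    # A displaced middle node released from rest spreads to its neighbours.
    u0 = [0.0, 1.0, 0.0]
    print(step_explicit(u0, u0, k=1.0, m=[1.0, 1.0, 1.0], dt=0.5, f=[0.0, 0.0, 0.0]))
```

Each update performs only a few arithmetic operations per value loaded, which is consistent with the summary's remark that the algorithm is bandwidth-limited and that the arrangement of nodes in memory matters.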

74S05 Finite element methods applied to problems in solid mechanics
65M60 Finite element, Rayleigh-Ritz and Galerkin methods for initial value and initial-boundary value problems involving PDEs
65Y10 Numerical algorithms for specific classes of architectures
[1] CUDA C programming guide
[2] Anderson, J. A.; Lorenz, C. D.; Travesset, A., General purpose molecular dynamics simulations fully implemented on graphics processing units, J. Comput. Phys., 227, 10, 5342-5359, (2008) · Zbl 1148.81301
[3] Joldes, G. R.; Wittek, A.; Miller, K., Real-time nonlinear finite element computations on GPU - application to neurosurgical simulation, Comput. Methods Appl. Mech. Eng., 199, 49, 3305-3314, (2010) · Zbl 1225.92021
[4] Cecka, C.; Lew, A. J.; Darve, E., Assembly of finite element methods on graphics processors, Int. J. Numer. Methods Eng., 85, 5, 640-669, (2011) · Zbl 1217.80146
[5] Lengyel, J.; Reichert, M.; Donald, B. R.; Greenberg, D. P., Real-time robot motion planning using rasterizing computer graphics hardware, SIGGRAPH Comput. Graph., 24, 4, 327-335, (1990)
[6] Hoff, K. E.; Keyser, J.; Lin, M.; Manocha, D.; Culver, T., Fast computation of generalized Voronoi diagrams using graphics hardware, (Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ʼ99, (1999), ACM Press/Addison-Wesley Publishing Co. New York, NY, USA), 277-286
[7] Woo, M.; Neider, J.; Davis, T.; Shreiner, D., OpenGL programming guide: the official guide to learning OpenGL, (1999), Addison-Wesley Longman Publishing Co., Inc., version 1.2
[8] Trendall, C.; Stewart, J., General calculations using graphics hardware, with application to interactive caustics, (Peroche, B.; Rushmeier, H., Rendering Techniques ʼ00, Eurographics, (2000), Springer-Verlag Wien, New York, Brno, Czech Republic)
[9] Harris, M. J.; Coombe, G.; Scheuermann, T.; Lastra, A., Physically-based visual simulation on graphics hardware, (Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, HWWS ʼ02, (2002), Eurographics Association Aire-la-Ville, Switzerland), 109-118
[10] Macedonia, M., The GPU enters computingʼs mainstream, Computer, 36, 10, 106-108, (2003)
[11] Krüger, J.; Westermann, R., Linear algebra operators for GPU implementation of numerical algorithms, ACM Trans. Graph., 22, 3, 908-916, (2003)
[12] Bolz, J.; Farmer, I.; Grinspun, E.; Schröder, P., Sparse matrix solvers on the GPU: conjugate gradients and multigrid, ACM Trans. Graph., 22, 3, 917-924, (2003)
[13] Thompson, C. J.; Hahn, S.; Oskin, M., Using modern graphics architectures for general-purpose computing: a framework and analysis, (Proceedings of the 35th Annual ACM/IEEE International Symposium on Microarchitecture, MICRO 35, (2002), IEEE Computer Society Press Los Alamitos, CA, USA), 306-317
[14] Buck, I.; Foley, T.; Horn, D.; Sugerman, J.; Fatahalian, K.; Houston, M.; Hanrahan, P., Brook for GPUs: stream computing on graphics hardware, ACM Trans. Graph., 23, 3, 777-786, (2004)
[15] Bayoumi, A. M.; Chu, M.; Hanafy, Y. Y.; Harrell, P.; Refai-Ahmed, G., Scientific and engineering computing using ATI stream technology, Comput. Sci. Eng., 11, 6, 92-97, (2009)
[16] Munshi, A., The OpenCL specification v1.2, (November 2012)
[17] Balevic, A.; Rockstroh, L.; Tausendfreund, A.; Patzelt, S.; Goch, G.; Simon, S., Accelerating simulations of light scattering based on finite-difference time-domain method with general purpose GPUs, (11th IEEE International Conference on Computational Science and Engineering, (2008), IEEE), 327-334
[18] De Donno, D.; Esposito, A.; Tarricone, L.; Catarinucci, L., Introduction to GPU computing and CUDA programming: A case study on FDTD [EM programmerʼs notebook], IEEE Antennas Propag. Mag., 52, 3, 116-122, (2010)
[19] Michéa, D.; Komatitsch, D., Accelerating a three-dimensional finite-difference wave propagation code using GPU graphics cards, Geophys. J. Int., 182, 1, 389-402, (2010)
[20] Cangellaris, A. C.; Wright, D. B., Analysis of the numerical error caused by the stair-stepped approximation of a conducting boundary in FDTD simulations of electromagnetic phenomena, IEEE Trans. Antennas Propag., 39, 10, 1518-1525, (1991)
[21] Drozdz, M., Efficient finite element modelling of ultrasound waves in elastic media, (2008), Imperial College London, Ph.D. thesis
[22] Huthwaite, P.; Simonetti, F.; Lowe, M. J.S., On the convergence of finite element scattering simulations, (AIP Conference Proceedings, vol. 1211, (2010)), 65
[23] Fan, Z.; Qiu, F.; Kaufman, A.; Yoakum-Stover, S., GPU cluster for high performance computing, (Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, (2004), IEEE Computer Society Washington, DC, USA), 47
[24] Liu, K.; Wang, X.-B.; Zhang, Y.; Liao, C., Acceleration of time-domain finite element method (TD-FEM) using graphics processor units (GPU), (7th International Symposium on Antennas, Propagation EM Theory, (2006)), 1-4
[25] Göddeke, D.; Strzodka, R.; Turek, S., Accelerating double precision FEM simulations with GPUs, (Proceedings of ASIM - 18th Symposium on Simulation Technique, (2005))
[26] Göddeke, D.; Strzodka, R.; Mohd-Yusof, J.; McCormick, P.; Buijssen, S. H.M.; Grajewski, M.; Turek, S., Exploring weak scalability for FEM calculations on a GPU-enhanced cluster, Parallel Comput., 33, 10-11, 685-699, (2007)
[27] Göddeke, D.; Strzodka, R.; Turek, S., Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations, Int. J. Parallel Emerg. Distrib. Syst., 22, 4, 221-256, (2007) · Zbl 1188.68084
[28] Göddeke, D., Fast and accurate finite-element multigrid solvers for PDE simulations on GPU clusters, (May 2010), Technische Universität Dortmund, Fakultät für Mathematik, Ph.D. thesis
[29] Turek, S.; Göddeke, D.; Buijssen, S. H.M.; Wobker, H., Hardware-oriented multigrid finite element solvers on GPU-accelerated clusters, (Kurzak, J.; Bader, D. A.; Dongarra, J. J., Scientific Computing with Multicore and Accelerators, (2010), CRC Press), Ch. 6
[30] Geveler, M.; Ribbrock, D.; Göddeke, D.; Zajac, P.; Turek, S., Towards a complete FEM-based simulation toolkit on GPUs: unstructured grid finite element geometric multigrid solvers with strong smoothers based on sparse approximate inverses, Comput. Fluids, 80, 327-332, (2013) · Zbl 1284.76249
[31] Göddeke, D.; Becker, C.; Turek, S., Integrating GPUs as fast co-processors into the parallel FE package FEAST, (Becker, M.; Szczerbicka, H., 19th Symposium Simulationstechnique (ASIMʼ06), Frontiers in Simulation, (2006)), 277-282
[32] Comas, O.; Taylor, Z. A.; Allard, J.; Ourselin, S.; Cotin, S.; Passenger, J., Efficient nonlinear FEM for soft tissue modelling and its GPU implementation within the open source framework SOFA, (Bello, F.; Edwards, P., Biomedical Simulation, Lecture Notes in Computer Science, vol. 5104, (2008), Springer Berlin, Heidelberg), 28-39
[33] Dick, C.; Georgii, J.; Westermann, R., A real-time multigrid finite hexahedra method for elasticity simulation using CUDA, Simul. Model. Pract. Theory, 19, 2, 801-816, (2011)
[34] Komatitsch, D.; Michéa, D.; Erlebacher, G., Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA, J. Parallel Distrib. Comput., 69, 5, 451-460, (2009)
[35] Klöckner, A.; Warburton, T.; Bridge, J.; Hesthaven, J. S., Nodal discontinuous Galerkin methods on graphics processors, J. Comput. Phys., 228, 21, 7863-7882, (2009) · Zbl 1175.65111
[36] Hsieh, S. H.; Paulino, G. H.; Abel, J. F., Evaluation of automatic domain partitioning algorithms for parallel finite element analysis, Int. J. Numer. Methods Eng., 40, 6, 1025-1051, (1997) · Zbl 0889.73067
[37] Lord, W.; Ludwig, R.; You, Z., Developments in ultrasonic modeling with finite element analysis, J. Nondestruct. Eval., 9, 2, 129-143, (1990)
[38] Lin, Y.; Sansalone, M.; Carino, N. J., Finite element studies of the impact-echo response of plates containing thin layers and voids, J. Nondestruct. Eval., 9, 1, 27-47, (1990)
[39] Moser, F.; Jacobs, L. J.; Qu, J., Modeling elastic wave propagation in waveguides with the finite element method, Nondestruct. Test. Eval. Int., 32, 4, 225-234, (1999)
[40] Baskaran, G.; Rao, C. L.; Balasubramaniam, K., Simulation of the TOFD technique using the finite element method, Insight, 49, 11, 641-646, (2007)
[41] Moczo, P.; Kristek, J.; Galis, M.; Pazak, P.; Balazovjech, M., The finite-difference and finite-element modeling of seismic wave propagation and earthquake motion, Acta Phys. Slovaca, 57, 2, 177-406, (2007)
[42] Cook, R., Concepts and applications of finite element analysis, (2007), John Wiley & Sons
[43] Bathe, K. J.; Wilson, E. L., Numerical methods in finite element analysis, (1976), Prentice-Hall Englewood Cliffs, NJ
[44] Cuthill, E.; McKee, J., Reducing the bandwidth of sparse symmetric matrices, (Proceedings of the 1969 24th National Conference, (1969), ACM), 157-172
[45] Cavendish, J. C.; Field, D. A.; Frey, W. H., An approach to automatic three-dimensional finite element mesh generation, Int. J. Numer. Methods Eng., 21, 2, 329-347, (1985) · Zbl 0573.65090
[46] Jin, H.; Wiberg, N. E., Two-dimensional mesh generation, adaptive remeshing and refinement, Int. J. Numer. Methods Eng., 29, 7, 1501-1526, (1990)
[47] George, P. L.; Seveno, E., The advancing-front mesh generation method revisited, Int. J. Numer. Methods Eng., 37, 21, 3605-3619, (1994) · Zbl 0816.76045
[48] Dassault Systèmes Simulia Corp., Abaqus 6.11 documentation
[49] Rajagopal, P.; Drozdz, M.; Skelton, E.; Lowe, M. J.S.; Craster, R., On the use of absorbing layers to simulate the propagation of elastic waves in unbounded isotropic media using commercially available finite element packages, NDT & E International
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.