×

Parallel-vector algorithms for particle simulations on shared-memory multiprocessors. (English) Zbl 1370.70001

Summary: Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. However, shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) paring of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton’s third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.

MSC:

70-08 Computational methods for problems pertaining to mechanics of particles and systems
70F99 Dynamics of a system of particles, including celestial mechanics
68W10 Parallel algorithms in computer science
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Benz, W., Computer Physics Communications, 48, 97 (1988)
[2] Brunini, A.; Santamaría, P.; Viturro, H.; Cionco, R., Planetary and Space Science, 55, 2121 (2007)
[3] Ristow, G., Annual Reviews of Computational Physics I, 1, 275 (1994)
[4] Karplus, M.; McCammon, J., Nature Structural Biology, 9, 646 (2002)
[5] Park, J. H.; Lee, D.; Kim, H.; Yang, Y. S., Ocean Engineering, 31, 1537 (2004)
[6] Koo, C.; Chern, J., International Journal of Rock Mechanics and Mining Sciences, 35, 683 (1998)
[7] Wu, J. H.; Lin, J. S.; Chen, C. S., International Journal of Rock Mechanics and Mining Sciences, 46, 397 (2009)
[8] Cundall, P. A.; Strack, O. D.L., Geotechnique, 29, 47 (1979)
[9] Sakaguchi, H.; Ozaki, E.; Igarashi, T., International Journal of Modern Physics B, 7, 1949 (1993)
[10] Tijskens, E.; Ramon, H.; Baerdemaeker, J., Journal of Sound and Vibration, 266, 493 (2003)
[11] Zhu, H.; Zhou, Z.; Yang, R.; Yu, A., Chemical Engineering Science, 63, 5728 (2008)
[12] Cleary, P., Engineering Computations: International Journal for Computer-Aided Engineering, 26, 698 (2009)
[13] Melo, F.; Umbanhowar, P.; Swinney, H., Physical Review Letters, 75, 3838 (1995)
[14] A. Carrillo, D. Horner, J. Peters, J. West, in: Proceedings of the 1996 ACM/IEEE Conference on Supercomputing (CDROM), 1996, p. 51.; A. Carrillo, D. Horner, J. Peters, J. West, in: Proceedings of the 1996 ACM/IEEE Conference on Supercomputing (CDROM), 1996, p. 51.
[15] Kohring, G., Computational Methods in Applied Sciences, 96, 190 (1996)
[16] Dowding, C. H.; Dmytryshyn, O.; Belytschko, T. B., Computers and Geotechnics, 25, 281 (1999)
[17] Carrión Schäfer, B.; Quigley, S.; Chan, A., Computers and Structures, 82, 1707 (2004)
[18] Sawley, M.; Cleary, P., EPFL Supercomputing Review, 11, 23 (1999)
[19] Maknickas, A.; Kačeniauskas, A.; Kačianauskas, R.; Balevičius, R.; Džiugys, A., Informatica, 17, 207 (2006)
[20] Chang, W.; Hsieh, S., Journal of the Chinese Institute of Engineers, 32, 825 (2009)
[21] Walther, J.; Sbalzarini, I., Engineering Computations: International Journal for Computer-Aided Engineering, 26, 688 (2009)
[22] Kohring, G., Parallel Computing, 21, 683 (1995)
[23] Fleissner, F.; Eberhard, P., Advances in Parallel Computing, 15, 37 (2008)
[24] Yang, J.; Wang, Y.; Chen, Y., Journal of Computational Physics, 221, 799 (2007)
[25] Anderson, J.; Lorenz, C.; Travesset, A., Journal of Computational Physics, 227, 5342 (2008)
[26] Friedrichs, M.; Eastman, P.; Vaidyanathan, V.; Houston, M.; Legrand, S.; Beberg, A.; Ensign, D.; Bruns, C.; Pande, V., Journal of Computational Chemistry, 30, 864 (2009)
[27] Preis, T.; Virnau, P.; Paul, W.; Schneider, J. J., Journal of Computational Physics, 228, 4468 (2009) · Zbl 1167.82347
[28] Dynerman, D.; Butzlaff, E.; Mitchell, J., Journal of Computational Biology, 16, 523 (2009)
[29] Stone, J.; Saam, J.; Hardy, D.; Vandivort, K.; Hwu, W.; Schulten, K., ACM International Conference Proceeding Series, 383, 9 (2009)
[30] Moore, G., Proceedings of the IEEE, 86, 82 (1998)
[31] A.M.D. Inc., The future is fusion; The industry-changing impact of accelerated computing, 2008. <http://sites.amd.com/us/Documents/AMD_fusion_Whitepaper.pdf>; A.M.D. Inc., The future is fusion; The industry-changing impact of accelerated computing, 2008. <http://sites.amd.com/us/Documents/AMD_fusion_Whitepaper.pdf>
[32] Seiler, L.; Carmean, D.; Sprangle, E.; Forsyth, T.; Abrash, M.; Dubey, P.; Junkins, S.; Lake, A.; Sugerman, J.; Cavin, R., Larrabee: a many-core x86 architecture for visual computing, (ACM SIGGRAPH 2008 (2008), ACM), 18
[33] Plimpton, S., Journal of Computational Physics, 117, 1 (1995)
[34] Nguyen, H., GPU Gems 3 (2007), Addison-Wesley Professional
[35] Blelloch, G.; Leiserson, C.; Maggs, B.; Plaxton, C.; Smith, S.; Zagha, M., Theory of Computing Systems, 31, 135 (1998) · Zbl 0895.68066
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.