Abstract
Large-scale simulations of blood flow that resolve the 3D deformation of each constituent cell are increasingly popular owing to algorithmic developments in conjunction with advances in compute capability. Among the different approaches to modeling cell-resolved hemodynamics, fluid structure interaction (FSI) algorithms based on the immersed boundary method are frequently employed to couple separate solvers for the background fluid and the cells within one framework. GPUs can accelerate these simulations; however, both current pre-exascale and future exascale CPU-GPU heterogeneous systems face communication challenges critical to performance and scalability. We describe, to our knowledge, the largest distributed GPU-accelerated FSI simulations of high-hematocrit cell-resolved flows, with over 17 million red blood cells. We compare scaling on a fat-node system with six GPUs per node and on a system with a single GPU per node. By comparing the CPU- and GPU-based implementations, we identify the costs of data movement in multiscale, multi-grid FSI simulations on heterogeneous systems and show that data movement is the greatest performance bottleneck on the GPU.
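To make the coupling pattern concrete, below is a minimal CUDA sketch of the force-spreading step of an immersed boundary method, in which each Lagrangian membrane marker scatters its force onto the surrounding Eulerian fluid grid through a regularized delta function. This is an illustration of the general technique named in the abstract, not the authors' implementation; the grid dimensions, marker layout, kernel name `spreadForces`, and the choice of Peskin's 4-point delta are assumptions made for the example. The scattered, atomic write pattern it exhibits is representative of the Lagrangian-Eulerian data movement that the paper identifies as the dominant GPU cost.

```
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical Eulerian grid dimensions for this sketch (grid spacing h = 1).
constexpr int NX = 64, NY = 64, NZ = 64;

// Peskin's 4-point regularized delta function (one 1D factor).
__host__ __device__ inline float peskin4(float r) {
    float a = fabsf(r);
    if (a < 1.0f) return 0.125f * (3.0f - 2.0f * a + sqrtf(1.0f + 4.0f * a - 4.0f * a * a));
    if (a < 2.0f) return 0.125f * (5.0f - 2.0f * a - sqrtf(-7.0f + 12.0f * a - 4.0f * a * a));
    return 0.0f;
}

// One thread per Lagrangian marker: spread its force to the 4x4x4 support
// of the delta kernel on the fluid grid.
__global__ void spreadForces(const float3* xm, const float3* fm, int nMarkers,
                             float3* fGrid) {
    int m = blockIdx.x * blockDim.x + threadIdx.x;
    if (m >= nMarkers) return;
    float3 x = xm[m], f = fm[m];
    int i0 = (int)floorf(x.x) - 1, j0 = (int)floorf(x.y) - 1, k0 = (int)floorf(x.z) - 1;
    for (int k = 0; k < 4; ++k)
      for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i) {
            int I = i0 + i, J = j0 + j, K = k0 + k;
            if (I < 0 || I >= NX || J < 0 || J >= NY || K < 0 || K >= NZ) continue;
            float w = peskin4(x.x - I) * peskin4(x.y - J) * peskin4(x.z - K);
            int idx = (K * NY + J) * NX + I;
            // Atomic adds: many markers may touch the same fluid node, so the
            // spreading step is a scattered, contended write across the grid.
            atomicAdd(&fGrid[idx].x, w * f.x);
            atomicAdd(&fGrid[idx].y, w * f.y);
            atomicAdd(&fGrid[idx].z, w * f.z);
        }
}

int main() {
    const int nMarkers = 1 << 14;                      // illustrative marker count
    std::vector<float3> xm(nMarkers), fm(nMarkers);
    for (int m = 0; m < nMarkers; ++m) {
        // Markers placed on a ring inside the grid, each carrying a unit force.
        xm[m] = make_float3(32.0f + 8.0f * sinf(0.01f * m),
                            32.0f + 8.0f * cosf(0.01f * m), 32.0f);
        fm[m] = make_float3(0.0f, 0.0f, 1.0f);
    }
    float3 *dX, *dF, *dGrid;
    cudaMalloc(&dX, nMarkers * sizeof(float3));
    cudaMalloc(&dF, nMarkers * sizeof(float3));
    cudaMalloc(&dGrid, NX * NY * NZ * sizeof(float3));
    cudaMemcpy(dX, xm.data(), nMarkers * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(dF, fm.data(), nMarkers * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemset(dGrid, 0, NX * NY * NZ * sizeof(float3));
    spreadForces<<<(nMarkers + 255) / 256, 256>>>(dX, dF, nMarkers, dGrid);
    cudaDeviceSynchronize();
    printf("spread %d marker forces onto a %dx%dx%d grid\n", nMarkers, NX, NY, NZ);
    cudaFree(dX); cudaFree(dF); cudaFree(dGrid);
    return 0;
}
```

In a distributed multi-grid setting, both the marker arrays and the fluid grid are partitioned across ranks, so each spreading (and the corresponding interpolation) step also implies halo exchange between GPUs; that traffic, rather than the arithmetic, is the communication cost the abstract refers to.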
| Original language | English (US) |
|---|---|
| Article number | 101153 |
| Journal | Journal of Computational Science |
| Volume | 44 |
| DOIs | |
| State | Published - Jul 2020 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- General Computer Science
- Modeling and Simulation
Keywords
- Distributed parallelization
- Fluid structure interaction
- GPU
- Immersed boundary method
- Lattice Boltzmann method