TY - GEN
T1 - Unbalanced Parallel I/O
T2 - 7th International Workshop on Data Analysis and Reduction for Big Scientific Data, DRBSD-7 2021
AU - Wang, Xinying
AU - Wan, Lipeng
AU - Chen, Jieyang
AU - Gong, Qian
AU - Whitney, Ben
AU - Wang, Jinzhen
AU - Gainaru, Ana
AU - Liu, Qing
AU - Podhorszki, Norbert
AU - Zhao, Dongfang
AU - Yan, Feng
AU - Klasky, Scott
N1 - Funding Information:
This work was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration, and National Science Foundation grants CAREER-2048044 and IIS-1838024. This research used resources of the Oak Ridge Leadership Computing Facility, a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. Furthermore, the research in this project was also supported by the SIRIUS-2 ASCR research project and the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory (ORNL). We thank the anonymous reviewers for their insightful comments.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Lossy compression techniques have shown promising results in significantly reducing the size of scientific data while guaranteeing compression error bounds. However, one important yet often neglected side effect of lossy scientific data compression is its impact on the performance of parallel I/O. Our key observation is that the compressed data size is often highly skewed across processes in lossy scientific compression. To understand this behavior, we conduct extensive experiments in which we apply three lossy compressors (MGARD, ZFP, and SZ), which are specifically designed and optimized for scientific data, to three real-world scientific applications (Gray-Scott simulation, WarpX, and XGC). Our analysis demonstrates that the size of the compressed data is always skewed even if the original data is evenly decomposed among processes. Such skewness exists widely across different scientific applications and compressors, as long as the information density of the data varies across processes. We then systematically study how this side effect of lossy scientific data compression impacts the performance of parallel I/O. We observe that the skewness in the sizes of the compressed data often leads to I/O imbalance, which can significantly reduce the efficiency of I/O bandwidth utilization if not properly handled. In addition, writing data concurrently to a single shared file through the MPI-IO library is more sensitive to unbalanced I/O loads. Therefore, we believe our research community should pay more attention to the unbalanced parallel I/O caused by lossy scientific data compression.
UR - http://www.scopus.com/inward/record.url?scp=85124513775&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124513775&partnerID=8YFLogxK
U2 - 10.1109/DRBSD754563.2021.00008
DO - 10.1109/DRBSD754563.2021.00008
M3 - Conference contribution
AN - SCOPUS:85124513775
T3 - Proceedings of DRBSD-7 2021: 7th International Workshop on Data Analysis and Reduction for Big Scientific Data, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 26
EP - 32
BT - Proceedings of DRBSD-7 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 November 2021
ER -