TY - GEN
T1 - Total Variation Reduction for Lossless Compression of HPC Applications
AU - Wang, Junqi
AU - Li, Yida
AU - Liu, Qing
AU - Luo, Huizhang
AU - Li, Kenli
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - With the growing size of high-performance computing (HPC) applications, a major challenge facing domain scientists is how to efficiently store and analyze the vast volume of output data. Compression can reduce the amount of data that needs to be transferred and stored. However, most large datasets are in floating-point format, which exhibits high entropy. As a result, existing lossless compressors usually achieve only a modest reduction ratio of less than 2x. To address this problem, we propose a total variation reduction method to improve the compression ratio of lossless compressors. In particular, we first try to exploit space-filling curves (SFCs), a well-known technique for preserving data locality in multi-dimensional HPC datasets. We show and explain why a raw SFC, such as the Hilbert curve or the Z-order curve, cannot improve the compression ratio. We then explore the opportunity and theoretical feasibility of the proposed total-variation-reduction-based algorithm. Experimental results show the effectiveness of the proposed method: on average, compression ratios improve by 20.6% for FPZIP and 18.4% for FPC.
KW - High-performance computing
KW - floating-point data
KW - lossless compression
UR - http://www.scopus.com/inward/record.url?scp=85127741670&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127741670&partnerID=8YFLogxK
U2 - 10.1109/SOCC52499.2021.9739467
DO - 10.1109/SOCC52499.2021.9739467
M3 - Conference contribution
AN - SCOPUS:85127741670
T3 - International System on Chip Conference
SP - 129
EP - 134
BT - Proceedings - 34th IEEE International System-on-Chip Conference, SOCC 2021
A2 - Qu, Gang
A2 - Xiong, Jinjun
A2 - Zhao, Danella
A2 - Muthukumar, Venki
A2 - Reza, Md Farhadur
A2 - Sridhar, Ramalingam
PB - IEEE Computer Society
T2 - 34th IEEE International System-on-Chip Conference, SOCC 2021
Y2 - 14 September 2021 through 17 September 2021
ER -