zPerf: A Statistical Gray-Box Approach to Performance Modeling and Extrapolation for Scientific Lossy Compression

Jinzhen Wang, Qi Chen, Tong Liu, Qing Liu, Xubin He

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

With the scaling up of simulation-based scientific discovery on high-performance computing systems, the disparity between compute and I/O has increased, forcing domain scientists to save only a small amount of simulation data to persistent storage. This can result in the loss of essential physics fields that are needed for data analysis. While error-bounded lossy compression has made tremendous progress in bridging the gap between compute and I/O, the lack of understanding of compression performance remains a key hurdle to its wide adoption. In this work, we present zPerf, a statistical gray-box performance modeling approach for scientific lossy compression. Our contributions are fourfold: 1) We develop zPerf to estimate the performance of lossy compression techniques, based on an in-depth understanding and statistical modeling of data features and core compression metrics; 2) We detail the implementation of zPerf through two case studies, in which we derive performance models for SZ and ZFP, two leading lossy compressors; 3) We evaluate zPerf on real-world datasets across various domains and demonstrate the efficacy of its performance model; 4) We further discuss three case studies in which zPerf is applied to extrapolate the compression ratio of SZ and ZFP with alternative encoding schemes, as well as ZFP with an alternative transform scheme. Through these case studies, we demonstrate the potential of zPerf for exploring the design space of lossy compression, which has hardly been studied in the literature.

Original language: English (US)
Pages (from-to): 2641-2655
Number of pages: 15
Journal: IEEE Transactions on Computers
Volume: 72
Issue number: 9
State: Published - Sep 1 2023

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics

Keywords

  • Lossy compression
  • modeling
  • performance
