Quality-aware data management for large scale scientific applications

Hongbo Zou, Fang Zheng, Matthew Wolf, Greg Eisenhauer, Karsten Schwan, Hasan Abbasi, Qing Liu, Norbert Podhorszki, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Increasingly large-scale simulations are generating an unprecedented amount of output data, causing researchers to explore new 'data staging' methods that buffer, use, and/or reduce such data online rather than simply pushing it to disk. Leveraging the capabilities of data staging, this study explores the potential for data reduction via online data compression, first using general compression techniques and then proposing use-specific methods that permit users to define simple data queries that cause only the data identified by those queries to be emitted. Using online methods for code generation and deployment, with such dynamic data queries, end users can precisely identify the quality of information (QoI) of their output data by explicitly determining what data may be lost vs. retained, in contrast to general-purpose lossy compression methods that do not provide such levels of control. The paper also describes the key elements of a quality-aware data management system (QADMS) for high-end machines enabled by this approach. Initial experimental results demonstrate that QADMS can effectively reduce data movement cost and improve quality of service (QoS) while meeting the QoI constraints stated by users.
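The contrast the abstract draws between general compression and query-driven reduction can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names (`pack`, `general_compress`, `query_reduce`, `pack_selected`), the synthetic `field` data, and the threshold query are hypothetical and are not the QADMS interface, and `zlib` merely stands in for "a general compression technique".

```python
import math
import struct
import zlib
from typing import Callable, List, Tuple


def pack(values: List[float]) -> bytes:
    """Serialize doubles as little-endian bytes (stand-in for a staged output buffer)."""
    return struct.pack(f"<{len(values)}d", *values)


def general_compress(values: List[float]) -> bytes:
    """General-purpose lossless compression of the whole buffer."""
    return zlib.compress(pack(values))


def query_reduce(values: List[float],
                 predicate: Callable[[float], bool]) -> List[Tuple[int, float]]:
    """Emit only the (index, value) pairs identified by the user's query."""
    return [(i, v) for i, v in enumerate(values) if predicate(v)]


def pack_selected(selected: List[Tuple[int, float]]) -> bytes:
    """Serialize the selected pairs: all 32-bit indices first, then all doubles."""
    n = len(selected)
    indices = struct.pack(f"<{n}I", *(i for i, _ in selected))
    data = struct.pack(f"<{n}d", *(v for _, v in selected))
    return indices + data


# Hypothetical simulation output: a smooth synthetic field of 10,000 samples.
field = [math.sin(0.01 * i) for i in range(10_000)]

raw = pack(field)
compressed = general_compress(field)

# Hypothetical user query: retain only samples near the peaks.
selected = query_reduce(field, lambda v: v > 0.99)
reduced = pack_selected(selected)

print(f"raw: {len(raw)} B, zlib: {len(compressed)} B, query-reduced: {len(reduced)} B")
```

In this sketch the query makes the loss explicit: every retained sample satisfies the predicate exactly, and everything else is dropped by the user's own choice, which is the kind of control over what is lost vs. retained that the abstract attributes to use-specific methods.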

Original language: English (US)
Title of host publication: Proceedings - 2012 SC Companion
Subtitle of host publication: High Performance Computing, Networking Storage and Analysis, SCC 2012
Pages: 816-820
Number of pages: 5
State: Published - 2012
Externally published: Yes
Event: 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012 - Salt Lake City, UT, United States
Duration: Nov 10 2012 → Nov 16 2012

Publication series

Name: Proceedings - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012

Other

Other: 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
Country/Territory: United States
City: Salt Lake City, UT
Period: 11/10/12 → 11/16/12

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software

Keywords

  • Data management
  • HPC simulation
  • compression
  • quality of information
  • visualization
