TY - GEN
T1 - HPC Campaign Management
T2 - 2025 Supercomputing Asia Conference, SCA 2025
AU - Podhorszki, Norbert
AU - Eisenhauer, Greg
AU - Kurc, Tahsin
AU - Gong, Qian
AU - Liu, Qing
AU - Chen, Jieyang
AU - Gainaru, Ana
AU - Gu, Junmin
AU - Klasky, Scott
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s)
PY - 2025/6/25
Y1 - 2025/6/25
N2 - Remote access to large-scale scientific datasets, such as those generated by combustion simulations or other high-performance computing (HPC) applications, presents a significant challenge. Downloading entire datasets is often impractical due to their size and the bandwidth limitations of typical networks. To address this challenge, we propose a novel approach that enables efficient remote access to large datasets distributed across multiple facilities. Our method enables technologies to download only the data values of a select variable, in a select region of interest, to a user-defined accuracy. For this purpose, we extended the ADIOS IO library with read functions that accept a user-defined accuracy and with a remote data server that understands multidimensional selections of specific variables, steps, and accuracy from an ADIOS dataset and uses lossy compression at the remote site to reduce the data transferred back to the client. In addition, our extension of the ADIOS library collects metadata from multiple datasets in small files called Campaign Archives, which can be shared among project participants on any HPC system, cloud, or laptop, and which facilitate the discovery of content and pointers to the data location, as well as remote access to the data by local tools as if the data were local. This feature, called Campaign Management, enables a group of scientists to manage related datasets stored in multiple files across multiple facilities as if they were in a single file/database. We demonstrate the effectiveness of our approach using a 1.5 TB dataset from the S3D combustion simulation on Frontier at the Oak Ridge Leadership Computing Facility. Even a single variable from this dataset, at 64 GB, is too large to be processed on a standard laptop. We show two different reading patterns, for 2D plots and for 3D visualization, with the careful settings that a scientist studying combustion data would choose, and show that running the same Python scripts directly on Frontier takes time comparable to running them on a local laptop with remote access to the data on Frontier.
AB - Remote access to large-scale scientific datasets, such as those generated by combustion simulations or other high-performance computing (HPC) applications, presents a significant challenge. Downloading entire datasets is often impractical due to their size and the bandwidth limitations of typical networks. To address this challenge, we propose a novel approach that enables efficient remote access to large datasets distributed across multiple facilities. Our method enables technologies to download only the data values of a select variable, in a select region of interest, to a user-defined accuracy. For this purpose, we extended the ADIOS IO library with read functions that accept a user-defined accuracy and with a remote data server that understands multidimensional selections of specific variables, steps, and accuracy from an ADIOS dataset and uses lossy compression at the remote site to reduce the data transferred back to the client. In addition, our extension of the ADIOS library collects metadata from multiple datasets in small files called Campaign Archives, which can be shared among project participants on any HPC system, cloud, or laptop, and which facilitate the discovery of content and pointers to the data location, as well as remote access to the data by local tools as if the data were local. This feature, called Campaign Management, enables a group of scientists to manage related datasets stored in multiple files across multiple facilities as if they were in a single file/database. We demonstrate the effectiveness of our approach using a 1.5 TB dataset from the S3D combustion simulation on Frontier at the Oak Ridge Leadership Computing Facility. Even a single variable from this dataset, at 64 GB, is too large to be processed on a standard laptop. We show two different reading patterns, for 2D plots and for 3D visualization, with the careful settings that a scientist studying combustion data would choose, and show that running the same Python scripts directly on Frontier takes time comparable to running them on a local laptop with remote access to the data on Frontier.
KW - Efficient data sharing at extreme-scale
KW - Near real-time data analysis
UR - https://www.scopus.com/pages/publications/105012245157
UR - https://www.scopus.com/pages/publications/105012245157#tab=citedBy
U2 - 10.1145/3718350.3727199
DO - 10.1145/3718350.3727199
M3 - Conference contribution
AN - SCOPUS:105012245157
T3 - Proceedings of 2025 Supercomputing Asia Conference, SCA 2025
SP - 91
EP - 95
BT - Proceedings of 2025 Supercomputing Asia Conference, SCA 2025
PB - Association for Computing Machinery, Inc
Y2 - 10 March 2025 through 13 March 2025
ER -