Collaborative Research: Elements: ProDM: Developing A Unified Progressive Data Management Library for Exascale Computational Science

Project: Research project

Project Details

Description

Effective management of scientific data produced by extreme-scale simulations and instruments is crucial for advancing scientific discoveries. Due to the scale of data and the diverse requirements of scientific analytics, there is a growing need to manage data in a progressive manner, such that users can stream only as much data as they need to carry out their data analytics, reducing data movement and computation. However, little effort has been put into creating robust and scalable cyberinfrastructure services that link the recent algorithmic innovations in progressive methods with scientific data analytics, leaving these capabilities inaccessible to scientists. This project aims to develop a sustainable framework, ProDM, that supports the progressive management of scientific data to facilitate its use in scientific applications. The success of this project will enable new scientific research and novel findings by providing a new way to manage and analyze data. Furthermore, outcomes of this project will be delivered as publicly available software to enhance research cyberinfrastructure, promote education and teaching, and broaden participation in computing.

ProDM is centered upon the unification of viable progressive representations and tailored development for in-situ and post-hoc analytic routines. In particular, it involves three key activities. First, a data engine will be built to unify state-of-the-art progressive representations and to provide portable hardware support for accelerators as well as interoperable software interfaces to other data management and analytic libraries. Second, an in-situ engine will be developed to facilitate the use of progressive representations for in-situ data analytics, which includes a redesign of in-situ semantics and adjustment of runtime dynamics. Third, a post-hoc engine will be developed to efficiently access progressive data and improve the performance of data retrieval for post-hoc data analytics. ProDM will be deployed on campus-wide computing infrastructures and leadership systems for integration and evaluation with real-world scientific applications from climate, fusion, molecular dynamics, and beyond.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Status: Active
Effective start/end date: 8/1/23 to 7/31/26

Funding

  • National Science Foundation: $179,471.00
