A view from ORNL: Scientific data research opportunities in the big data age

Scott Klasky, Matthew Wolf, Mark Ainsworth, Chuck Atkins, Jong Choi, Greg Eisenhauer, Berk Geveci, William Godoy, Mark Kim, James Kress, Tahsin Kurc, Qing Liu, Jeremy Logan, Arthur B. Maccabe, Kshitij Mehta, George Ostrouchov, Manish Parashar, Norbert Podhorszki, David Pugmire, Eric Suchyta, Lipeng Wan, Ruonan Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Scopus citations

Abstract

One of the core issues across computer and computational science today is adapting to, managing, and learning from the influx of 'Big Data'. In the commercial space, this problem has led to a huge investment in new technologies and capabilities that are well adapted to dealing with the sorts of human-generated logs, videos, texts, and other large-data artifacts that are processed, and it has resulted in an explosion of useful platforms and languages (Hadoop, Spark, Pandas, etc.). However, translating this work from the enterprise space to the computational science and HPC community has proven somewhat difficult, in part because of fundamental differences in the type and scale of the data and in the timescales surrounding its generation and use. We describe a forward-looking research and development plan that centers on the concept of making Input/Output (I/O) intelligent for users in the scientific community, whether they are accessing scalable storage or performing in situ workflow tasks. Much of our work is based on our experience with the Adaptable I/O System (ADIOS 1.X) and with the next-generation version of the software, ADIOS 2.X [1].

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE 38th International Conference on Distributed Computing Systems, ICDCS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1357-1368
Number of pages: 12
ISBN (Electronic): 9781538668719
State: Published - Jul 19 2018
Event: 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018 - Vienna, Austria
Duration: Jul 2 2018 to Jul 5 2018

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
Volume: 2018-July

Other

Other: 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018
Country/Territory: Austria
City: Vienna
Period: 7/2/18 to 7/5/18

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Keywords

  • High Performance Computing
  • High Performance I/O
  • In Situ Visualization
  • Publish/Subscribe
