Managing variability in the IO performance of petascale storage systems

Jay Lofstead, Fang Zheng, Qing Liu, Scott Klasky, Ron Oldfield, Todd Kordenbrock, Karsten Schwan, Matthew Wolf

Research output: Chapter in Book/Report/Conference proceedingConference contribution

132 Scopus citations

Abstract

Significant challenges exist for achieving peak or even consistent levels of performance when using IO systems at scale. They stem from sharing IO system resources across the processes of single large-scale applications and/or multiple simultaneous programs causing internal and external interference, which in turn, causes substantial reductions in IO performance. This paper presents interference effects measurements for two different file systems at multiple supercomputing sites. These measurements motivate developing a'managed' IO approach using adaptive algorithms varying the IO system workload based on current levels and use areas. An implementation of these methods deployed for the shared, general scratch storage system on Oak Ridge National Laboratory machines achieves higher overall performance and less variability in both a typical usage environment and with artificially introduced levels of'noise'. The latter serving to clearly delineate and illustrate potential problems arising from shared system usage and the advantages derived from actively managing it.

Original languageEnglish (US)
Title of host publication2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
DOIs
StatePublished - 2010
Externally publishedYes
Event2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010 - New Orleans, LA, United States
Duration: Nov 13 2010Nov 19 2010

Publication series

Name2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010

Other

Other2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
Country/TerritoryUnited States
CityNew Orleans, LA
Period11/13/1011/19/10

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Managing variability in the IO performance of petascale storage systems'. Together they form a unique fingerprint.

Cite this