TY - GEN
T1 - Managing variability in the IO performance of petascale storage systems
AU - Lofstead, Jay
AU - Zheng, Fang
AU - Liu, Qing
AU - Klasky, Scott
AU - Oldfield, Ron
AU - Kordenbrock, Todd
AU - Schwan, Karsten
AU - Wolf, Matthew
N1 - Funding Information:
The work was supported in part by the National Science Foundation through grants NSF ACI-0102537 and NSF CCF-0444345, and by the Director, Office of Science, Division of Mathematical, Information, and Computational Sciences of the U.S. Department of Energy under contract number DE-AC03-76SF00098.
PY - 2010
Y1 - 2010
N2 - Significant challenges exist for achieving peak or even consistent levels of performance when using IO systems at scale. They stem from sharing IO system resources across the processes of single large-scale applications and/or multiple simultaneous programs, causing internal and external interference, which, in turn, causes substantial reductions in IO performance. This paper presents measurements of interference effects for two different file systems at multiple supercomputing sites. These measurements motivate developing a 'managed' IO approach using adaptive algorithms that vary the IO system workload based on current levels and areas of use. An implementation of these methods deployed for the shared, general scratch storage system on Oak Ridge National Laboratory machines achieves higher overall performance and less variability both in a typical usage environment and with artificially introduced levels of 'noise', the latter serving to clearly delineate and illustrate potential problems arising from shared system usage and the advantages derived from actively managing it.
AB - Significant challenges exist for achieving peak or even consistent levels of performance when using IO systems at scale. They stem from sharing IO system resources across the processes of single large-scale applications and/or multiple simultaneous programs, causing internal and external interference, which, in turn, causes substantial reductions in IO performance. This paper presents measurements of interference effects for two different file systems at multiple supercomputing sites. These measurements motivate developing a 'managed' IO approach using adaptive algorithms that vary the IO system workload based on current levels and areas of use. An implementation of these methods deployed for the shared, general scratch storage system on Oak Ridge National Laboratory machines achieves higher overall performance and less variability both in a typical usage environment and with artificially introduced levels of 'noise', the latter serving to clearly delineate and illustrate potential problems arising from shared system usage and the advantages derived from actively managing it.
UR - http://www.scopus.com/inward/record.url?scp=78650807854&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650807854&partnerID=8YFLogxK
U2 - 10.1109/SC.2010.32
DO - 10.1109/SC.2010.32
M3 - Conference contribution
AN - SCOPUS:78650807854
SN - 9781424475575
T3 - 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
BT - 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
T2 - 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
Y2 - 13 November 2010 through 19 November 2010
ER -