Runtime I/O re-routing + throttling on HPC storage

Qing Liu, Norbert Podhorszki, Jeremy Logan, Scott Klasky

Research output: Contribution to conference › Paper › peer-review

22 Scopus citations

Abstract

Massively parallel storage systems are becoming increasingly prevalent on HPC systems due to the emergence of a new generation of data-intensive applications. To achieve the level of I/O throughput and capacity demanded by data-intensive applications, storage systems typically deploy a large number of storage devices (also known as LUNs or data stores). This allows parallel applications to access storage concurrently, so the aggregate I/O throughput scales linearly with the number of storage devices, reducing an application's end-to-end time. On a production system where storage devices are shared among multiple applications, contention is often a major problem that leads to a significant reduction in I/O throughput. In this paper, we describe our efforts to resolve this issue in the context of HPC using a balanced re-routing + throttling approach. The proposed scheme re-routes I/O requests to a less congested storage location in a controlled manner, so that write performance is improved while limiting the impact on reads.
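
The abstract only sketches the mechanism, so the toy Python snippet below illustrates one way such a balanced re-routing + throttling policy could look; it is not the authors' implementation, and the names (ReRouter, choose_lun) and the normalized per-LUN load metric are hypothetical. Writes destined for a congested LUN are redirected to the least loaded LUN, but only for a bounded fraction of requests, which limits the disturbance to readers of the re-route targets.

    # Illustrative sketch only, not the paper's implementation. Assumes a
    # hypothetical monitor supplying a normalized load value per LUN.
    import random

    class ReRouter:
        def __init__(self, lun_ids, congestion_threshold=0.75, reroute_fraction=0.3):
            # congestion_threshold: relative load above which a LUN counts as congested
            # reroute_fraction: cap on the share of writes moved away from a
            # congested LUN (the "throttling" part of the scheme)
            self.lun_ids = list(lun_ids)
            self.congestion_threshold = congestion_threshold
            self.reroute_fraction = reroute_fraction

        def choose_lun(self, intended_lun, load):
            """Pick the LUN a write should go to.

            `load` maps LUN id -> load in [0, 1] (hypothetical metric, e.g.
            queue depth divided by its observed maximum).
            """
            if load[intended_lun] < self.congestion_threshold:
                return intended_lun              # no contention: keep original routing
            if random.random() > self.reroute_fraction:
                return intended_lun              # throttle: only re-route a bounded share
            # Re-route to the currently least loaded LUN.
            return min(self.lun_ids, key=lambda l: load[l])

    # Example: LUN 0 is congested, so roughly 30% of its writes shift to LUN 2.
    router = ReRouter(lun_ids=[0, 1, 2])
    load = {0: 0.9, 1: 0.5, 2: 0.2}
    targets = [router.choose_lun(0, load) for _ in range(1000)]
    print("re-routed:", sum(t != 0 for t in targets), "of 1000 writes")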

Original language: English (US)
State: Published - 2013
Externally published: Yes
Event: 5th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2013 - San Jose, United States
Duration: Jun 27 2013 - Jun 28 2013

Conference

Conference: 5th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2013
Country/Territory: United States
City: San Jose
Period: 6/27/13 - 6/28/13

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Software
