Exploring Transfer Learning to Reduce Training Overhead of HPC Data in Machine Learning

Tong Liu, Shakeel Alibhai, Jinzhen Wang, Qing Liu, Xubin He, Chentao Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

Scientific simulations on high-performance computing (HPC) systems can generate large amounts of data (on the scale of terabytes or petabytes) per run. When machine learning applications process data at this scale, the training overhead is significant. Training a neural network typically takes several hours, if not longer; on HPC scientific data, training can take several days or even weeks. Transfer learning, an optimization usually used to save training time or achieve better performance, has the potential to reduce this large training overhead. In this paper, we apply transfer learning to a machine learning HPC application. We find that transfer learning can reduce training time without, in most cases, significantly increasing the error. This indicates that transfer learning can be very useful for working with HPC datasets in machine learning applications.

Original language: English (US)
Title of host publication: 2019 IEEE International Conference on Networking, Architecture and Storage, NAS 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728144092
DOIs
State: Published - Aug 2019
Event: 14th IEEE International Conference on Networking, Architecture and Storage, NAS 2019 - Enshi, China
Duration: Aug 15 2019 → Aug 17 2019

Publication series

Name: 2019 IEEE International Conference on Networking, Architecture and Storage, NAS 2019 - Proceedings

Conference

Conference: 14th IEEE International Conference on Networking, Architecture and Storage, NAS 2019
Country/Territory: China
City: Enshi
Period: 8/15/19 → 8/17/19

All Science Journal Classification (ASJC) codes

  • Information Systems and Management
  • Computer Networks and Communications
  • Hardware and Architecture

Keywords

  • HPC data
  • machine learning
  • transfer learning
