TY - GEN
T1 - Exploring Transfer Learning to Reduce Training Overhead of HPC Data in Machine Learning
AU - Liu, Tong
AU - Alibhai, Shakeel
AU - Wang, Jinzhen
AU - Liu, Qing
AU - He, Xubin
AU - Wu, Chentao
PY - 2019/8
Y1 - 2019/8
N2 - Scientific simulations on high-performance computing (HPC) systems can generate large amounts of data (on the scale of terabytes or petabytes) per run. When this volume of HPC data is processed by machine learning applications, the training overhead is significant. Typically, training a neural network can take several hours, if not longer; when machine learning is applied to HPC scientific data, training can take days or even weeks. Transfer learning, an optimization usually used to save training time or achieve better performance, has the potential to reduce this large training overhead. In this paper, we apply transfer learning to a machine learning application for HPC data. We find that transfer learning can reduce training time without, in most cases, significantly increasing the error. This indicates that transfer learning can be very useful for working with HPC datasets in machine learning applications.
AB - Scientific simulations on high-performance computing (HPC) systems can generate large amounts of data (on the scale of terabytes or petabytes) per run. When this volume of HPC data is processed by machine learning applications, the training overhead is significant. Typically, training a neural network can take several hours, if not longer; when machine learning is applied to HPC scientific data, training can take days or even weeks. Transfer learning, an optimization usually used to save training time or achieve better performance, has the potential to reduce this large training overhead. In this paper, we apply transfer learning to a machine learning application for HPC data. We find that transfer learning can reduce training time without, in most cases, significantly increasing the error. This indicates that transfer learning can be very useful for working with HPC datasets in machine learning applications.
KW - HPC data
KW - machine learning
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85073220149&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073220149&partnerID=8YFLogxK
U2 - 10.1109/NAS.2019.8834723
DO - 10.1109/NAS.2019.8834723
M3 - Conference contribution
T3 - 2019 IEEE International Conference on Networking, Architecture and Storage, NAS 2019 - Proceedings
BT - 2019 IEEE International Conference on Networking, Architecture and Storage, NAS 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th IEEE International Conference on Networking, Architecture and Storage, NAS 2019
Y2 - 15 August 2019 through 17 August 2019
ER -