Skill-Transferring Knowledge Distillation Method

Shunzhi Yang, Liuchi Xu, Mengchu Zhou, Xiong Yang, Jinfeng Yang, Zhenhua Huang

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

Knowledge distillation is a deep learning method that mimics the way humans teach, i.e., a teacher network is used to guide the training of a student network. Knowledge distillation can generate an efficient student network to facilitate deployment on resource-constrained edge computing devices. Existing studies have typically mined knowledge from a teacher network and transferred it to a student network. The student can only passively receive this knowledge and cannot learn how the teacher acquired it, which limits the student's performance improvement. Inspired by the old Chinese saying 'Give a man a fish and you feed him for a day; teach a man how to fish and you feed him for a lifetime,' this work proposes a Skill-transferring Knowledge Distillation (SKD) method to boost a student network's ability to create new valuable knowledge. SKD consists of two main meta-learning networks: Teacher Behavior Teaching and Teacher Experience Teaching. The former captures the process of a teacher network's learning behavior in the hidden layers and can predict the teacher network's subsequent behavior based on its previous behavior. The latter models the optimal empirical knowledge of a teacher network's output layer at each learning stage. With their help, a teacher network can provide a student network with its predicted subsequent behavior and its optimal empirical knowledge at the current stage. SKD's performance is verified through its application to multiple object recognition tasks and through comparison with state-of-the-art methods.
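
For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of the conventional soft-target distillation loss, in which a student is trained to match a teacher's temperature-softened output distribution alongside the ground-truth label. It is not the SKD method described above, and it does not model Teacher Behavior Teaching or Teacher Experience Teaching. The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Conventional knowledge-distillation loss for a single example.

    Hypothetical helper for illustration only (not the SKD objective):
      soft term = T^2 * KL(teacher_T || student_T)
      hard term = cross-entropy of the student against the true label
      loss      = alpha * soft term + (1 - alpha) * hard term
    Assumes moderate logit ranges so that no student probability underflows to 0.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(p_teacher, p_student) if t > 0.0)
    soft_term = (temperature ** 2) * kl
    hard_term = -math.log(softmax(student_logits)[label])
    return alpha * soft_term + (1.0 - alpha) * hard_term


# Made-up logits for a 3-class problem whose true class is 0.
teacher = [5.0, 1.0, -2.0]
student_close = [4.0, 1.5, -1.0]   # roughly agrees with the teacher
student_far = [-2.0, 1.0, 5.0]     # disagrees with the teacher

loss_close = distillation_loss(student_close, teacher, label=0)
loss_far = distillation_loss(student_far, teacher, label=0)
print(loss_close < loss_far)
```

In this formulation the student only receives the teacher's final outputs, which is the passive form of knowledge transfer that the abstract contrasts SKD against.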

Original language: English (US)
Pages (from-to): 6487-6502
Number of pages: 16
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 33
Issue number: 11
State: Published - Nov 1 2023

All Science Journal Classification (ASJC) codes

  • Media Technology
  • Electrical and Electronic Engineering

Keywords

  • Knowledge distillation
  • edge computing devices
  • human teaching experience
  • knowledge and skills
  • machine learning
  • meta-learning
  • object recognition
