TY - GEN
T1 - SGKD
T2 - 22nd IEEE International Conference on Data Mining Workshops, ICDMW 2022
AU - He, Yufei
AU - Ma, Yao
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalability. Knowledge Distillation (KD), a classical method for model compression and acceleration, has gradually been introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs are able to achieve performance comparable to their corresponding GNN teachers while being significantly more efficient in terms of both space and time. However, research on KD for graph learning is still in its early stages, and several limitations remain in existing KD frameworks. The major issues are that distilled MLPs lack useful information about the graph structure and that the teacher's logits are not always reliable. In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to incorporate the graph structure, we use feature propagation as a preprocessing step to provide MLPs with graph structure-aware features in the original feature space; to address unreliable teacher logits, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework becomes more effective while remaining scalable and efficient in training and inference. We conducted comprehensive experiments on eight datasets of different sizes (up to 100 million nodes) under various settings. The results demonstrated that SGKD significantly outperforms existing KD methods and even achieves performance comparable to its state-of-the-art GNN teachers.
AB - As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalability. Knowledge Distillation (KD), a classical method for model compression and acceleration, has gradually been introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs are able to achieve performance comparable to their corresponding GNN teachers while being significantly more efficient in terms of both space and time. However, research on KD for graph learning is still in its early stages, and several limitations remain in existing KD frameworks. The major issues are that distilled MLPs lack useful information about the graph structure and that the teacher's logits are not always reliable. In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to incorporate the graph structure, we use feature propagation as a preprocessing step to provide MLPs with graph structure-aware features in the original feature space; to address unreliable teacher logits, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework becomes more effective while remaining scalable and efficient in training and inference. We conducted comprehensive experiments on eight datasets of different sizes (up to 100 million nodes) under various settings. The results demonstrated that SGKD significantly outperforms existing KD methods and even achieves performance comparable to its state-of-the-art GNN teachers.
KW - efficient training and inference
KW - graph neural networks
KW - knowledge distillation
UR - http://www.scopus.com/inward/record.url?scp=85148431464&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148431464&partnerID=8YFLogxK
U2 - 10.1109/ICDMW58026.2022.00091
DO - 10.1109/ICDMW58026.2022.00091
M3 - Conference contribution
AN - SCOPUS:85148431464
T3 - IEEE International Conference on Data Mining Workshops, ICDMW
SP - 666
EP - 673
BT - Proceedings - 22nd IEEE International Conference on Data Mining Workshops, ICDMW 2022
A2 - Candan, K. Selcuk
A2 - Dinh, Thang N.
A2 - Thai, My T.
A2 - Washio, Takashi
PB - IEEE Computer Society
Y2 - 28 November 2022 through 1 December 2022
ER -