TY - JOUR
T1 - An effective convolutional neural network based on SMOTE and Gaussian mixture model for intrusion detection in imbalanced dataset
AU - Zhang, Hongpo
AU - Huang, Lulu
AU - Wu, Chase Q.
AU - Li, Zhanbo
N1 - Funding Information:
The research is partly supported by the Integration of Cloud Computing and Big Data, Innovation of Science and Education (grant number 2017A11017), the Key Research, Development, and Dissemination Program of Henan Province (Science and Technology for the People) (grant number 182207310002), and the Key Science and Technology Project of Xinjiang Production and Construction Corps (grant number 2018AB017).
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/8/4
Y1 - 2020/8/4
N2 - Network Intrusion Detection System (NIDS) is a key security device in modern networks to detect malicious activities. However, the problem of imbalanced class associated with intrusion detection dataset limits the classifier's performance for minority classes. To improve the detection rate of minority classes while ensuring efficiency, we propose a novel class imbalance processing technology for large-scale dataset, referred to as SGM, which combines Synthetic Minority Over-Sampling Technique (SMOTE) and under-sampling for clustering based on Gaussian Mixture Model (GMM). We then design a flow-based intrusion detection model, SGM-CNN, which integrates imbalanced class processing with convolutional neural network, and investigate the impact of different numbers of convolution kernels and different learning rates on model performance. The advantages of the proposed model are verified using the UNSW-NB15 and CICIDS2017 datasets. The experimental results show that i) for binary classification and multiclass classification on the UNSW-NB15 dataset, SGM-CNN achieves a detection rate of 99.74% and 96.54%, respectively; ii) for 15-class classification on the CICIDS2017 dataset, it achieves a detection rate of 99.85%. We compare five imbalanced processing methods and two classification algorithms, and conclude that SGM-CNN provides an effective solution to imbalanced intrusion detection and outperforms the state-of-the-art intrusion detection methods.
AB - Network Intrusion Detection System (NIDS) is a key security device in modern networks to detect malicious activities. However, the problem of imbalanced class associated with intrusion detection dataset limits the classifier's performance for minority classes. To improve the detection rate of minority classes while ensuring efficiency, we propose a novel class imbalance processing technology for large-scale dataset, referred to as SGM, which combines Synthetic Minority Over-Sampling Technique (SMOTE) and under-sampling for clustering based on Gaussian Mixture Model (GMM). We then design a flow-based intrusion detection model, SGM-CNN, which integrates imbalanced class processing with convolutional neural network, and investigate the impact of different numbers of convolution kernels and different learning rates on model performance. The advantages of the proposed model are verified using the UNSW-NB15 and CICIDS2017 datasets. The experimental results show that i) for binary classification and multiclass classification on the UNSW-NB15 dataset, SGM-CNN achieves a detection rate of 99.74% and 96.54%, respectively; ii) for 15-class classification on the CICIDS2017 dataset, it achieves a detection rate of 99.85%. We compare five imbalanced processing methods and two classification algorithms, and conclude that SGM-CNN provides an effective solution to imbalanced intrusion detection and outperforms the state-of-the-art intrusion detection methods.
KW - Class imbalance
KW - Convolutional neural network
KW - Deep learning
KW - Gaussian mixture model
KW - Network intrusion detection
UR - http://www.scopus.com/inward/record.url?scp=85085920330&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085920330&partnerID=8YFLogxK
U2 - 10.1016/j.comnet.2020.107315
DO - 10.1016/j.comnet.2020.107315
M3 - Article
AN - SCOPUS:85085920330
SN - 1389-1286
VL - 177
JO - Computer Networks
JF - Computer Networks
M1 - 107315
ER -