Countering Machine-Learning Classification of Applications by Equalizing Network Traffic Statistics

Sina Fathi-Kazerooni, Roberto Rojas-Cessa

Research output: Contribution to journal › Article › peer-review

Abstract

We propose to equalize primary statistical metrics to decrease adversarial classification accuracy of network traffic classifiers that are based on supervised machine learning (ML) algorithms. We use packet count, inter-arrival times, and packet length as primary traffic metrics and estimate their effect on classification accuracy when used for application identification. Internet traffic classification has been used in the past for website fingerprinting and for identifying application-layer protocols. Here, we consider the identification of six popular but different user applications (e.g., Skype and others). We evaluate the performance of many ML algorithms in the identification of user applications on intact traffic collected from an actual production campus network and show that these ML algorithms classify the considered applications with an average precision and recall of up to 0.96. We apply the three proposed equalizing methods and show that the modified statistics decrease the average precision and recall of the considered traffic classifiers to 0.56. We discuss the efficacy of each of the proposed methods through exhaustive evaluations on actual network traffic and the effect of the implementation of these methods on other traffic properties.
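The abstract names the three equalized metrics but does not say how the equalization is carried out. The Python sketch below is one possible reading, not the authors' implementation: it assumes packet lengths are equalized by splitting and padding to a fixed size, inter-arrival times by sending on a constant schedule, and packet count by appending dummy packets. The names `Packet` and `equalize_flow` and the parameters `target_length`, `target_gap`, and `target_count` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    timestamp: float  # seconds, relative to an arbitrary origin
    length: int       # bytes


def equalize_flow(
    packets: List[Packet],
    target_length: int = 1500,          # assumed fixed packet size
    target_gap: float = 0.01,           # assumed fixed inter-arrival time
    target_count: Optional[int] = None  # assumed minimum packet count
) -> List[Packet]:
    """Return a flow whose length and timing statistics are uniform.

    Hypothetical sketch: every output packet has the same length and
    the same inter-arrival time, so these features no longer differ
    between applications.
    """
    if target_length <= 0 or target_gap <= 0:
        raise ValueError("target_length and target_gap must be positive")

    # Packet length: a packet larger than target_length is split into
    # ceil(length / target_length) chunks; each chunk is assumed to be
    # padded up to target_length.
    n_out = 0
    for p in packets:
        n_out += max(1, -(-p.length // target_length))

    # Packet count: append dummy packets until the flow reaches
    # target_count. Real data is never dropped, so a flow that already
    # exceeds target_count keeps all of its packets.
    if target_count is not None:
        n_out = max(n_out, target_count)

    # Inter-arrival time: re-schedule every packet on a constant grid
    # starting at the first original timestamp.
    start = packets[0].timestamp if packets else 0.0
    return [Packet(start + i * target_gap, target_length) for i in range(n_out)]


flow = [Packet(0.0, 100), Packet(0.5, 3000), Packet(0.7, 1500)]
shaped = equalize_flow(flow, target_length=1500, target_gap=0.01, target_count=6)
```

In this sketch the cost of equalization shows up as extra bytes (padding and dummy packets) and altered delay, which is the kind of side effect on other traffic properties that the abstract says the paper evaluates.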

Original language: English (US)
Journal: IEEE Transactions on Network Science and Engineering
State: Accepted/In press - 2021

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Computer Networks and Communications

Keywords

  • Equalization of Metrics
  • Feature extraction
  • Internet
  • Internet packet classification
  • Machine Learning
  • Measurement
  • Online Privacy
  • Payloads
  • Ports (computers)
  • Privacy
  • Statistical Traffic Properties
  • Testing
