Countering Machine-Learning Classification of Applications by Equalizing Network Traffic Statistics

Sina Fathi-Kazerooni, Roberto Rojas-Cessa

Research output: Contribution to journal, Article, peer-review

10 Scopus citations

Abstract

We propose to equalize primary statistical metrics to decrease the adversarial classification accuracy of network traffic classifiers that are based on supervised machine learning (ML) algorithms. We use packet count, inter-arrival times, and packet length as primary traffic metrics and estimate their effect on classification accuracy when used for application identification. In the past, Internet traffic classification has been used for website fingerprinting and for identifying application-layer protocols. Here, we consider the identification of six popular but distinct user applications (e.g., Skype). We evaluate the performance of many ML algorithms in identifying user applications from intact traffic collected on an actual production campus network and show that these ML algorithms classify the considered applications with an average precision and recall of up to 0.96. We apply the three proposed equalizing methods and show that the modified statistics decrease the average precision and recall of the considered traffic classifiers to 0.56. We discuss the efficacy of each of the proposed methods through exhaustive evaluations on actual network traffic, as well as the effect of implementing these methods on other traffic properties.
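
As a rough illustration of what equalizing packet length and inter-arrival time could look like, the following minimal Python sketch rewrites a recorded flow so that every packet has the same length and the same spacing. It is not the authors' method, which the abstract does not specify; the names `Packet`, `equalize_flow`, `target_length`, and `target_interval` are hypothetical, and the transformation is applied offline to a trace rather than as a causal traffic shaper.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    timestamp: float  # seconds since the start of the capture
    length: int       # payload size in bytes


def equalize_flow(packets: List[Packet],
                  target_length: int,
                  target_interval: float) -> List[Packet]:
    """Return a flow whose packets all share one length and one spacing.

    Assumed (hypothetical) policy: packets larger than target_length are
    split into several fixed-size packets, smaller ones are padded, and
    the result is re-timed on a fixed grid starting at the first timestamp.
    """
    if target_length <= 0 or target_interval <= 0:
        raise ValueError("target_length and target_interval must be positive")
    if not packets:
        return []

    start = packets[0].timestamp
    equalized: List[Packet] = []
    for p in packets:
        # Number of fixed-size packets needed to carry this payload
        # (at least one, so empty packets still occupy a slot).
        pieces = max(1, math.ceil(p.length / target_length))
        for _ in range(pieces):
            slot = len(equalized)
            equalized.append(Packet(start + slot * target_interval,
                                    target_length))
    return equalized


# Example trace: three packets with very different sizes and gaps.
flow = [Packet(0.0, 100), Packet(0.013, 1500), Packet(0.5, 40)]
shaped = equalize_flow(flow, target_length=600, target_interval=0.01)
lengths = {p.length for p in shaped}
gaps = [b.timestamp - a.timestamp for a, b in zip(shaped, shaped[1:])]
```

After this transformation, the per-packet length and inter-arrival statistics no longer vary between packets, at the cost of padding overhead and a changed packet count. Costs of this kind fall under what the abstract calls the effect on other traffic properties.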

Original language: English (US)
Pages (from-to): 3392-3403
Number of pages: 12
Journal: IEEE Transactions on Network Science and Engineering
Volume: 8
Issue number: 4
State: Published - 2021

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Computer Networks and Communications

Keywords

  • Internet packet classification
  • equalization of metrics
  • machine learning
  • online privacy
  • statistical traffic properties
