Systematization of Knowledge (SoK): A Systematic Review of Software-Based Web Phishing Detection

Zuochao Dou, Issa Khalil, Abdallah Khreishah, Ala Al-Fuqaha, Mohsen Guizani

Research output: Contribution to journal › Review article › peer-review

77 Scopus citations


Phishing is a form of cyber attack that leverages social engineering approaches and other sophisticated techniques to harvest personal information from users of websites. The average annual growth rate of the number of unique phishing websites detected by the Anti-Phishing Working Group has been 36.29% over the past six years and 97.36% over the past two years. In the wake of this rise, mitigating phishing attacks has received growing interest from the cyber security community. Extensive research and development have been conducted to detect phishing attempts based on their unique content, network, and URL characteristics. Existing approaches differ significantly in their intuitions, data analysis methods, and evaluation methodologies. This warrants a careful systematization so that the advantages and limitations of each approach, as well as its applicability in different contexts, can be analyzed and contrasted in a rigorous and principled way. This paper presents a systematic study of phishing detection schemes, especially software-based ones. Starting from the phishing detection taxonomy, we study evaluation datasets, detection features, detection techniques, and evaluation metrics. Finally, we provide insights that we believe will help guide the development of more effective and efficient phishing detection schemes.

Original language: English (US)
Article number: 8036198
Pages (from-to): 2797-2819
Number of pages: 23
Journal: IEEE Communications Surveys and Tutorials
Issue number: 4
State: Published - Oct 1 2017

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering


Keywords

  • Phishing
  • Phishing website detection
  • Software-based methods

