TY - JOUR
T1 - Systematization of Knowledge (SoK): A Systematic Review of Software-Based Web Phishing Detection
AU - Dou, Zuochao
AU - Khalil, Issa
AU - Khreishah, Abdallah
AU - Al-Fuqaha, Ala
AU - Guizani, Mohsen
N1 - Publisher Copyright: © 1998-2012 IEEE.
PY - 2017/10/1
Y1 - 2017/10/1
N2 - Phishing is a form of cyber attack that leverages social engineering and other sophisticated techniques to harvest personal information from website users. The number of unique phishing websites detected by the Anti-Phishing Working Group has grown at an average annual rate of 36.29% over the past six years and 97.36% over the past two years. In the wake of this rise, mitigating phishing attacks has received growing interest from the cyber security community. Extensive research and development have been conducted to detect phishing attempts based on their unique content, network, and URL characteristics. Existing approaches differ significantly in their intuitions, data analysis methods, and evaluation methodologies. This warrants a careful systematization so that the advantages and limitations of each approach, as well as its applicability in different contexts, can be analyzed and contrasted in a rigorous and principled way. This paper presents a systematic study of phishing detection schemes, especially software-based ones. Starting from the phishing detection taxonomy, we study evaluation datasets, detection features, detection techniques, and evaluation metrics. Finally, we provide insights that we believe will help guide the development of more effective and efficient phishing detection schemes.
KW - Phishing
KW - Phishing website detection
KW - Software-based methods
UR - http://www.scopus.com/inward/record.url?scp=85030253690&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85030253690&partnerID=8YFLogxK
U2 - 10.1109/COMST.2017.2752087
DO - 10.1109/COMST.2017.2752087
M3 - Review article
AN - SCOPUS:85030253690
SN - 1553-877X
VL - 19
SP - 2797
EP - 2819
JO - IEEE Communications Surveys and Tutorials
JF - IEEE Communications Surveys and Tutorials
IS - 4
M1 - 8036198
ER -