Machine Learning-Based Phishing Detection System

Authors

  • Ahd Al-qasmi, Aseel Al-anazi, Lujain AL-shehri, Shoug A-lshaman, Wiam Al-atawi, Onytra Abbass

Keywords

Feature Extraction, Machine Learning, Mendeley Dataset 2020, Phishing Detection, Random Forest

Abstract

Phishing attacks remain one of the most significant threats to cybersecurity, exploiting human weaknesses to illicitly obtain sensitive information such as credit card numbers, personal data, and passwords. These attacks are generally carried out through misleading emails or websites that mimic legitimate sources, and they can have severe consequences, including financial losses, identity theft, and organizational data breaches. To address this growing concern, we developed a phishing detection system based on a random forest (RF) model. The model was trained on the Mendeley Dataset 2020 and detects phishing attempts accurately. By analyzing critical features of a site's URL, the system distinguishes legitimate sites from malicious ones. In our evaluation, the model achieved 99.4% accuracy, which makes it a reliable tool for phishing detection. We integrated the system into a Chrome browser extension, enabling real-time detection and improving user protection. The paper highlights the potential of machine learning in cybersecurity and identifies opportunities for future research to improve phishing detection through more advanced ML techniques and larger, more diverse datasets.
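
To make the URL-analysis step concrete, the following is a minimal sketch of how lexical features might be extracted from a URL before being passed to a classifier such as a random forest. It is an illustration only, not the feature set used in the paper: the function name `extract_url_features` and every feature name in it are assumptions chosen for this example.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features of a URL.

    The feature names are hypothetical and chosen for illustration;
    they are not the features used in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # Check whether the host is a raw IP address rather than a domain name.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "dot_count": url.count("."),
        "hyphen_count": url.count("-"),
        "at_count": url.count("@"),
        "slash_count": url.count("/"),
        "host_length": len(host),
        "host_is_ip": int(host_is_ip),
        "uses_https": int(parsed.scheme == "https"),
    }


if __name__ == "__main__":
    # Both URLs are made-up examples using reserved documentation addresses.
    for example in ("https://example.com/login",
                    "http://192.0.2.10/secure-update/@verify"):
        print(example, extract_url_features(example))
```

Each URL becomes a fixed-length numeric vector, which is the form a tree-based classifier expects. A real system would likely use many more features than the handful shown here.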

Published

12.06.2024

How to Cite

Ahd Al-qasmi. (2024). Machine Learning-Based Phishing Detection System. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 4367–4372. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7071

Section

Research Article