Comparative Study of KNN and LR Approaches of Machine Learning with Respect to the Identification of Phishing Websites

Authors

  • Sachin Kadam, Institute of Management and Entrepreneurship Development, Bharati Vidyapeeth (Deemed to be University), Pune (India)
  • Nidhi, Bharati Vidyapeeth’s Institute of Management & Information Technology, Navi Mumbai (India)
  • Pratibha Deshmukh, Bharati Vidyapeeth’s Institute of Management & Information Technology, Navi Mumbai (India)
  • Nidhi Khare, Symbiosis Skills and Professional University, Pune (India)
  • Irfan Khatik, Fergusson College, Pune (India)

Keywords:

KNN model ML, LR model ML, Phishing and Non-Phishing Websites identification using KNN, Phishing and Non-Phishing Websites identification using LR

Abstract

With the advent of the Internet and the growth of information and communication technology, phishing attacks have become a very common means of obtaining users' personal or confidential information. These attacks are carried out through email, websites, instant messaging services and similar channels, and they are considered one of the major threats to organizations. A phishing message fools the victim by pretending to come from the genuine sender and asking them to share personal and confidential information, so it is important for an individual to check whether a message has been received from a trusted sender. Many techniques are used to detect phishing websites. In this paper, two machine learning classification algorithms, K-Nearest Neighbors (KNN) and Logistic Regression (LR), are applied to a dataset of phishing and non-phishing website URLs. The performance of KNN and LR is compared using the classification report, accuracy, precision, confusion matrix, sensitivity, F-score and execution time. The major objective of this paper is to use key features to detect phishing websites with higher accuracy and a lower error rate.
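The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three URL features, the toy data values, the hand-written `knn_predict`, `lr_fit`, `lr_predict` and `report` helpers, and the choices `k=3`, `rate=0.5` and `epochs=3000` are all assumptions made for illustration. It only shows how two classifiers can be scored with a confusion matrix, accuracy, precision, sensitivity, F-score and execution time.

```python
import math
import time
from collections import Counter

# Hypothetical toy data, invented for illustration only. Each row holds three
# assumed URL features scaled to about [0, 1]:
# (url_length / 100, dot_count / 10, contains_at_symbol). Label 1 = phishing.
X_TRAIN = [(0.20, 0.1, 0), (0.25, 0.2, 0), (0.30, 0.1, 0), (0.22, 0.2, 0),
           (0.80, 0.5, 1), (0.75, 0.4, 0), (0.90, 0.6, 1), (0.70, 0.5, 0)]
Y_TRAIN = [0, 0, 0, 0, 1, 1, 1, 1]
X_TEST = [(0.24, 0.1, 0), (0.85, 0.5, 1), (0.28, 0.2, 0), (0.72, 0.4, 0)]
Y_TEST = [0, 1, 0, 1]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def lr_fit(xs, ys, rate=0.5, epochs=3000):
    """Fit logistic regression by batch gradient descent; w[0] is the bias."""
    w = [0.0] * (len(xs[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad[0] += err
            for j, xj in enumerate(x, start=1):
                grad[j] += err * xj
        w = [wj - rate * gj / len(xs) for wj, gj in zip(w, grad)]
    return w


def lr_predict(w, x):
    """Predict 1 when the linear score is non-negative (probability >= 0.5)."""
    return int(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) >= 0)


def report(y_true, y_pred):
    """Confusion-matrix counts plus accuracy, precision, sensitivity and F-score."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * sensitivity / (precision + sensitivity)
               if precision + sensitivity else 0.0)
    return {"confusion": [[tn, fp], [fn, tp]], "accuracy": (tp + tn) / len(pairs),
            "precision": precision, "sensitivity": sensitivity, "f_score": f_score}


# Time each model separately: KNN has no training step, LR trains then predicts.
start = time.perf_counter()
knn_pred = [knn_predict(X_TRAIN, Y_TRAIN, x) for x in X_TEST]
knn_seconds = time.perf_counter() - start

start = time.perf_counter()
weights = lr_fit(X_TRAIN, Y_TRAIN)
lr_pred = [lr_predict(weights, x) for x in X_TEST]
lr_seconds = time.perf_counter() - start

for name, pred, secs in [("KNN", knn_pred, knn_seconds), ("LR", lr_pred, lr_seconds)]:
    print(name, report(Y_TEST, pred), f"time={secs:.4f}s")
```

On a real URL dataset, a library such as scikit-learn would normally replace the hand-written models, and the features would be extracted from the URLs rather than typed in. The toy data here is cleanly separable, so its scores say nothing about the results reported in the paper.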

Published

20.10.2023

How to Cite

Kadam, S., Nidhi, N., Deshmukh, P., Khare, N., & Khatik, I. (2023). Comparative Study of KNN and LR Approaches of Machine Learning with Respect to the Identification of Phishing Websites. International Journal of Intelligent Systems and Applications in Engineering, 12(2s), 650–656. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3686

Section

Research Article