Phishing Website Detection Using Advanced Machine Learning Techniques

Authors

  • Nitin N. Sakhare, BRACT’s Vishwakarma Institute of Information Technology, Pune
  • Jyoti L. Bangare, MKSSS’s Cummins College of Engineering, Pune
  • Radhika G. Purandare, BRACT’s Vishwakarma Institute of Information Technology, Pune
  • Disha S. Wankhede, BRACT’s Vishwakarma Institute of Information Technology, Pune
  • Pooja Dehankar, Assistant Professor, School of Engineering, Ajeenkya D. Y. Patil University, Pune

Keywords

Phishing Detection, Artificial Intelligence (AI), URL Analysis, XGBoost, LightGBM, Naïve Bayes, CatBoost, Graph Neural Network (GNN), Feature Extraction, Real-time Monitoring

Abstract

Phishing attacks are a growing threat that demands timely detection and mitigation. This paper applies Artificial Intelligence (AI) and Machine Learning (ML) to the detection of phishing attempts. The approach combines the XGBoost, LightGBM, Naïve Bayes and CatBoost algorithms with a Graph Neural Network (GNN) to analyse URL structures, content patterns, and user behaviour. Features extracted for model training include URL length and the number of dots, slashes, digits, and special characters in the URL. Real-time monitoring lets the system adapt continually to emerging phishing tactics, so that it can proactively protect users and organizations as threats evolve. The work thus explores a range of machine learning methods for strengthening online security against phishing.
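The URL features named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `extract_url_features`, the feature keys, and the definition of "special characters" (any ASCII punctuation other than dots and slashes) are assumptions made for this example, and the sample URL is hypothetical.

```python
import string

# Assumption: "special characters" means ASCII punctuation other than
# dots and slashes, which are counted as separate features.
SPECIAL_CHARS = set(string.punctuation) - {".", "/"}


def extract_url_features(url: str) -> dict:
    """Return simple lexical counts for a single URL string."""
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_slashes": url.count("/"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special": sum(ch in SPECIAL_CHARS for ch in url),
    }


# Hypothetical example URL, used only to show the shape of the output.
features = extract_url_features("http://secure-login.example.com/verify?id=123")
print(features)
```

A feature dictionary of this kind could be turned into one row of a numeric table and passed to any of the classifiers the abstract lists; how the paper actually encodes and combines its features is not stated here.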

Published

12.01.2024

How to Cite

Sakhare, N. N., Bangare, J. L., Purandare, R. G., Wankhede, D. S., & Dehankar, P. (2024). Phishing Website Detection Using Advanced Machine Learning Techniques. International Journal of Intelligent Systems and Applications in Engineering, 12(12s), 329 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4519

Section

Research Article
