Phishing website analysis and detection using Machine Learning

Ameya Chawla

doi:10.18201/ijisae.2022.262

Authors

Ameya Chawla Guru Gobind Singh Indraprastha University https://orcid.org/0000-0002-9917-8807


Keywords:

Cybersecurity, Phishing, K-Nearest Neighbour, Support Vector Machine, Artificial Neural Network, Decision Tree, Random Forest, Logistic Regression, Max Vote Classifier

Abstract

Cybersecurity has become an essential part of the digital age. With more than 820 million internet users by the year 2022, security systems are needed to protect the public from phishing scams, which affect not only people's wealth but also their mental health, making them afraid to browse or use internet services. This motivates me to work on the problem and develop an efficient solution. The objective of this paper is to analyze common attributes shown by phishing websites and to develop a model that detects these websites. Several models were trained on the dataset: Random Forest Classifier, Decision Tree Classifier, Logistic Regression, K Nearest Neighbors, Artificial Neural Networks, and a Max Vote Classifier of Random Forest, Artificial Neural Networks and K Nearest Neighbors. The highest accuracy, 97.73%, was achieved by a Max Vote Classifier of Random Forest (max depth 16), Decision Tree (max depth 18) and Artificial Neural Network. This research can be applied in practice through a web application in which the user enters a website link; the application extracts from the link the values of the features on which the model was trained and reports whether the website is a phishing website or not.
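
The max vote idea mentioned in the abstract can be illustrated with a minimal sketch: each base classifier casts one vote for a label, and the ensemble returns the label with the most votes. The code below is an illustration only, not the paper's implementation. The three stub functions are hypothetical stand-ins for trained Random Forest, Decision Tree and Artificial Neural Network models, the two-element feature vector is invented for the example, and the label convention (1 = phishing, 0 = legitimate) is an assumption.

```python
from collections import Counter
from typing import Callable, Sequence

# Assumed label convention for this sketch: 1 = phishing, 0 = legitimate.
Classifier = Callable[[Sequence[float]], int]


def max_vote(classifiers: Sequence[Classifier], features: Sequence[float]) -> int:
    """Return the label chosen by the largest number of base classifiers."""
    votes = [clf(features) for clf in classifiers]
    # most_common(1) yields [(label, count)]; with an odd number of
    # binary voters a tie cannot occur.
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for trained models; real models would be fitted
# on the phishing dataset and expose a comparable predict step.
def forest_stub(x: Sequence[float]) -> int:
    return 1 if x[0] > 0.5 else 0


def tree_stub(x: Sequence[float]) -> int:
    return 1 if x[1] > 0.5 else 0


def network_stub(x: Sequence[float]) -> int:
    return 1 if x[0] + x[1] > 1.2 else 0


ensemble = [forest_stub, tree_stub, network_stub]

# Two of the three stubs vote "phishing" for this made-up feature vector.
print(max_vote(ensemble, [0.9, 0.4]))
```

Using an odd number of base models, as in the three-model ensemble described in the abstract, avoids ties in a two-class problem.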

