An ML-Powered Framework for Email Spam Identification

Authors

  • Ravindra Ramesh Agrawal, Simran Shinde, Swatantrakumar Gupta, Sagar Thakare, Bhavna Sharma

Keywords:

Ubiquitous, Phishing, Machine Learning, Spam, Predictions

Abstract

Email remains one of the most widely used communication tools worldwide because it is fast and easy to use. Its usefulness is undermined, however, when unwanted messages are not filtered accurately, and a growing number of reported cases involve phishing or the theft of personal information carried out over email. This project explores the application of Machine Learning (ML) to improve spam detection. ML, a branch of artificial intelligence, enables systems to learn and improve from data automatically without being explicitly programmed. A binary classifier will be employed to label email content as either "spam" or "ham" (legitimate mail), with the aim of producing more accurate predictions. The primary objective of the model is to detect and classify words both rapidly and precisely.
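
The binary spam/ham classification described in the abstract can be illustrated with a minimal sketch. The abstract does not name a specific algorithm, so the multinomial Naive Bayes model below is an assumption chosen only for illustration, not the authors' method. The class name `NaiveBayesSpamFilter`, the `tokenize` helper, and the six toy training messages are likewise hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Illustrative multinomial Naive Bayes with add-one (Laplace) smoothing.

    Assumption: the source does not specify a model; this is one common choice.
    """

    def fit(self, messages, labels):
        # Count how often each word appears in spam and in ham messages.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        return self

    def _log_score(self, tokens, label):
        # log P(label) + sum of log P(word | label) over known words.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are skipped
                score += math.log((counts[tok] + 1) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(("spam", "ham"), key=lambda label: self._log_score(tokens, label))


# Hypothetical toy data, invented purely for this example.
train_messages = [
    "win a free prize now",
    "claim your free money today",
    "urgent verify your account password",
    "meeting moved to monday morning",
    "please review the attached report",
    "lunch tomorrow with the team",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = NaiveBayesSpamFilter().fit(train_messages, train_labels)
print(model.predict("free prize money"))     # spam
print(model.predict("team meeting report"))  # ham
```

A word-count model of this kind is fast to train and to apply, which fits the stated goal of classifying words rapidly. A realistic evaluation would need a labeled corpus and a held-out test set rather than the toy messages used here.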

Published

23.11.2024

How to Cite

Ravindra Ramesh Agrawal. (2024). An ML-Powered Framework for Email Spam Identification. International Journal of Intelligent Systems and Applications in Engineering, 12(23s), 3571 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7787

Section

Research Article