Email Monitoring System Using Various Machine Learning Approaches

Authors

  • Ndivhuwo Netshamutshedzi, Netshikweta Rendani, Ibidun Christiana Obagbuwa

Keywords:

Machine learning, Deep learning, Ensemble learning, Naive Bayes, Support Vector Machine, Convolutional neural network, AdaBoost classifier, Long short-term memory

Abstract

Spam makes up a large portion of email traffic and causes problems worldwide. Spammers continually adopt new techniques, which makes spam difficult to manage or prevent. Businesses and educational institutions alike rely heavily on email communication. This study compares the predictive performance of Machine Learning (ML), Deep Learning (DL), and Ensemble Learning (EL) methods in the context of email monitoring systems. We build on previous studies of the spam problem with the aim of improving accuracy. The methods evaluated are Naive Bayes (NB), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Adaptive Boosting (AdaBoost), and Random Forest. The findings show that LSTM achieved the highest accuracy, 99.88%. LSTM therefore stands out as a strong model for this task, with potential benefits for future studies in this field.
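As a rough illustration of what one of the compared methods and the accuracy metric involve, the sketch below trains a small multinomial Naive Bayes classifier with Laplace smoothing on a toy corpus and reports its test accuracy. This is not the authors' implementation or dataset: the class name NaiveBayesSpamFilter, the whitespace tokenizer, the smoothing value alpha=1.0, and the example messages are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical minimal multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha            # smoothing constant (assumed value)
        self.word_counts = {}         # label -> Counter of token frequencies
        self.doc_counts = Counter()   # label -> number of training messages
        self.vocab = set()            # all tokens seen during training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def _log_score(self, tokens, label):
        counts = self.word_counts[label]
        # Log prior: fraction of training messages carrying this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        denom = sum(counts.values()) + self.alpha * len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:     # tokens never seen in training are skipped
                score += math.log((counts[tok] + self.alpha) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(self.doc_counts, key=lambda label: self._log_score(tokens, label))


def accuracy(model, texts, labels):
    """Fraction of messages whose predicted label matches the true label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Toy data, invented for this sketch only.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("urgent offer win cash now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch on monday with the team", "ham"),
]
TEST = [
    ("claim your free cash prize", "spam"),
    ("project meeting with the team", "ham"),
]

model = NaiveBayesSpamFilter().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
test_accuracy = accuracy(model, [t for t, _ in TEST], [y for _, y in TEST])
print(f"test accuracy: {test_accuracy:.2%}")
```

The same accuracy function could be applied to any of the other classifiers named in the abstract, provided each exposes a comparable predict method; accuracy on a corpus this small says nothing about real-world performance.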



Published

06.08.2024

How to Cite

Ndivhuwo Netshamutshedzi. (2024). Email Monitoring System Using Various Machine Learning Approaches. International Journal of Intelligent Systems and Applications in Engineering, 12(23s), 533 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6903

Section

Research Article