Email Monitoring System Using Various Machine Learning Approaches
Keywords:
Machine learning, Deep learning, Ensemble learning, Naive Bayes, Support Vector Machine, Convolutional neural network, AdaBoost classifier, Long short-term memory
Abstract
Spam makes up a large portion of email traffic and causes problems worldwide. Spammers continually adopt new techniques, which makes spam difficult to manage or prevent. Businesses and educational institutions alike rely heavily on email communication. This study compares the predictive performance of Machine Learning (ML), Deep Learning (DL), and Ensemble Learning (EL) methods in the context of email monitoring systems. We build on previous studies of the spam problem with the aim of improving accuracy. The methods evaluated are Naive Bayes (NB), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Adaptive Boosting (AdaBoost), and Random Forest. In our experiments, LSTM achieved the highest accuracy, 99.88%. LSTM therefore stands out as a strong model for this task and a promising basis for future studies in this field.
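As a rough illustration of the simplest method named in the abstract, the sketch below implements a multinomial Naive Bayes spam classifier in plain Python. It is not the authors' pipeline: the whitespace tokenizer, the add-one (Laplace) smoothing, the function names, and the six-message toy corpus are all assumptions made for this example, and the abstract does not describe the dataset or preprocessing actually used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def train_naive_bayes(messages, labels):
    """Count messages per class and words per class for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label, n_messages in class_counts.items():
        # Log prior: fraction of training messages in this class.
        log_prob = math.log(n_messages / total_messages)
        # Smoothed denominator: tokens seen in this class plus vocabulary size.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                log_prob += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Hypothetical toy corpus, invented for this example only.
TRAIN_MESSAGES = [
    "win a free prize now",
    "free cash offer click now",
    "claim your free prize today",
    "meeting agenda for monday",
    "please review the attached report",
    "lunch on monday with the team",
]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = train_naive_bayes(TRAIN_MESSAGES, TRAIN_LABELS)
```

Calling `predict(model, "free prize now")` scores the message against both classes and returns the more likely label. A comparison like the one the study reports would presumably replace the toy corpus with a labeled email dataset and measure accuracy on held-out messages.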
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others remix, transform, and build upon the material, provided that they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.