Intelligent Image Spam Analysis Using CNN-Based Visual Semantics and Web Log Mining for Next-Generation Spam Filtering

Authors

  • Khedkar Vaishali Shankar, Rais Abdul Hamid Khan

Keywords

Image Spam Detection, Convolutional Neural Networks, Web Log Mining, Visual Semantics, Multimodal Learning, Explainable AI, Spam Filtering.

Abstract

Image-based spam presents a persistent challenge to traditional text-oriented filtering systems due to its use of visual obfuscation techniques. This paper proposes an intelligent image spam detection framework that combines convolutional neural network (CNN)-based visual semantic analysis with web log mining to improve detection accuracy and robustness. The framework processes image content and associated transmission logs in parallel, extracting high-level visual features using a deep CNN backbone and behavioral patterns from web and sender logs. These heterogeneous features are integrated through a late-fusion classification strategy to produce reliable spam predictions. Experimental evaluation on publicly available image spam datasets, augmented with adversarial obfuscations, demonstrates that the proposed multimodal approach consistently outperforms visual-only and behavior-only baseline models across standard performance metrics, including accuracy, F1-score, and ROC–AUC. Ablation studies highlight the significant contribution of web log features in enhancing robustness, while explainability analysis using SHAP and Grad-CAM provides transparent insights into model decisions. The results confirm that integrating visual semantics with behavioral context offers an effective and scalable solution for next-generation image spam filtering systems.
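The late-fusion idea described in the abstract can be illustrated with a minimal sketch. It assumes each branch (visual and behavioral) has already produced its own spam probability, and it combines them at the decision level with a weighted average. The abstract does not specify the fusion classifier, so the averaging rule, the names `BranchScores`, `late_fusion`, and `is_spam`, and the values of `visual_weight` and `threshold` are all illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass


@dataclass
class BranchScores:
    """Spam probabilities from two independently trained branches (hypothetical)."""
    visual: float    # e.g. probability from a CNN applied to the image
    behavior: float  # e.g. probability from a model over web/sender log features


def late_fusion(scores: BranchScores, visual_weight: float = 0.6) -> float:
    """Fuse branch probabilities with a convex weighted average.

    visual_weight is an assumed hyperparameter, not a value taken from the paper.
    """
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    for p in (scores.visual, scores.behavior):
        if not 0.0 <= p <= 1.0:
            raise ValueError("branch scores must be probabilities in [0, 1]")
    return visual_weight * scores.visual + (1.0 - visual_weight) * scores.behavior


def is_spam(scores: BranchScores, threshold: float = 0.5,
            visual_weight: float = 0.6) -> bool:
    """Flag a message as spam when the fused probability reaches the threshold."""
    return late_fusion(scores, visual_weight) >= threshold


if __name__ == "__main__":
    # An obfuscated image lowers the visual score, while the sender's
    # log pattern remains suspicious; fusion still flags the message.
    sample = BranchScores(visual=0.35, behavior=0.90)
    print(late_fusion(sample), is_spam(sample))
```

In this toy case the visual branch alone would fall below the 0.5 threshold, while the fused score does not, which is the kind of robustness gain the abstract attributes to web log features. A learned meta-classifier over the branch outputs would be a natural alternative to the fixed weight used here.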

Published

15.06.2024

How to Cite

Khedkar Vaishali Shankar. (2024). Intelligent Image Spam Analysis Using CNN-Based Visual Semantics and Web Log Mining for Next-Generation Spam Filtering. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 5970 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/8025

Section

Research Article