Real-time Anomaly Detection in Big Data Streams: Machine Learning Approaches and Performance Evaluation

Authors

  • Aruna Bajpai, Samiksha Khule, Vijay Prakash Sharma, Yogeshkumar Sharma, Gaurav Dubey

Keywords:

Isolation Forest, Local Outlier Factor, Support Vector Machine, Elliptic Envelope, anomaly detection, big data, machine learning, outlier detection methods, performance evaluation

Abstract

This paper examines real-time anomaly detection in big data streams, covering machine learning techniques and their performance evaluation. Four outlier detection methods (Isolation Forest, Local Outlier Factor, Support Vector Machine, and Elliptic Envelope) are analyzed on a dataset of financial transactions. The analysis comprises visualizations of the data distributions, correlation matrices, and evaluation metrics such as accuracy and classification reports. The findings reveal varying performance across the methods, underscoring the importance of choosing an appropriate technique for effective anomaly detection in dynamic data environments. The paper contributes to the understanding of outlier detection in big data streams and offers insights for future studies.
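
The abstract names accuracy and classification reports as the evaluation metrics. The following is a minimal sketch of how such metrics can be computed from detector output, not the authors' code. It assumes detectors emit -1 for outliers and +1 for inliers (the convention used by scikit-learn's outlier detectors). The helper names `to_labels` and `evaluate`, the detector names, and all data are hypothetical and purely illustrative; none of the numbers are results from the paper.

```python
from collections import Counter


def to_labels(raw_predictions):
    """Map detector output (-1 = outlier, +1 = inlier) to 1 = anomaly, 0 = normal."""
    return [1 if p == -1 else 0 for p in raw_predictions]


def evaluate(y_true, y_pred):
    """Return accuracy plus precision, recall and F1 for the anomaly class (label 1)."""
    counts = Counter(zip(y_true, y_pred))
    tp = counts[(1, 1)]  # anomalies correctly flagged
    fp = counts[(0, 1)]  # normal points wrongly flagged
    fn = counts[(1, 0)]  # anomalies missed
    tn = counts[(0, 0)]  # normal points correctly passed
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy ground truth: 18 normal transactions followed by 2 anomalous ones.
y_true = [0] * 18 + [1] * 2

# Hypothetical raw outputs from two imaginary detectors (assumed, not measured).
detector_outputs = {
    "flags_nothing": [1] * 20,                    # never raises an alert
    "flags_some": [1] * 16 + [-1] * 2 + [-1, 1],  # 2 false alarms, 1 hit, 1 miss
}

for name, raw in detector_outputs.items():
    report = evaluate(y_true, to_labels(raw))
    print(name, {k: round(v, 2) for k, v in report.items()})
```

In this toy case the detector that never raises an alert reaches an accuracy of 0.90 with zero recall, while the one that catches an anomaly scores 0.85. On imbalanced data such as financial transactions, this is one reason to read per-class precision and recall alongside accuracy.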




Published

26.03.2024

How to Cite

Aruna Bajpai, Samiksha Khule, Vijay Prakash Sharma, Yogeshkumar Sharma, Gaurav Dubey. (2024). Real-time Anomaly Detection in Big Data Streams: Machine Learning Approaches and Performance Evaluation. International Journal of Intelligent Systems and Applications in Engineering, 12(21s), 915–924. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5489

Issue

Vol. 12 No. 21s (2024)

Section

Research Article