Real-time Anomaly Detection in Big Data Streams: Machine Learning Approaches and Performance Evaluation
Keywords:
Isolation Forest, Local Outlier Factor, Support Vector Machine, Elliptic Envelope, Anomaly detection, Big data, Machine learning, Outlier detection methods, Performance evaluation
Abstract
This paper examines real-time anomaly detection in big data streams, covering machine learning techniques and their performance evaluation. Four outlier detection methods (Isolation Forest, Local Outlier Factor, Support Vector Machine, and Elliptic Envelope) are applied to a dataset of financial transactions and compared. The analysis includes visualizations of the data distributions, correlation matrices, and metrics such as accuracy and classification reports. The findings show that performance varies across the methods, underscoring the importance of choosing an appropriate technique for effective anomaly detection in dynamic data environments. The paper contributes to the understanding of outlier detection in big data streams and offers insights for future studies.
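The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `classification_summary`, the ten example labels, and the convention that 1 marks a normal transaction and -1 an anomaly are all assumptions made for illustration. It computes accuracy together with the per-class precision, recall, and F1 that a classification report tabulates.

```python
from collections import Counter


def classification_summary(y_true, y_pred, labels=(1, -1)):
    """Return (accuracy, report) for two equal-length label sequences.

    `report` maps each label to its precision, recall, F1, and support.
    Hypothetical helper: 1 = normal, -1 = anomaly (assumed convention).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count every (true label, predicted label) combination once.
    pairs = Counter(zip(y_true, y_pred))
    total = len(y_true)
    correct = sum(pairs[(label, label)] for label in labels)
    accuracy = correct / total if total else 0.0

    report = {}
    for label in labels:
        true_pos = pairs[(label, label)]
        predicted = sum(c for (_, p), c in pairs.items() if p == label)
        actual = sum(c for (t, _), c in pairs.items() if t == label)
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return accuracy, report


# Made-up labels for ten transactions; not data from the paper.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, 1, 1, 1, 1, -1, -1, 1]

accuracy, report = classification_summary(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(label, scores)
```

In this made-up example the overall accuracy is 0.80, while precision and recall for the anomaly class are both 0.50, which is one reason a per-class report is usually read alongside accuracy when anomalies are rare.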
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.