An Innovative Multi-Dataset Performance Analysis of Machine Learning Classifiers based on Features Reduction for Intrusion Detection

Authors

  • Salim Q. Mohammed, Ph.D. Student, Technical College of Engineering, Sulaimani Polytechnic University, Sulaimanyah, Iraq
  • Mohammed A. El Sheikh Hussein, Ph.D. in Computer Science – Engineering, College of Engineering, University of Sulaimani, Sulaimanyah, Iraq

Keywords:

Binary Classifiers, Logistic Regression, Supervised Machine Learning Algorithms, Support Vector Machine (SVM).

Abstract

Computer users receive millions of internet packets every day; some are normal usage packets, while others are sent by intruders for illegal purposes. As the number of users grows, conventional countermeasures are no longer effective, and machine learning has become a key tool for coping with the growth in both users and attack types. Three well-known datasets, KDD99, UNSW NB15, and CICIDS2017, are used as a framework for a comprehensive comparison study in which the proposed models are deployed and their performance is measured under a number of feature reduction methods. Nine machine learning models (K-Nearest Neighbor, Logistic Regression, linear Support Vector Machine, Stochastic Gradient Descent, Naive Bayes, Decision Trees, Random Forest, Gradient Boosting, and AdaBoost) are applied to the reduced-feature versions of the three datasets. For a variety of feature reductions, accuracy and F1 score are used to evaluate and analyze each model's performance. For KDD99, the achieved accuracy and F1 score are 99.9663% and 99.979%, respectively; for UNSW NB15, they are 95.1968% and 96.2473%; and for CICIDS2017, they are 99.7004% and 99.7515%. The Random Forest classifier showed the highest performance on all three datasets, and the innovative 80% feature reduction gave better accuracy and F1 than the other state-of-the-art studies surveyed.
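
As a rough illustration of the two quantities the abstract relies on, the sketch below reduces a feature matrix by 80% and then computes accuracy and F1 for a binary (normal vs. attack) labelling. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the reduction methods, so ranking columns by variance is an assumed stand-in, and the function names `keep_top_features` and `accuracy_and_f1`, the toy rows, and the toy labels are all hypothetical.

```python
from statistics import pvariance


def keep_top_features(rows, reduction=0.8):
    """Drop roughly `reduction` of the columns, keeping the highest-variance ones.

    Variance ranking is an ASSUMED criterion used only for illustration;
    the abstract does not say which reduction methods were applied.
    """
    n_features = len(rows[0])
    n_keep = max(1, round(n_features * (1 - reduction)))
    variances = [pvariance([row[j] for row in rows]) for j in range(n_features)]
    # Rank columns by variance, keep the top n_keep, restore original order.
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [[row[j] for j in kept] for row in rows]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the attack class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: 4 records with 5 features each.
rows = [
    [0, 10, 1, 5, 2],
    [0, 20, 1, 5, 2],
    [0, 30, 1, 6, 2],
    [0, 40, 1, 5, 2],
]
kept, reduced = keep_top_features(rows, reduction=0.8)

# Hypothetical true and predicted labels (1 = attack, 0 = normal).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(kept, reduced, acc, f1)
```

With five input features, an 80% reduction leaves a single column. In a real comparison, a classifier would be trained on the reduced columns and the two metrics computed on held-out records.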

Published

13.12.2023

How to Cite

Mohammed, S. Q., & Hussein, M. A. E. S. (2023). An Innovative Multi-Dataset Performance Analysis of Machine Learning Classifiers based on Features Reduction for Intrusion Detection. International Journal of Intelligent Systems and Applications in Engineering, 12(8s), 553–569. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4187

Section

Research Article