Improving Intrusion Detection Performance with Genetic Algorithm-Based Feature Extraction and Ensemble Machine Learning Methods

Authors

  • Gunupusala Satyanarayana, Research Scholar, JNTU Hyderabad, India.
  • Kaila Shahu Chatrapathi, Professor, Department of CSE, JNTU Hyderabad, India.

Keywords:

Ensemble ML (EM) classifiers, GA feature selection, UNSW-NB15 dataset, intrusion detection

Abstract

The Internet of Things (IoT) has made everyday life more accessible, connected, and convenient. It moves vast amounts of data among interconnected devices, and the resulting network is exposed to a wide range of attacks and intrusions. Developing an efficient intrusion detection system (IDS) for IoT networks is challenging for two main reasons: the massive volume of aggregated data and the heterogeneity of IoT devices. Traditional IDS approaches struggle to handle and analyze this data in real time, so there is growing demand for IDS techniques based on machine learning (ML) or deep learning (DL). This study focuses on intrusion detection in IoT networks using UNSW-NB15, a publicly available dataset that is widely used to evaluate IDS algorithms. The main purpose of the work is to improve intrusion detection performance by combining feature extraction based on genetic algorithms (GA) with ensemble machine learning (EM) algorithms. Feature extraction is a crucial step in an IDS because it reduces the dimensionality of the dataset while retaining relevant information. Genetic algorithms, known for their optimization capabilities, are used to search for a subset of features that maximizes the discriminatory power of the IDS. To this end, a framework is proposed that integrates a GA with several ensemble ML techniques: random forests, Extra-Trees, XGBoost, AdaBoost, and stacking. The GA selects a subset of features from the UNSW-NB15 dataset, and the ensemble models are then trained and evaluated on the selected features, with accuracy as the performance measure.
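The GA-based selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`make_toy_data`, `centroid_accuracy`, `ga_select`), the GA settings (population size, crossover and mutation rates, tournament selection, elitism), the synthetic data standing in for UNSW-NB15, and the nearest-centroid classifier standing in for the ensemble models. In the setting the abstract describes, the fitness function would presumably be the accuracy of an ensemble classifier on the selected UNSW-NB15 columns.

```python
import random


def make_toy_data(n_rows=200, n_features=10, informative=(0, 3, 7), seed=0):
    """Synthetic stand-in for UNSW-NB15 (assumption): only `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only features where mask[j] == 1.

    Stand-in (assumption) for the ensemble classifiers named in the abstract.
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        correct += int(min(dist, key=dist.get) == t)
    return correct / len(y_test)


def ga_select(fitness, n_features, pop_size=20, generations=15,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (all settings are illustrative)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best, best_score = None, -1.0
    for gen in range(generations + 1):
        scored = [(fitness(ind), ind) for ind in pop]
        gen_score, gen_best = max(scored, key=lambda s: s[0])
        if gen_score > best_score:
            best, best_score = gen_best[:], gen_score
        if gen == generations:
            break  # last population has been scored; no further breeding

        def tournament():
            # Binary tournament: pick two individuals, keep the fitter one.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: toggle each feature with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        pop = children
    return best, best_score


# Demo: hold out the last 50 rows for scoring each candidate mask.
X, y = make_toy_data()
X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]
mask, score = ga_select(lambda m: centroid_accuracy(X_tr, y_tr, X_te, y_te, m),
                        n_features=10)
print("selected feature indices:", [j for j, bit in enumerate(mask) if bit])
print("hold-out accuracy:", score)
```

Each chromosome is a bit mask over the feature columns, so swapping in a different classifier only requires changing the fitness function passed to `ga_select`. Because the fitness here is plain hold-out accuracy, nothing penalizes large subsets; a real pipeline might add such a penalty or use cross-validation, but the abstract does not say which.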

Published

21.09.2023

How to Cite

Satyanarayana, G., & Chatrapathi, K. S. (2023). Improving Intrusion Detection Performance with Genetic Algorithm-Based Feature Extraction and Ensemble Machine Learning Methods. International Journal of Intelligent Systems and Applications in Engineering, 11(4), 100–112. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3458

Section

Research Article