A Two-Phase Feature Selection Technique using Information Gain and XGBoost-RFE for NIDS

Authors

  • Mohammed Sayeeduddin Habeeb, Research Scholar, Department of Electronics and Communication Engineering, University College of Engineering, Acharya Nagarjuna University, Andhra Pradesh, India
  • Tummala Ranga Babu, Department of Electronics and Communication Engineering, R.V.R. & J.C. College of Engineering, Chowdavaram, Guntur, Andhra Pradesh, India

Keywords:

NIDS, Gradient boosting (XGBoost), Recursive Feature Elimination (RFE), IoT, deep neural network (DNN)

Abstract

Internet of Things (IoT) networks connect many devices and therefore generate complex, high-dimensional data. Protecting this data requires efficient and effective security. Network intrusion detection systems (NIDS) play an important role in securing IoT networks against unauthorized access, anomalies, and zero-day attacks. However, the high-dimensional datasets produced by IoT devices pose a major problem for NIDS: analyzing every feature increases system complexity and compromises detection accuracy, so an effective feature reduction technique is needed. This paper addresses the issue by proposing a novel two-phase feature selection technique. In the first phase, Information Gain (IG) is used to rank the features by the information each one contains, which narrows the feature space and reduces computational complexity. In the second phase, the remaining feature subset is passed to XGBoost with Recursive Feature Elimination (XGBoost-RFE). XGBoost, a gradient-boosting algorithm, re-evaluates feature relevance in each iteration, and the least important features are eliminated. The iterations continue until an optimal feature set for NIDS is obtained. The selected features are then fed to a deep learning model, specifically a deep neural network (DNN). The approach is evaluated on the BOT-IoT 2020 dataset and compared with other well-known deep learning models to check its effectiveness. Experimental results show an improved model accuracy of 99.8% and a false alarm rate (FAR) reduced to 0.000 with 16 features selected from the dataset.
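The two phases described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The toy feature table, the function names (`phase_one`, `phase_two`), and the cut-off values are hypothetical. Features are assumed to be discrete (continuous flow features would need binning first), and the importance function in phase two is a stand-in for the importances of a refitted XGBoost model, which is not used here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    n = len(labels)
    groups = {}
    for x, y in zip(column, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def phase_one(features, labels, keep):
    """Phase 1: rank feature names by IG and keep the top `keep`."""
    ranked = sorted(
        features,
        key=lambda name: information_gain(features[name], labels),
        reverse=True,
    )
    return ranked[:keep]


def phase_two(candidates, importance_fn, target):
    """Phase 2: recursively drop the least important feature.

    `importance_fn(names)` returns {name: score}. In the paper's setting
    the scores would come from XGBoost refitted on `names`; here any
    callable works (assumption for illustration).
    """
    selected = list(candidates)
    while len(selected) > target:
        scores = importance_fn(selected)
        selected.remove(min(selected, key=scores.get))
    return selected


# Hypothetical toy data: 1 = attack, 0 = benign.
labels = [1, 1, 0, 0, 1, 0]
features = {
    "proto": ["tcp", "tcp", "udp", "udp", "udp", "udp"],
    "flag": [1, 1, 0, 0, 1, 0],
    "noise": [0, 1, 0, 1, 0, 1],
    "const": [5, 5, 5, 5, 5, 5],
}

shortlist = phase_one(features, labels, keep=3)


def toy_importance(names):
    # Stand-in scorer: reuses IG instead of XGBoost feature importances.
    return {n: information_gain(features[n], labels) for n in names}


final = phase_two(shortlist, toy_importance, target=2)
print(shortlist, final)
```

The filter step is cheap because each feature is scored independently, so it is applied first to shrink the candidate set. The wrapper step is more expensive because the scorer is re-evaluated after every elimination, which is why it runs only on the shortlist.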

Published

29.01.2024

How to Cite

Habeeb, M. S., & Babu, T. R. (2024). A Two-Phase Feature Selection Technique using Information Gain and XGBoost-RFE for NIDS. International Journal of Intelligent Systems and Applications in Engineering, 12(13s), 278–287. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4595

Section

Research Article