Telecom Churn Prediction Using an Ensemble Approach with Feature Engineering and Importance

Authors

H. Jain, A. Khunteta, S. Srivastava

Keywords:

Telecommunications, Churn Prediction, Random Forest, Gradient Boosted Tree, Feature Importance, Feature Engineering

Abstract

In the telecommunications industry, competition to retain customers is fierce, and churn means the loss of customers. Churn is the phenomenon of a customer leaving a business; in this context, churn prediction refers to predicting a customer's intention to leave. To retain customers, a company needs a good churn prediction model. Building such a model requires understanding why customers have churned in the past and which factors are most important for identifying customers who are close to churning. This paper focuses primarily on feature importance and feature engineering for churn prediction, and emphasises why both matter for prediction. It also describes the data pre-processing steps that played an important role in the model. For the classification phase, two ensemble models were used: Random Forest and Gradient Boosted Trees. The model uses the Cell2Cell dataset, comprising 3333 subscribers and 57 features. The study compares the developed model with earlier models. The implementation uses Python and Apache Spark, platforms well suited to data analysis with machine learning and data mining. Grid-based hyperparameter optimisation is applied to improve performance. Prediction performance is evaluated using accuracy and the confusion matrix, before and after grid-based hyperparameter optimisation. The model outperformed the earlier models it was compared with, achieving 95% accuracy with Random Forest and 97% accuracy with Gradient Boosted Trees.
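
The evaluation steps named in the abstract (accuracy, a confusion matrix, and a grid over hyperparameters) can be sketched in plain Python. This is a minimal illustration, not the authors' Spark implementation: the function names, the labels, the grid values (`num_trees`, `max_depth`), and the toy scoring function are all assumptions made for the example. In practice the scoring function would train a Random Forest or Gradient Boosted Trees model and return its validation accuracy.

```python
from itertools import product


def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels where 1 = churned, 0 = retained."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def accuracy(cm):
    """Fraction of all predictions that match the true label."""
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative labels only (not taken from the paper's dataset).
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(cm))


# Hypothetical stand-in for "train a model and return validation accuracy":
# this toy score simply peaks at num_trees=100, max_depth=5.
def toy_score(params):
    return -abs(params["num_trees"] - 100) - abs(params["max_depth"] - 5)


grid = {"num_trees": [50, 100, 200], "max_depth": [3, 5, 10]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

Computing the confusion matrix once with default hyperparameters and once with the `best_params` returned by the grid gives the before-and-after comparison the abstract describes.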

Random Forest Structure

Published

01.10.2022

How to Cite

Jain, H., Khunteta, A., & Srivastava, S. (2022). Telecom Churn Prediction Using an Ensemble Approach with Feature Engineering and Importance. International Journal of Intelligent Systems and Applications in Engineering, 10(3), 22–33. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2134

Issue

Vol. 10 No. 3 (2022)

Section

Research Article