A Hybrid Filter Feature Selection Approach for Remaining Useful Life Prediction of Industrial Machinery



Keywords: feature selection, filter method, high dimensional data, prognostics, remaining useful life


Data-driven predictive maintenance commonly uses machine learning algorithms to predict an asset’s condition over its life cycle. Asset information and domain expert knowledge are essential for supporting maintenance-related decisions in this setting. A general-purpose feature selection approach, however, can misinterpret, remove, or lose domain-specific information about the asset. In addition, asset data are high dimensional because many features are sourced from various sensor measurements, which can degrade the performance and reliability of machine learning algorithms. This paper presents a feature selection approach that retains domain-specific asset information by utilising the Safe Operating Limit of an asset. This asset information is combined with a filter method to reduce the dimensionality of asset data for predicting the remaining useful life of equipment. The proposed approach is demonstrated on an oil and gas equipment dataset that contains multiple run-to-failure cases of a gas compressor.
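
The abstract does not spell out how the Safe Operating Limit is combined with the filter method, so the following is only an illustrative sketch of one possible reading, not the authors' procedure. It assumes that sensors with a documented Safe Operating Limit are always retained, and that the remaining sensors are ranked by a simple filter score (absolute Pearson correlation with remaining useful life, chosen here purely for illustration). The function names, sensor names, limits, and readings are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def hybrid_filter_select(features, rul, sol_limits, k):
    """Return feature names: SOL-protected sensors plus the top-k filtered ones.

    features   -- dict mapping sensor name to a list of readings
    rul        -- list of remaining-useful-life values, aligned with the readings
    sol_limits -- dict mapping sensor name to a (low, high) Safe Operating Limit
    k          -- number of non-SOL sensors to keep after filtering
    """
    # Assumption: a sensor with a Safe Operating Limit carries domain knowledge,
    # so it is kept regardless of its statistical score.
    retained = [name for name in features if name in sol_limits]
    # Filter step: rank the remaining sensors by |correlation| with RUL.
    candidates = [name for name in features if name not in sol_limits]
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(features[name], rul)),
        reverse=True,
    )
    return retained + ranked[:k]


# Hypothetical run-to-failure readings for a gas compressor.
rul = [5, 4, 3, 2, 1, 0]
features = {
    "discharge_pressure": [10, 11, 12, 13, 14, 15],
    "vibration": [1, 2, 3, 4, 5, 6],
    "ambient_temp": [20, 21, 20, 21, 20, 21],
    "valve_flag": [1, 1, 1, 1, 1, 1],
}
sol_limits = {"discharge_pressure": (5.0, 14.0)}  # hypothetical limit values

selected = hybrid_filter_select(features, rul, sol_limits, k=1)
print(selected)
```

In this toy data the pressure sensor is kept because it has a Safe Operating Limit, and the vibration sensor is the only other feature that survives the filter with `k=1`. Any other filter score, such as mutual information or relative entropy, could be substituted for the correlation step.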





Overall process flow




How to Cite

K. A. A. Ku Amir, S. M. Taib, and M. H. Hasan, “A Hybrid Filter Feature Selection Approach for Remaining Useful Life Prediction of Industrial Machinery”, Int J Intell Syst Appl Eng, vol. 10, no. 4, pp. 88–95, Dec. 2022.



Research Article