Feature Engineering for Predictive Maintenance: Identifying Key Predictors of Machine Defects Using Machine Learning

Authors

  • Chinedu Sebastian Ani, Department of Industrial and Production Engineering, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria
  • Godwin Harold Chukwuemeka, Department of Industrial and Production Engineering, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria
  • Uchendu Onwusoronye Onwurah, Department of Industrial and Production Engineering, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria

DOI:

https://doi.org/10.58471/jds.v3i2.7267

Keywords:

Preventive Maintenance, Feature Engineering, Machine Learning, Machine Defects, Vibration Signal, ANOVA, Neural Network.

Abstract

In modern industrial environments, the ability to predict equipment failure before it occurs is essential for minimizing downtime and maximizing operational efficiency. This research explores the use of feature engineering to identify key indicators of mechanical faults in a cement mill fan system. Vibration data were collected over 34 weeks from critical components of the fan and processed with several statistical techniques to extract relevant features. Several feature selection methods, including Principal Component Analysis (PCA), Minimum Redundancy Maximum Relevance (mRMR), ReliefF, Chi-square, ANOVA, and Kruskal-Wallis, were used to determine the most informative features. The selected features were then used to train and evaluate machine learning models, of which neural networks performed best. Among all models, the neural network optimized with Chi-square-selected features achieved the highest classification accuracy, the fastest prediction speed, and the lowest misclassification cost. These results highlight the effectiveness of combining robust feature selection with deep learning methods for reliable fault detection and predictive maintenance in industrial systems.
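
The first two stages described above (statistical feature extraction from vibration windows, then filter-style feature ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the extracted features, so RMS, peak, crest factor, and kurtosis are assumed here as common time-domain descriptors. The ranking uses the one-way ANOVA F-statistic, one of the methods listed in the abstract, chosen only because it needs nothing beyond the standard library. The function names and the synthetic "healthy" and "faulty" signals are hypothetical.

```python
import math
import random
from statistics import fmean


def time_domain_features(signal):
    """Assumed descriptors for one vibration window (not taken from the paper)."""
    mean = fmean(signal)
    rms = math.sqrt(fmean([x * x for x in signal]))
    peak = max(abs(x) for x in signal)
    var = fmean([(x - mean) ** 2 for x in signal])
    # Non-excess kurtosis: fourth central moment divided by squared variance.
    kurtosis = fmean([(x - mean) ** 4 for x in signal]) / (var ** 2)
    return {"rms": rms, "peak": peak, "crest_factor": peak / rms, "kurtosis": kurtosis}


def anova_f(groups):
    """One-way ANOVA F-statistic for one feature, split into per-class groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    k, n = len(groups), len(all_values)
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        # No spread inside any class: treat as perfectly separating.
        return math.inf
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(rows, labels):
    """Sort feature names by descending F-statistic across the class labels."""
    classes = sorted(set(labels))
    scores = {}
    for name in rows[0]:
        groups = [[r[name] for r, y in zip(rows, labels) if y == c] for c in classes]
        scores[name] = anova_f(groups)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)

    def window(faulty):
        # Hypothetical data: Gaussian noise, plus periodic impulses when faulty.
        samples = [rng.gauss(0.0, 1.0) for _ in range(200)]
        if faulty:
            for i in range(0, 200, 50):
                samples[i] += 5.0
        return samples

    labels = [0] * 20 + [1] * 20
    rows = [time_domain_features(window(bool(y))) for y in labels]
    for name, score in rank_features(rows, labels):
        print(f"{name}: F = {score:.2f}")
```

In a workflow like the one summarized in the abstract, the top-ranked features would then be passed to a classifier such as a neural network. Other selectors named in the abstract (Chi-square, ReliefF, mRMR, Kruskal-Wallis) would take the place of `anova_f` in the same ranking loop.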

Published

2025-08-29

How to Cite

Chinedu Sebastian Ani, Godwin Harold Chukwuemeka, & Uchendu Onwusoronye Onwurah. (2025). Feature Engineering for Predictive Maintenance: Identifying Key Predictors of Machine Defects Using Machine Learning. Journal Of Data Science, 3(02), 79–97. https://doi.org/10.58471/jds.v3i2.7267