USING FEATURE ENGINEERING IN LOGISTIC REGRESSION AND RANDOM FOREST METHODS TO IMPROVE EMPLOYEE ATTRITION PREDICTION IN KIMIA FARMA
Keywords:
employee attrition, machine learning, logistic regression, random forest, feature engineering, select best
Abstract
This study analyzes the effect of feature engineering on the Logistic Regression and Random Forest methods for predicting employee attrition at PT Kimia Farma Tbk., and identifies which method most effectively improves that prediction. The results indicate that feature engineering significantly affects prediction performance: applying it changes the accuracy, precision, recall, and F-Score of both models. With Recursive Feature Elimination (RFE), the Logistic Regression model has an accuracy of 0.866, a precision of 0.5, a recall of 0.159, and an F-Score of 0.259, while the Random Forest model has an accuracy of 0.886, a precision of 0.916, a recall of 0.25, and an F-Score of 0.392. With SelectKBest, the Logistic Regression model has an accuracy of 0.88, a precision of 0.9, a recall of 0.204, and an F-Score of 0.333, while the Random Forest model has an accuracy of 0.87, a precision of 0.818, a recall of 0.204, and an F-Score of 0.327. In this comparison, RFE with the Random Forest model is the best method in terms of accuracy and precision. Although its recall remains low at 0.25, its performance still meets the criteria for a good method. Therefore, RFE with the Random Forest model was chosen as the best method for this case.
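As a rough check on how the reported metrics relate to one another, the F-Score can be recomputed from precision and recall. This assumes the F-Score in the abstract is the balanced F1 (the harmonic mean of precision and recall); the abstract does not state which variant was used. The sketch below is illustrative only: the function name f_score and the dictionary layout are hypothetical, and the inputs are the rounded values quoted in the abstract, so the recomputed figures need not match the published F-Scores exactly.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed variant)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as quoted in the abstract (rounded values).
reported = {
    "RFE + Logistic Regression": (0.5, 0.159),
    "RFE + Random Forest": (0.916, 0.25),
    "SelectKBest + Logistic Regression": (0.9, 0.204),
    "SelectKBest + Random Forest": (0.818, 0.204),
}

# Recompute F1 for each feature-selection / model combination.
recomputed = {name: f_score(p, r) for name, (p, r) in reported.items()}

for name, value in recomputed.items():
    print(f"{name}: F1 = {value:.3f}")
```

Because F1 is a harmonic mean, it stays close to the smaller of its two inputs, which is why all four configurations score below 0.4 even where precision is as high as 0.916.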