Binary Classification of Academic Outcomes Using Ensemble Learning and Neural Networks: A Case Study on OULAD

Authors

  • Lili Dwi Yulianto, Sistem Informasi, Universitas Nasional, Jl. Sawo Manila, Pejaten, Ps. Minggu, Jakarta
  • Satriawan Desmana, Rekayasa Keamanan Siber, Politeknik Negeri Cilacap, Jl. Dr. Soetomo No. 1, Sidakaya, Cilacap, Jawa Tengah
  • Sutikman Sutikman, Bisnis Digital, Universitas Nasional, Jl. Sawo Manila, Pejaten, Ps. Minggu, Jakarta
  • Winarsih Winarsih, Sistem Informasi, Universitas Nasional, Jl. Sawo Manila, Pejaten, Ps. Minggu, Jakarta

Keywords

Educational Data Mining (EDM); OULAD; Feature Selection; Dense Neural Networks (DNN); Machine Learning

Abstract

Classifying academic outcomes on online learning platforms is increasingly important because it supports the assessment of student performance, the early detection of problems, and the identification of factors that influence academic success. This study uses the Open University Learning Analytics Dataset (OULAD) to predict student outcomes in four binary classification tasks: Distinction vs Non-Distinction, Withdrawn vs Non-Withdrawn, Pass vs Non-Pass, and Pass vs Fail. The aim is to compare ensemble machine learning methods (Random Forest, Gradient Boosting, AdaBoost, LightGBM, and a Voting Classifier) with a deep learning model based on Dense Neural Networks (DNN) and to determine which produces the best predictions. Relevant features are selected using feature selection and dimensionality reduction strategies, including Recursive Feature Elimination (RFE) and autoencoders. The results show that LightGBM and Gradient Boosting perform best on several of the tasks, with an accuracy of 75.47% for Pass vs Fail. The DNN requires further refinement but shows potential for handling more complex classifications. Beyond identifying students at risk of failing, the approach provides a deeper understanding of the variables that affect academic success in online learning environments.
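
The four binary tasks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes the outcome is taken from OULAD's `final_result` column (values Pass, Fail, Withdrawn, Distinction), that the three "vs Non-" tasks are one-vs-rest labelings, and that Pass vs Fail keeps only the Pass and Fail rows. The function names, task names, and the `sample` list are hypothetical.

```python
from typing import Dict, List, Tuple

# Outcome values assumed to come from OULAD's `final_result` column.
OUTCOMES = ("Pass", "Fail", "Withdrawn", "Distinction")


def one_vs_rest(results: List[str], positive: str) -> List[int]:
    """Label rows equal to `positive` as 1 and every other outcome as 0."""
    if positive not in OUTCOMES:
        raise ValueError(f"unknown outcome: {positive}")
    return [1 if r == positive else 0 for r in results]


def pass_vs_fail(results: List[str]) -> Tuple[List[int], List[int]]:
    """Keep only Pass/Fail rows (an assumption about how this task is built).

    Returns the indices of the kept rows and their labels (Pass=1, Fail=0).
    """
    kept = [i for i, r in enumerate(results) if r in ("Pass", "Fail")]
    labels = [1 if results[i] == "Pass" else 0 for i in kept]
    return kept, labels


def build_binary_tasks(results: List[str]) -> Dict[str, List[int]]:
    """Derive the label vectors for the three one-vs-rest tasks."""
    return {
        "distinction_vs_non_distinction": one_vs_rest(results, "Distinction"),
        "withdrawn_vs_non_withdrawn": one_vs_rest(results, "Withdrawn"),
        "pass_vs_non_pass": one_vs_rest(results, "Pass"),
    }


# Toy outcomes standing in for real OULAD rows (hypothetical data).
sample = ["Pass", "Fail", "Withdrawn", "Distinction", "Pass"]
tasks = build_binary_tasks(sample)
kept_rows, pf_labels = pass_vs_fail(sample)
```

Under these assumptions, each one-vs-rest label vector covers all students, while the Pass vs Fail task is trained and evaluated on a smaller subset, so accuracies across the four tasks are not measured on the same population.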

Published

2025-07-28

How to Cite

Yulianto, L. D., Desmana, S., Sutikman, S., & Winarsih, W. (2025). Binary Classification of Academic Outcomes Using Ensemble Learning and Neural Networks: A Case Study on OULAD. Jurnal Info Sains: Informatika Dan Sains, 15(01), 151–163. Retrieved from https://ejournal.seaninstitute.or.id/index.php/InfoSains/article/view/7005