Utilizing Random Forest Method for Predicting Student Dropout Risk in Madrasah Environments

  • Muhammad Mahsun, Maulana Malik Ibrahim State Islamic University of Malang, Indonesia
  • M. Amin Hariyadi, Maulana Malik Ibrahim State Islamic University of Malang, Indonesia
  • Sri Harini, Maulana Malik Ibrahim State Islamic University of Malang, Indonesia
Keywords: Student Dropout Prediction, Random Forest, Machine Learning, Madrasah Miftahul Ulum

Abstract

School dropout is a critical problem that harms institutional performance, social stability, and national development, so detecting high-risk students early is a strategic preventive measure. This research aims to develop an accurate predictive model using a machine learning approach. The study compares several classification algorithms, with a primary focus on the performance of the Random Forest classifier. The dataset of 1,763 student records underwent rigorous pre-processing, including data cleaning, variable transformation, and class imbalance handling, to ensure high-quality input. The model was trained with a fixed random seed of 75 so that the experiments are reproducible and the evaluation results are consistent. Random Forest performed best among the algorithms compared, achieving an accuracy of 82.0% and a precision of 83.8%. These results indicate that the model is effective at identifying the key determinants of dropout, which stem from both internal and external student factors. Based on these results, the research recommends applying Random Forest as a Decision Support System instrument to facilitate targeted interventions, including medical support, economic assistance, and academic counseling. Future research is advised to integrate historical counseling data to further improve the prediction sensitivity of the model.
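The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic (generated with scikit-learn's make_classification to match only the record count of 1,763), and the feature count, class ratio, 80/20 split, number of trees, and the use of class_weight="balanced" for imbalance handling are all assumptions, since the abstract does not specify them. Only the seed of 75 and the two reported metrics (accuracy and precision) come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split

SEED = 75  # random seed reported in the abstract

# Synthetic stand-in for the 1,763 student records (assumption: the
# real features and class ratio are not given in the abstract).
X, y = make_classification(
    n_samples=1763,
    n_features=12,
    n_informative=6,
    weights=[0.8, 0.2],  # hypothetical imbalance: ~20% dropout class
    random_state=SEED,
)

# Hypothetical 80/20 split, stratified to preserve the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

# class_weight="balanced" is one possible way to handle class
# imbalance; the study may have used a different technique.
model = RandomForestClassifier(
    n_estimators=200,
    class_weight="balanced",
    random_state=SEED,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The two metrics reported in the abstract.
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
print(f"accuracy={accuracy:.3f} precision={precision:.3f}")

# Rank features by impurity-based importance to see which inputs
# the forest relies on most (indices refer to synthetic columns).
top_features = sorted(
    enumerate(model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)[:3]
print("top features (index, importance):", top_features)
```

Fixing random_state in the data split and in the classifier is what makes repeated runs return identical predictions, which is the reproducibility property the abstract attributes to the seed. The numbers printed for synthetic data are not comparable to the 82.0% and 83.8% reported for the real dataset.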

 

 

Published
2025-12-10
How to Cite
Mahsun, M., Hariyadi, M. A., & Harini, S. (2025). Utilizing Random Forest Method for Predicting Student Dropout Risk in Madrasah Environments. Journal of Information Systems and Informatics, 7(4), 3434-3453. https://doi.org/10.63158/journalisi.v7i4.1364
Section
Articles