Predicting Student Performance Using Machine Learning for Student Management in University (Study Case ABC University)

  • Berlit Deddy Setiawan Institut Teknologi Bandung
  • Dermawan Wibisono Institut Teknologi Bandung
Keywords: Higher education institutions, Predicting students' potential academic performance, Random Forest algorithm

Abstract

Higher education institutions play a vital role in providing quality education and producing skilled human resources. In Indonesia, there is a growing demand for higher education due to population growth and increasing awareness of its importance. ABC University, currently ranked 46-50 in Indonesia Uni Rank 2023, faces challenges in the rankings. To thrive in this competitive landscape, universities must be selective in admitting qualified students and ensure effective academic development processes. Machine learning capabilities can be leveraged to predict students' potential academic performance, facilitating timely interventions and support to enhance learning outcomes. However, there is currently no research available that focuses on creating a prediction model that integrates student profiles with academic performance.

 

The study highlights factors that contribute to student failure, including low academic ability, financial constraints, and geographical location. ABC University, with its limited database, requires a method to predict and improve its students' academic performance. Educational Data Mining (EDM) and machine learning algorithms such as Random Forest can help the university understand students' needs and develop effective educational policies. By leveraging these techniques, ABC University can remain competitive and enhance its students' academic performance.

 

This research aims to establish a connection between the Random Forest algorithm theory and the prediction of students' potential academic performance. The objective is to develop an accurate and efficient method for managing student affairs at ABC University. The research employs both quantitative and qualitative approaches, with a focus on analyzing numerical data and generating classification predictions. The research process begins with a thorough analysis of the business situation to understand the university's environment and determine the research topic. The researcher then establishes research boundaries, prioritizes key issues, and constructs a research framework.

 

A comprehensive literature review and Focus Group Discussions are conducted to identify research gaps and determine the factors that influence student performance. Data are then collected and processed; the processing phase covers data cleaning, outlier removal, handling of missing values, and variable transformation. The data modeling stage employs the Random Forest algorithm together with k-fold cross-validation, which divides the data into training and testing sets. The evaluation stage assesses the model's performance on the testing data using metrics such as accuracy, precision, and recall.
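The cross-validation and evaluation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the macro averaging of precision and recall is an assumption (the abstract does not say how they are averaged), and `majority_class_model` is only a placeholder for where a Random Forest classifier would be plugged in.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class_model(train_X, train_y, test_X):
    """Placeholder classifier (assumption): always predicts the most common
    training label. A Random Forest would take its place in practice."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return [most_common for _ in test_X]


def evaluate(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall."""
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for c in classes:
        true_pos = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)
        actual = sum(t == c for t in y_true)
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)
    return accuracy, sum(precisions) / len(classes), sum(recalls) / len(classes)


def cross_validate(X, y, fit_predict, k=5, seed=0):
    """Hold out each fold once as the testing set and train on the rest."""
    folds = k_fold_indices(len(y), k, seed)
    y_true, y_pred = [], []
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        predictions = fit_predict(
            [X[j] for j in train_idx],
            [y[j] for j in train_idx],
            [X[j] for j in test_idx],
        )
        y_true.extend(y[j] for j in test_idx)
        y_pred.extend(predictions)
    return evaluate(y_true, y_pred)
```

Calling `cross_validate(X, y, majority_class_model, k=5)` returns an `(accuracy, precision, recall)` tuple pooled over all held-out folds. Passing the model as a function keeps the splitting and scoring logic independent of the classifier, so the same loop can be reused when comparing Random Forest against other techniques.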

 

This study investigates how long students at ABC University take to complete their studies, categorized as Fast (3 years), On Time (3.5-4 years), and Late (4.5-6 years). The dataset includes both students who have completed their studies and those who have not, and a specific treatment is applied to the latter. Factors that influence student performance are identified through focus group discussions and correlation testing, which together isolate the significant independent variables.
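The three categories above can be expressed as a small labeling function. This is a sketch under assumptions: durations are taken to be recorded in half-year steps, each category's upper bound is used as the cut-off, and the labels for students who have not completed or who exceed 6 years are hypothetical placeholders, since the abstract does not describe that treatment.

```python
def completion_category(years):
    """Map a study duration in years to the category names used above."""
    if years is None:
        # Placeholder label (assumption): the treatment of students who have
        # not completed their studies is not specified in the abstract.
        return "Not completed"
    if years <= 3.0:
        return "Fast"
    if years <= 4.0:
        return "On Time"
    if years <= 6.0:
        return "Late"
    # Durations beyond 6 years are not defined in the abstract (assumption).
    return "Unclassified"


# Example: label a handful of hypothetical study durations.
labels = [completion_category(y) for y in (3.0, 3.5, 4.0, 4.5, 6.0, None)]
```

The resulting labels would serve as the target classes that the classification model predicts.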

 

The study analyzes the profiles of students who graduated in 2016-2017, combined with their academic performance data. A regression test is conducted to determine the influence of 18 attributes on performance. The Random Forest algorithm is compared with other techniques to identify the most accurate predictive model for students' academic performance. For ABC University, the Random Forest model achieved a prediction rate of 89.60%.

Published
2022-10-20