Early Dropout Prediction in Online Learning of University using Machine Learning

Hee Sun Park - Department of Computer Science, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul, 05006, South Korea
Seong Joon Yoo - Department of Computer Science, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul, 05006, South Korea

DOI: http://dx.doi.org/10.30630/joiv.5.4.732


Most universities now offer online courses or plan to, yet dropout from online learning remains a problem for them. Online learning lets students study anytime and anywhere, but its dropout rate is higher than that of offline classes because learners must manage their own study time without the help of a professor or manager. Timely support from professors and managers is therefore important for reducing the risk of dropout from university online classes. This study uses access log data recorded in a Learning Management System (LMS), together with learner statistical information and derived data, and aims to identify predictive algorithms suitable for an early dropout prediction system for university online learning. To overcome the limited data volume of existing studies, it draws on seven years of online learning history logs recorded in the Cyber University LMS and predicts the risk of dropout during the learning period. The features used are learner statistical information, number of system connections, number of lectures, and previous-semester grades; the applicability of the predictive models was validated by applying machine-learning-based decision tree, random forest (RF), and support vector machine (SVM) classifiers and deep learning (DNN). The results show that the random forest (RF) algorithm gives the best prediction performance and that deep learning algorithms are also applicable to learning management systems (LMS).


Dropout prediction; online learning; machine learning; deep learning.
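
The abstract mentions deriving features such as the number of system connections and the number of lectures from LMS access logs, restricted to an early part of the learning period. A minimal sketch of that kind of aggregation is shown below. The record layout, the event names ("login", "lecture"), the sample dates, and the 28-day cutoff are all illustrative assumptions, not the schema or settings used in the study.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log records: (student_id, event_type, event_date).
# Field names, event types, and values are made up for illustration.
ACCESS_LOG = [
    ("s01", "login", date(2021, 3, 2)),
    ("s01", "lecture", date(2021, 3, 2)),
    ("s01", "lecture", date(2021, 3, 9)),
    ("s02", "login", date(2021, 3, 3)),
    ("s02", "login", date(2021, 4, 20)),  # falls after the cutoff, so it is skipped
]


def early_features(log, term_start, cutoff_days):
    """Count logins and lecture views per student in the first cutoff_days of a term."""
    features = defaultdict(lambda: {"connections": 0, "lectures": 0})
    for student_id, event_type, event_date in log:
        elapsed = (event_date - term_start).days
        if not 0 <= elapsed < cutoff_days:
            continue  # keep only events inside the early observation window
        if event_type == "login":
            features[student_id]["connections"] += 1
        elif event_type == "lecture":
            features[student_id]["lectures"] += 1
    return dict(features)


# Assumed term start and a 28-day early window.
features = early_features(ACCESS_LOG, term_start=date(2021, 3, 1), cutoff_days=28)
print(features)
```

Per-student counts like these could then be joined with learner statistical information and previous-semester grades before being passed to a classifier such as a decision tree, random forest, SVM, or DNN.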

