Predicting Student's Soft Skills Based on Socio-Economical Factors: An Educational Data Mining Approach

Rathimala Kannan - PSG Institute of Management, PSG College of Technology, Coimbatore, India
Chew Chin Jet - Multimedia University, Cyberjaya, 63100, Malaysia
Kannan Ramakrishnan - Multimedia University, Cyberjaya, 63100, Malaysia
Sujatha Ramdass - PSG College of Technology, Coimbatore, Tamil Nādu, India




Recent changes in the labor market and the higher education sector have made graduate employability a priority for researchers, governments, and employers in developed and emerging nations. However, there is still a dearth of research on whether higher education gives graduates the employability skills that businesses expect of them. Evaluating a student's soft skills is critical to anticipating their future employment and career path. Educational data mining (EDM) is an emerging field that aims to extract explicit, specific information from the large volumes of academic data that educational institutions produce and maintain. This paper aims to predict five student soft skills (professional, analytical, linguistic, communication, and ethical) from socio-economic, academic, and institutional data, using data mining methods and machine learning techniques. Prediction models for all five soft skills were built with linear regression, probabilistic neural networks, and simple regression trees. The study used an open dataset published by Universidad Tecnológica de Bolívar, covering academic, social, and economic data for 12,411 students. The experimental results showed that linear regression outperformed the other two methods in predicting all five soft skills. This finding can help higher education institutions make informed decisions, provide tailored support, enhance student success and employability, and continuously adapt their programs to students' needs.
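The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data below are synthetic, the single predictor and target are hypothetical stand-ins for one input variable and one soft-skill score, and a depth-1 regression tree stands in for the simple regression tree. The probabilistic neural network is omitted. Because the synthetic target is linear by construction, the outcome here says nothing about the paper's results; the sketch only shows how two regression models can be compared on held-out error.

```python
import random
from statistics import mean


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor: y ~ a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_stump(xs, ys):
    """Depth-1 regression tree: choose the split with the lowest squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, ml, mr)
    _, threshold, ml, mr = best
    return lambda x: ml if x < threshold else mr


def rmse(model, xs, ys):
    """Root mean squared error of a fitted model on (xs, ys)."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


# Synthetic data (assumption): a hypothetical academic score in [0, 100]
# and a soft-skill score that depends on it linearly, plus Gaussian noise.
random.seed(0)
xs = [random.uniform(0, 100) for _ in range(200)]
ys = [20 + 0.6 * x + random.gauss(0, 5) for x in xs]

# Hold out the last 50 records for evaluation.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

rmse_linear = rmse(fit_linear(train_x, train_y), test_x, test_y)
rmse_stump = rmse(fit_stump(train_x, train_y), test_x, test_y)
print(f"linear regression RMSE: {rmse_linear:.2f}")
print(f"regression stump RMSE:  {rmse_stump:.2f}")
```

In a setting like the one the abstract describes, the same comparison would be repeated once per soft skill, with many predictors instead of one.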


Prediction; machine learning; regression models; soft skills; higher education institutes.







Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JOIV : International Journal on Informatics Visualization
ISSN 2549-9610  (print) | 2549-9904 (online)
Organized by Society of Visual Informatics, and Institute of Visual Informatics - UKM and Soft Computing and Data Mining Centre - UTHM
