Predicting Factors that Affect East Asian Students’ Reading Proficiency in PISA

Adeline Low - Multimedia University, Cyberjaya, 63100, Malaysia
Amy Lim - Multimedia University, Cyberjaya, 63100, Malaysia
Fang-Fang Chua - Multimedia University, Cyberjaya, 63100, Malaysia

Teachers, schools, and parents all help equip students with essential knowledge and skills during their years of education. As students approach the end of their education, a random sample of them is selected to take part in the Programme for International Student Assessment (PISA), which assesses their reading proficiency. Existing work on PISA achievement results concentrates on factors related to Parent, either alone or in combination with Student. Limited work has examined how factors related to Teacher and School affect students’ reading proficiency in PISA. This study identifies the Teacher- and/or School-related factors that affect East Asian students’ reading proficiency in PISA. East Asian students’ PISA results are chosen as the study domain because these students have consistently been the top performers in PISA over the past decade. Four classifiers are compared: Decision Tree (DT), Naïve Bayes (NB), K-Nearest Neighbors (KNN), and Random Forest (RF), with Hamming score as the evaluation metric. The results indicate that RF produces the best predictive models, with the highest Hamming score of 0.8427. Based on the findings, School-related factors such as the number of disciplinary cases at the school, the size of the school, the availability of computers with Internet access, and the quality and educational qualifications of teachers have a higher impact on PISA achievement results. The identified factors can serve as a reference for assessing a school’s current teaching and learning environment and for organizing extra activities as part of intervention programs to cultivate reading habits and enhance students’ reading abilities.


Data Mining; PISA; Reading Domain; Teacher.
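
The abstract names Hamming score as the evaluation metric but does not define it. As a rough illustration only, the sketch below assumes one common definition: the fraction of individual label positions predicted correctly across all samples (equivalently, 1 minus the Hamming loss). Some multi-label work instead uses a per-sample intersection-over-union average, so the study's exact formula may differ. The function name `hamming_score` and the toy labels are hypothetical and are not taken from the study.

```python
from typing import Sequence


def hamming_score(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> float:
    """Fraction of label positions predicted correctly (1 - Hamming loss).

    Assumption: this is one common definition; the study may use another.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of samples")
    total = 0
    correct = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each sample must have the same number of labels")
        total += len(true_row)
        # Count positions where the predicted label matches the true label.
        correct += sum(t == p for t, p in zip(true_row, pred_row))
    if total == 0:
        raise ValueError("no labels to score")
    return correct / total


# Hypothetical toy data: 3 samples with 3 binary labels each (illustration only).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
score = hamming_score(y_true, y_pred)
print(round(score, 4))
```

Under this definition a score of 1.0 means every label of every sample was predicted correctly, so a reported value such as 0.8427 would correspond to roughly 84% of label positions matching.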
