Exploring the Capabilities of GPT Models in Drafting Course Assessments Based on Bloom’s Taxonomy

Gilang Muhamad - King Abdulaziz University, Jeddah, 21589, Kingdom of Saudi Arabia
Bassma Alsulami - King Abdulaziz University, Jeddah, 21589, Kingdom of Saudi Arabia
Khalid Thabit - King Abdulaziz University, Jeddah, 21589, Kingdom of Saudi Arabia


DOI: http://dx.doi.org/10.62527/joiv.9.1.2811

Abstract


Generative Pre-trained Transformer (GPT) models, specifically GPT-3.5-turbo, GPT-4, and GPT-4o, can play an essential role in automating the drafting of course assessments based on Bloom’s Taxonomy. This study therefore explored the interaction between Artificial Intelligence (AI) models and educational content, using refined prompt engineering methods to enhance the accuracy and relevance of the generated questions. In the investigation, 146 Course Learning Outcomes (CLOs) were processed through each model using the OpenAI Application Programming Interface (API), and the performance of each model was assessed with the metrics Accuracy, Precision, Recall, and F1 Score. The results showed that GPT-4 was best suited to complex course assessments, delivering detailed and precise responses; GPT-3.5-turbo offered a cost-effective solution for generating simpler assessments; and GPT-4o provided a middle ground, balancing cost and performance. These findings highlight the potential of AI to reduce the administrative burden on instructors by streamlining the creation and refinement of course assessments. Automation also facilitated the enhancement of assessments, supporting more adaptive questions. Broader AI integration into educational practice promises a transformative impact on traditional assessment-drafting methods, enabling more dynamic educational experiences. Further studies are recommended to explore the ethical dimensions of AI in education, the models’ ability to handle diverse tasks, and the long-term impacts on learning outcomes and educational equity.
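As a hedged illustration of the workflow described above, a CLO can be turned into a Bloom’s-Taxonomy-aligned exam question through the OpenAI API roughly as sketched below. The prompt wording, the `bloom_level` argument, and the default model choice are assumptions for illustration, not the study’s exact configuration.

```python
# Sketch of generating a Bloom's-Taxonomy-aligned question from a CLO
# via the OpenAI API. Prompt text and parameters are illustrative
# assumptions, not the study's exact setup.

def build_prompt(clo: str, bloom_level: str) -> str:
    """Compose an instruction asking the model for one exam question
    targeting a specific Bloom's Taxonomy level."""
    return (
        f"You are drafting a course assessment. For the course learning "
        f"outcome below, write one exam question at the '{bloom_level}' "
        f"level of Bloom's Taxonomy. Return only the question.\n\n"
        f"CLO: {clo}"
    )

def generate_question(client, clo: str, bloom_level: str,
                      model: str = "gpt-4") -> str:
    """Send the prompt to the chosen GPT model and return its reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": build_prompt(clo, bloom_level)}],
    )
    return response.choices[0].message.content

# Usage (requires an OpenAI API key):
#   from openai import OpenAI
#   client = OpenAI()
#   q = generate_question(client,
#                         "Explain the phases of the software life cycle.",
#                         "Analyze")
```

The same loop would simply be repeated over all 146 CLOs and each of the three models to collect the outputs for evaluation.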
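The evaluation metrics named in the abstract can be computed from binary classification counts. The pure-Python sketch below assumes a simple binary labelling of each generated question (acceptable or not); the study’s actual labelling scheme may differ.

```python
# Minimal computation of Accuracy, Precision, Recall, and F1 Score from
# true vs. predicted binary labels (1 = acceptable question, 0 = not).
# The binary labelling is an illustrative assumption.

def classification_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Example: 4 of 5 predictions agree with the human judgement.
m = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
# m["accuracy"] == 0.8, m["precision"] == 1.0
```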

Keywords


GPT Models; Prompt Engineering; Bloom’s Taxonomy; Educational Technology

