Exploring the Potential of Using AI Language Models in Democratising Global Language Test Preparation
DOI:
https://doi.org/10.54855/ijte.24447
Keywords:
Democratising language assessment, AI technology in language assessment, IELTS essay evaluation
Abstract
This paper examines the potential of AI language models for democratising global language test preparation, focusing on the accuracy and consistency of assessment of IELTS writing essays. This quantitative study compares the scores assigned by a Human Examiner (HE) with those generated by four AI evaluation systems: ChatGPT, Google Bard, Writing9.com, and Upscore.ai. Agreement is evaluated using Mean Absolute Error (MAE) and Bland-Altman analysis. The findings reveal varying levels of accuracy: Upscore.ai has the lowest MAE at 0.5, followed by Google Bard at 0.85, ChatGPT at 0.9, and Writing9.com at 1.9. Bland-Altman plots visualise the agreement between each alternative evaluation system and the Human Examiner. These results have implications for assisting IELTS test takers in their preparation and for advancing the democratisation of IELTS and global language assessment, since AI technology can make evaluation more accessible. AI evaluation systems can support teaching and learning by providing automated feedback when human assistance is unavailable, helping students practise independently. However, the findings show that AI accuracy is not absolute and varies between systems, so human involvement remains crucial for comprehensive evaluation.
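The two agreement measures named in the abstract can be illustrated with a minimal Python sketch. The band scores below are invented for illustration and are not data from the study. The function names are hypothetical, and the 1.96 multiplier is the conventional choice for 95% limits of agreement, which the abstract does not confirm as the study's setting.

```python
from statistics import mean, stdev


def mean_absolute_error(reference, candidate):
    """Average absolute gap between reference and candidate scores."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("score lists must be non-empty and of equal length")
    return mean(abs(r - c) for r, c in zip(reference, candidate))


def bland_altman(reference, candidate, z=1.96):
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean of (candidate - reference); the limits are
    bias +/- z * sample standard deviation of those differences.
    """
    if len(reference) != len(candidate) or len(reference) < 2:
        raise ValueError("need at least two paired scores")
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical IELTS band scores for five essays (illustration only).
human_scores = [6.0, 6.5, 7.0, 5.5, 6.0]
ai_scores = [6.5, 6.0, 7.0, 6.5, 6.5]

mae = mean_absolute_error(human_scores, ai_scores)
bias, lower, upper = bland_altman(human_scores, ai_scores)

print(f"Mean absolute error: {mae:.2f}")
print(f"Bias: {bias:.2f}, limits of agreement: [{lower:.2f}, {upper:.2f}]")
```

The mean absolute error summarises how far the scores differ on average regardless of direction, while the Bland-Altman bias shows whether the AI system tends to score higher or lower than the human examiner. A low error with a bias far from zero would point to a consistent offset rather than random disagreement.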
License
Copyright (c) 2024 Amalia Novita Sari
This work is licensed under a Creative Commons Attribution 4.0 International License.
The copyright of all articles published in the International Journal of TESOL & Education (ijte) remains with the Authors, i.e. Authors retain full ownership of their article. Permitted third-party reuse of the open access articles is defined by the applicable Creative Commons (CC) end-user license which is accepted by the Authors upon submission of their paper. All articles in the ijte are published under the CC BY-NC 4.0 license, meaning that end users can freely share an article (i.e. copy and redistribute the material in any medium or format) and adapt it (i.e. remix, transform and build upon the material) on the condition that proper attribution is given (i.e. appropriate credit, a link to the applicable license and an indication if any changes were made; all in such a way that does not suggest that the licensor endorses the user or the use) and the material is only used for non-commercial purposes.
Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository, in a journal or publish it in a book), with an acknowledgment of its initial publication in this journal.