Exploring the Potential of Using AI Language Models in Democratising Global Language Test Preparation

Authors

  • Amalia Novita Sari

DOI:

https://doi.org/10.54855/ijte.24447

Keywords:

Democratising language assessment, AI technology in language assessment, IELTS essay evaluation

Abstract

This paper examines the potential of AI language models for democratising global language test preparation, focusing on the accuracy and consistency of assessment in the context of IELTS essay writing. This quantitative study compares the assessment scores generated by a Human Examiner (HE) with those of four AI language models: ChatGPT, Google Bard, Writing9.com, and Upscore.ai. Agreement is evaluated using Mean Absolute Error (MAE) and Bland-Altman analysis. The findings reveal varying levels of accuracy, with Upscore.ai showing the lowest MAE at 0.5, followed by Google Bard at 0.85, ChatGPT at 0.9, and Writing9.com at 1.9. Bland-Altman plots visualise the agreement between each alternative evaluation system and the Human Examiner, showing how closely each aligns with human scoring. These results have implications for assisting IELTS test takers in their preparation and for advancing the democratisation of IELTS and global language assessment by using AI technology to provide more accessible evaluation methods. AI evaluation systems can support teaching and learning by providing automated feedback when human assistance is unavailable, helping students practise independently. However, the findings show that AI accuracy is not absolute and varies between models, so human involvement remains crucial for comprehensive evaluation.
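
For readers unfamiliar with the two agreement measures named in the abstract, the following is a minimal sketch of how they are commonly computed. MAE is the average of the absolute differences between paired scores, and a Bland-Altman analysis summarises the differences by their mean (the bias) and the limits of agreement, conventionally the bias plus or minus 1.96 standard deviations. The function names and the band scores below are hypothetical illustrations, not the study's code or data, and the study's exact procedure may differ.

```python
from statistics import mean, stdev


def mean_absolute_error(human, system):
    """Average absolute gap between paired scores."""
    return mean([abs(h - s) for h, s in zip(human, system)])


def bland_altman(human, system):
    """Return (bias, lower limit, upper limit) of agreement.

    Differences are taken as system minus human, so a positive bias
    means the system scores higher than the human examiner on average.
    """
    diffs = [s - h for h, s in zip(human, system)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation
    return bias, bias - spread, bias + spread


# Hypothetical band scores for five essays (NOT data from the study).
human_scores = [6.0, 6.5, 7.0, 5.5, 6.0]
system_scores = [6.5, 6.0, 7.0, 6.5, 6.0]

mae = mean_absolute_error(human_scores, system_scores)
bias, lower, upper = bland_altman(human_scores, system_scores)

print(f"MAE = {mae:.2f}")
print(f"Bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A lower MAE indicates scores closer to the human examiner's on average, while the limits of agreement indicate how widely individual differences are expected to scatter around the bias.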

Author Biography

  • Amalia Novita Sari, Sagara Abhipraya Edu Lab, Tangerang Selatan, Indonesia

    Amalia N. Sari is the founder and director of Sagara Abhipraya (SA) Edu Lab, a private education institution focusing on language education and training in Tangerang Selatan, Indonesia. She is currently a PhD student at the University of Queensland. Her research focuses on policy in language education, specifically in assessment practices.

Published

01-11-2024

Section

Research Article

How to Cite

Sari, A. N. (2024). Exploring the Potential of Using AI Language Models in Democratising Global Language Test Preparation. International Journal of TESOL & Education, 4(4), 111-126. https://doi.org/10.54855/ijte.24447
