TY - JOUR
T1 - ChatGPT and L2 Chinese writing
T2 - evaluating the impact of model version and prompt language on automated corrective feedback
AU - Yang, Christine Ting Yu
AU - Chen, Howard Hao Jan
N1 - Publisher Copyright: © 2025 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2025
Y1 - 2025
N2 - The rapid emergence of generative artificial intelligence (GAI) models like ChatGPT has sparked significant interest in their application for language learning, particularly for second language (L2) writing. Given the urgent need for effective tools in Chinese grammar checking to assist L2 learners, this study evaluated the impact of both model version (ChatGPT-3.5 vs 4.0) and prompt language (Chinese vs English) on the effectiveness of automated corrective feedback (ACF) for L2 Chinese writing. Utilizing a dataset of 153 erroneous single-sentence examples from a Routledge-published textbook on Chinese, we assessed error corrections and corrective feedback generated by both ChatGPT versions under different language prompts. Three experienced language teachers evaluated the output corrections for grammaticality, fluency, minimal alterations, and over-correction, and the output feedback for correctness, understandability, and detail. Findings revealed that although both model versions produced grammatically correct and fluent corrections, ChatGPT-4.0 demonstrated superior performance in generating more accurate, detailed, and understandable corrective feedback compared to ChatGPT 3.5. The results suggest that model version significantly influences ChatGPT’s effectiveness as a multilingual ACF tool, more so than prompt language. This study highlights the potential of advanced GAI, such as ChatGPT-4.0, in enhancing language instruction and error correction for languages beyond English. It advocates for further research on the application of such models in diverse linguistic and educational contexts.
KW - Automated corrective feedback
KW - ChatGPT
KW - Chinese as a second language
KW - Generative artificial intelligence
KW - grammar errors
UR - http://www.scopus.com/inward/record.url?scp=85216569224&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85216569224&partnerID=8YFLogxK
U2 - 10.1080/09588221.2025.2453205
DO - 10.1080/09588221.2025.2453205
M3 - Article
AN - SCOPUS:85216569224
SN - 0958-8221
JO - Computer Assisted Language Learning
JF - Computer Assisted Language Learning
ER -