A Preliminary Study on Chinese Learners’ Written Errors Based on an Error-Tagged Learner Corpus

Ting Yu Yang*, Hui Mei Yang, Wei Jei Lee, Chen Yu Liu, Howard Hao Jan Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter


As technology has developed, the compilation of computer-based learner corpora has attracted growing attention from language teachers and researchers. A learner corpus can reflect learners’ authentic use of a target language, which provides useful information for language teachers, researchers, and textbook editors. Limitations in retrieving errors from learner corpora, however, still exist. For example, it is difficult to retrieve omission errors if a corpus is not error-tagged beforehand. To offer researchers an error-tagged learner corpus of Chinese, this study manually error-tagged the two-million-word Chinese Learner Written Corpus of National Taiwan Normal University. A preliminary analysis of the errors tagged in the learner corpus shows a total of 48,266 errors distributed across 119 tags. Most of these errors involve the incorrect selection of words or the omission of necessary word-level components, and the misuse of nouns, action verbs, adverbs, and structural particles is especially common. Among the 119 tags, the top 12 common error tags (i.e., those occurring more than 1,000 times) accounted for more than 50% of the total errors, and incorrect selections of nouns and action verbs together constituted more than 27% of the total errors. These 12 common error types, especially the wrong choice of nouns and action verbs, should thus be regarded as particularly difficult for second language (L2) learners of Chinese to acquire. Analysis of the top 12 common errors also reveals that learners’ misuse of verbs, adverbs, and structural particles was somewhat varied (i.e., involving different types of target modification, such as missing, redundant, and incorrect selection), whereas their misuse of nouns mostly resulted from incorrect selection. A comparison of the top 10 common error types in this study with those in Lee et al. (2016) reveals that, despite some discrepancies in ranking, 90% of the top 10 error tags overlapped in the two studies, suggesting that these error types are indeed difficult for L2 Chinese learners to acquire and should be investigated further. Based on the findings of this study, suggestions for further research on L2 Chinese learners’ errors are provided.
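The kind of tally described in the abstract (errors per tag, and the share of all errors covered by the most frequent tags) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the inline `<err tag="...">` markup, the tag names, and the three toy sentences are hypothetical and are not the corpus's actual annotation scheme, its 119 tags, or its data. The empty span in the second sentence shows why prior tagging matters for omission errors: there is no surface string to search for, only the tag.

```python
import re
from collections import Counter

# Hypothetical inline markup: <err tag="TAG">erroneous span</err>.
# An omission error is marked with an empty span at the position
# where the missing component should have appeared.
TAG_PATTERN = re.compile(r'<err tag="([^"]+)">(.*?)</err>')


def count_error_tags(text):
    """Return a Counter mapping each error tag to its frequency."""
    return Counter(tag for tag, _span in TAG_PATTERN.findall(text))


def cumulative_share(counts, top_n):
    """Return the fraction of all errors covered by the top_n most frequent tags."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _tag, n in counts.most_common(top_n))
    return top / total


# Toy sentences invented for this sketch (not taken from the corpus).
sample = (
    '我昨天<err tag="wrong_verb">看</err>音乐。'
    '他跑<err tag="missing_particle"></err>很快。'
    '我喜欢<err tag="wrong_verb">玩</err>篮球。'
)

counts = count_error_tags(sample)
for tag, n in counts.most_common():
    print(tag, n)
print(f"share of top 1 tag: {cumulative_share(counts, 1):.2f}")
```

Applied to a full tagged corpus, the same two functions would yield the per-tag frequencies and the cumulative proportion covered by the top 10 or top 12 tags, assuming the corpus is first exported to whatever markup the pattern expects.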

Original language: English
Title of host publication: Chinese Language Learning Sciences
Number of pages: 17
Publication status: Published - 2023

Publication series

Name: Chinese Language Learning Sciences
ISSN (Print): 2520-1719
ISSN (Electronic): 2520-1727


Keywords

  • Chinese teaching
  • Error analysis
  • Error-tagging
  • Learner corpus

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Linguistics and Language
  • Computer Science Applications

