Developing learner corpus annotation for Chinese grammatical errors

Lung Hao Lee, Li Ping Chang, Yuen-Hsien Tseng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,837 essays written by learners of Chinese from 46 different mother-tongue backgrounds. We propose hierarchical tag sets for manually annotating grammatical errors, which yielded 33,835 annotated inappropriate usages. The corpus has been provided for shared tasks on Chinese grammatical error diagnosis, which demonstrates the usability of our learner corpus annotation.

Original language: English
Title of host publication: Proceedings of the 2016 International Conference on Asian Language Processing, IALP 2016
Editors: Minghui Dong, Chung-Hsien Wu, Yanfeng Lu, Haizhou Li, Yuen-Hsien Tseng, Liang-Chih Yu, Lung-Hao Lee
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 254-257
Number of pages: 4
ISBN (Electronic): 9781509009213
DOIs: https://doi.org/10.1109/IALP.2016.7875980
Publication status: Published - 2017 Mar 10
Event: 20th International Conference on Asian Language Processing, IALP 2016 - Tainan, Taiwan
Duration: 2016 Nov 21 → 2016 Nov 23

Publication series

Name: Proceedings of the 2016 International Conference on Asian Language Processing, IALP 2016

Other

Other: 20th International Conference on Asian Language Processing, IALP 2016
Country: Taiwan
City: Tainan
Period: 16/11/21 → 16/11/23

Fingerprint

mother tongue
language
foreign language

Keywords

  • computer-assisted language learning
  • error schema
  • error tagging
  • grammatical error diagnosis
  • interlanguage
  • second language acquisition

ASJC Scopus subject areas

  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Linguistics and Language
  • Artificial Intelligence

Cite this

Lee, L. H., Chang, L. P., & Tseng, Y-H. (2017). Developing learner corpus annotation for Chinese grammatical errors. In M. Dong, C-H. Wu, Y. Lu, H. Li, Y-H. Tseng, L-C. Yu, & L-H. Lee (Eds.), Proceedings of the 2016 International Conference on Asian Language Processing, IALP 2016 (pp. 254-257). [7875980] (Proceedings of the 2016 International Conference on Asian Language Processing, IALP 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IALP.2016.7875980
