Leveling L2 Texts Through Readability: Combining Multilevel Linguistic Features with the CEFR

Yao Ting Sung, Wei Chun Lin, Scott Benjamin Dyson, Kuo En Chang, Yu Chia Chen

Research output: Contribution to journal › Article › peer-review

50 Citations (Scopus)


Selecting appropriate texts for L2 (second/foreign language) learners is an important approach to enhancing motivation and, by extension, learning. There is currently no tool for classifying foreign language texts according to a language proficiency framework, which makes it difficult for students and educators to determine the precise difficulty/complexity levels of an unclassified text. Taking the Chinese language as an example, this study aimed to create a readability assessment system, called the Chinese Readability Index Explorer for Chinese as a Foreign Language (CRIE-CFL), in order to level (that is, to sort by proficiency level) texts that will be used for instructional purposes. The framework of choice in this project is the Common European Framework of Reference (CEFR). A team of expert CFL teachers first classified 1,578 CFL texts into their appropriate CEFR levels. A set of 30 CFL readability features was then developed or drawn from previous research, and sorted according to importance using F-scores. In addition, a support vector machine model was trained by sequentially integrating the features into the model to optimize accuracy. The empirical evaluation of CRIE-CFL revealed average exact- and adjacent-level accuracies of 74.97% and 99.62%, respectively, for predicting the expert classification of a text. The functionalities of CRIE-CFL are introduced and discussed.
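The evaluation terms in the abstract can be made concrete with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the six-level CEFR scale A1 to C2 is used, "adjacent-level" accuracy counts a prediction as correct when it falls within one level of the expert label, and "sequentially integrating" features means adding them one at a time in ranked order and keeping the best-scoring prefix. The names `level_accuracies`, `integrate_features`, and the `evaluate` callback (a stand-in for training and validating the support vector machine) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: the standard six-level CEFR scale, ordered from lowest to highest.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def level_accuracies(expert: Sequence[str], predicted: Sequence[str]) -> Tuple[float, float]:
    """Return (exact, adjacent) accuracy as fractions of the texts.

    Exact: the predicted level equals the expert level.
    Adjacent (assumed definition): the prediction is at most one level away.
    """
    if not expert or len(expert) != len(predicted):
        raise ValueError("expert and predicted must be non-empty and equal in length")
    index = {level: i for i, level in enumerate(CEFR_LEVELS)}
    exact = adjacent = 0
    for e, p in zip(expert, predicted):
        gap = abs(index[e] - index[p])
        exact += gap == 0
        adjacent += gap <= 1
    n = len(expert)
    return exact / n, adjacent / n


def integrate_features(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Add features one at a time in ranked order; keep the best-scoring prefix.

    `evaluate` is a hypothetical callback that would train a classifier on the
    given features and return its validation accuracy.
    """
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        score = evaluate(ranked[:k])
        if score > best_score:
            best_k, best_score = k, score
    return list(ranked[:best_k]), best_score


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    expert = ["A1", "A2", "B1", "C1"]
    predicted = ["A1", "B1", "B1", "A2"]
    print(level_accuracies(expert, predicted))  # (0.5, 0.75)
```

In this toy run, two of the four predictions match exactly and a third is one level off, so the adjacent-level figure is higher than the exact one. The same relationship holds between the 99.62% and 74.97% reported above, provided the study defines adjacent accuracy the same way.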

Original language: English
Pages (from-to): 371-391
Number of pages: 21
Journal: Modern Language Journal
Issue number: 2
Publication status: Published - 2015 Jun 1


Keywords

  • CEFR
  • Leveling
  • Mandarin Chinese
  • Readability

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

