Constructing and validating readability models: the method of integrating multilevel linguistic features with machine learning

Yao Ting Sung*, Ju Ling Chen, Ji Her Cha, Hou Chiang Tseng, Tao Hsing Chang, Kuo En Chang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)

Abstract

Multilevel linguistic features have been proposed for discourse analysis, but there have been few applications of multilevel linguistic features to readability models and also few validations of such models. Most traditional readability formulae are based on generalized linear models (GLMs; e.g., discriminant analysis and multiple regression), but these models have to comply with certain statistical assumptions about data properties and include all of the data in formulae construction without pruning the outliers in advance. The use of such readability formulae tends to produce a low text classification accuracy, while using a support vector machine (SVM) in machine learning can enhance the classification outcome. The present study constructed readability models by integrating multilevel linguistic features with SVM, which is more appropriate for text classification. Taking the Chinese language as an example, this study developed 31 linguistic features as the predicting variables at the word, semantic, syntax, and cohesion levels, with grade levels of texts as the criterion variable. The study compared four types of readability models by integrating unilevel and multilevel linguistic features with GLMs and an SVM. The results indicate that adopting a multilevel approach in readability analysis provides a better representation of the complexities of both texts and the reading comprehension process.
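The contrast between unilevel and multilevel feature sets can be illustrated with a minimal sketch. It is not the study's implementation: the ten "texts", the two features (one word-level, one syntax-level), and the two-grade labels are invented for illustration, and the classifier is a simple binary linear SVM trained by batch sub-gradient descent on the hinge loss, using only the Python standard library. Accuracy is measured on the training data itself, so the numbers say nothing about validation performance.

```python
import math

# Toy data (invented for illustration): each text has one word-level and one
# syntax-level feature value; label -1 = lower grade, +1 = higher grade.
TEXTS = [
    ((1, 3), -1), ((2, 2), -1), ((3, 1), -1), ((2, 1), -1), ((1, 2), -1),
    ((3, 5), +1), ((4, 4), +1), ((5, 3), +1), ((5, 4), +1), ((4, 5), +1),
]

def standardize(rows):
    """Z-score each feature column so that features share a common scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=500):
    """Batch sub-gradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # margin violated: hinge loss is active
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def accuracy(X, y, w, b):
    """Share of texts whose predicted label matches the true label."""
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1
             for xi in X]
    return sum(p == yi for p, yi in zip(preds, y)) / len(y)

labels = [label for _, label in TEXTS]
multilevel = standardize([list(features) for features, _ in TEXTS])
unilevel = [[row[0]] for row in multilevel]  # word-level feature only

multi_acc = accuracy(multilevel, labels, *train_linear_svm(multilevel, labels))
uni_acc = accuracy(unilevel, labels, *train_linear_svm(unilevel, labels))
print(f"word-level only: {uni_acc:.2f}, word + syntax: {multi_acc:.2f}")
```

In this toy data, two texts share the same word-level value but belong to different grades, so no model restricted to that single feature can classify both correctly; adding the syntax-level feature makes the two grades linearly separable. A realistic replication would need the study's 31 features, multiple grade levels, and held-out validation.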

Original language: English
Pages (from-to): 340-354
Number of pages: 15
Journal: Behavior Research Methods
Volume: 47
Issue number: 2
Publication status: Published - 1 June 2015

ASJC Scopus subject areas

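  • Experimental and Cognitive Psychology
  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Psychology (miscellaneous)
  • General Psychology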
