Abstract
Multilevel linguistic features have been proposed for discourse analysis, but there have been few applications of multilevel linguistic features to readability models and also few validations of such models. Most traditional readability formulae are based on generalized linear models (GLMs; e.g., discriminant analysis and multiple regression), but these models have to comply with certain statistical assumptions about data properties and include all of the data in formulae construction without pruning the outliers in advance. The use of such readability formulae tends to produce a low text classification accuracy, while using a support vector machine (SVM) in machine learning can enhance the classification outcome. The present study constructed readability models by integrating multilevel linguistic features with SVM, which is more appropriate for text classification. Taking the Chinese language as an example, this study developed 31 linguistic features as the predicting variables at the word, semantic, syntax, and cohesion levels, with grade levels of texts as the criterion variable. The study compared four types of readability models by integrating unilevel and multilevel linguistic features with GLMs and an SVM. The results indicate that adopting a multilevel approach in readability analysis provides a better representation of the complexities of both texts and the reading comprehension process.
Original language | English |
---|---|
Pages (from-to) | 340-354 |
Number of pages | 15 |
Journal | Behavior Research Methods |
Volume | 47 |
Issue number | 2 |
DOIs | |
Publication status | Published - Jun 1 2015 |
ASJC Scopus subject areas
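- Experimental and Cognitive Psychology
- Developmental and Educational Psychology
- Arts and Humanities (miscellaneous)
- Psychology (miscellaneous)
- General Psychology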