TY - GEN
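T1 - A Study of Readability Prediction on Elementary and Secondary Chinese Textbooks and Excellent Extracurricular Reading Materials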
AU - Liu, Yi Nian
AU - Chen, Kuan Yu
AU - Tseng, Ho Chiang
AU - Chen, Berlin
N1 - Publisher Copyright:
© Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015.
PY - 2015/10/1
Y1 - 2015/10/1
AB - Readability concerns how well readers comprehend a given text: the higher a document's readability, the more easily it can be understood. Readability may be affected by various factors, such as document length, word difficulty, sentence structure, and whether the content matches the reader's prior knowledge. However, simple surface linguistic features cannot always account for these factors adequately. To address this, we explore a variety of additional features in this study, including syntactic analysis, parts of speech, word embeddings, semantic role features, and well-written features. The experimental datasets consist of two parts: Chinese-language textbooks for elementary and junior high schools (K1 to K9) in Taiwan, compiled from three publishers in the 2009 academic year, and excellent extracurricular reading materials for elementary and junior high school students, collected by the Ministry of Culture in Taiwan. Two readability prediction models, stepwise regression and a support vector machine, are evaluated and compared, and their combination is also investigated to further improve prediction accuracy. Experimental results show that the proposed approach consistently outperforms traditional approaches that rely only on simple surface linguistic features when evaluating text difficulty.
KW - Readability
KW - Stepwise Regression
KW - Support Vector Machine
KW - Textual Features
UR - http://www.scopus.com/inward/record.url?scp=85017197249&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85017197249&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85017197249
T3 - Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
SP - 71
EP - 86
BT - Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
A2 - Chen, Sin-Horng
A2 - Wang, Hsin-Min
A2 - Chien, Jen-Tzung
A2 - Kao, Hung-Yu
A2 - Chang, Wen-Whei
A2 - Wang, Yih-Ru
A2 - Wu, Shih-Hung
PB - The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
T2 - 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
Y2 - 1 October 2015 through 2 October 2015
ER -