TY - GEN
T1 - The Impact of Feature Selection Algorithms on Readability Models
AU - Tai, Tsai Ning
AU - Tseng, Hou Chiang
AU - Sung, Yao Ting
N1 - Publisher Copyright: © 2023 ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Reading is one of the most important ways of acquiring knowledge. Researchers have pointed out that reading is most effective when the materials match the reader's level of difficulty. If the materials are too easy, readers usually acquire no new knowledge from them; if they are too difficult, they impose an excessive cognitive burden on readers and reduce learning effectiveness. Providing readers with materials of appropriate difficulty is therefore an important issue. To address it, many scholars have developed readability models and found that feature selection improves their accuracy. However, the interaction between feature selection algorithms and classifiers has received little attention in past studies. In this study, three feature selection algorithms (the Chi-squared test, ANOVA, and Mutual Information) were therefore combined with 25 classifiers to compare the accuracy of readability models for Chinese-language textbooks for grades 1-12. The experimental results identify the pairing of feature selection algorithm and classifier with the highest accuracy: using ANOVA for feature selection and LGBM as the classifier yields 48% accuracy and 73% adjacent accuracy while reducing the number of features by 85%.
KW - Chinese Readability
KW - Classifier
KW - Feature Selection
KW - Machine Learning
UR - http://www.scopus.com/inward/record.url?scp=85184838064&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184838064&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85184838064
T3 - ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
SP - 106
EP - 115
BT - ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
A2 - Wu, Jheng-Long
A2 - Su, Ming-Hsiang
A2 - Huang, Hen-Hsen
A2 - Tsao, Yu
A2 - Tseng, Hou-Chiang
A2 - Chang, Chia-Hui
A2 - Lee, Lung-Hao
A2 - Liao, Yuan-Fu
A2 - Ma, Wei-Yun
PB - The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
T2 - 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Y2 - 20 October 2023 through 21 October 2023
ER -