Constructing a novel Chinese readability classification model using principal component analysis and genetic programming

Yi Shian Lee, Hou Chiang Tseng, Ju Ling Chen, Chun Yi Peng, Tao Hsing Chang, Yao-Ting Sung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Readability studies aim to measure the difficulty of a text. Although traditional formulae such as the Flesch-Kincaid formula can predict text readability well, they are effective only for English text. Other formulae that rely on very few features may classify texts inaccurately. This study takes multiple linguistic features into account and attempts to increase classification accuracy by adopting a new model that integrates Principal Component Analysis (PCA) with Genetic Programming (GP). Empirical data are used to demonstrate the performance of the proposed model.
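For context, the Flesch-Kincaid grade level named in the abstract is a linear formula over just two features: average sentence length in words and average syllables per word. The sketch below is only an illustration of that standard English formula, not the paper's PCA and GP model. The function name and the counts in the usage line are hypothetical.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return the Flesch-Kincaid grade level from raw counts.

    Uses the standard coefficients for English text:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts for a short English passage (not data from the paper).
grade = flesch_kincaid_grade(total_words=100, total_sentences=5, total_syllables=150)
print(round(grade, 2))
```

Because the score depends on only two surface counts, it shows the kind of few-feature formula that the abstract contrasts with a model built on multiple linguistic features.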

Original language: English
Title of host publication: Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012
Pages: 164-166
Number of pages: 3
DOI: https://doi.org/10.1109/ICALT.2012.134
Publication status: Published - 2012 Oct 8
Event: 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012 - Rome, Italy
Duration: 2012 Jul 4 to 2012 Jul 6

Publication series

Name: Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012

Other

Other: 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012
Country: Italy
City: Rome
Period: 12/7/4 to 12/7/6

Keywords

  • Genetic programming
  • Principal component analysis
  • Readability
  • Text analysis component

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Education

Cite this

Lee, Y. S., Tseng, H. C., Chen, J. L., Peng, C. Y., Chang, T. H., & Sung, Y-T. (2012). Constructing a novel Chinese readability classification model using principal component analysis and genetic programming. In Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012 (pp. 164-166). [6268065] (Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012). https://doi.org/10.1109/ICALT.2012.134

@inproceedings{288974564ee44271851b2f8010fc6e56,
title = "Constructing a novel Chinese readability classification model using principal component analysis and genetic programming",
abstract = "The studies of readability aim to measure the level of text difficulty. Although traditional formulae such as the Flesch-Kincaid formula can properly predict text readability, they are only effective for English text. Other formulae with very few features may result in inaccurate text classification. The study takes into account multiple linguistic features, and attempts to increase the level of accuracy in text classification by adopting a new model which integrates Principal Component Analysis (PCA) with Genetic Programming (GP). Empirical data are utilized to demonstrate the performance of the proposed model.",
keywords = "Genetic programming, Principal component analysis, Readability, Text analysis component",
author = "Lee, {Yi Shian} and Tseng, {Hou Chiang} and Chen, {Ju Ling} and Peng, {Chun Yi} and Chang, {Tao Hsing} and Sung, Yao-Ting",
year = "2012",
month = "10",
day = "8",
doi = "10.1109/ICALT.2012.134",
language = "English",
isbn = "9780769547022",
series = "Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012",
pages = "164--166",
booktitle = "Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012",

}

TY - GEN
T1 - Constructing a novel Chinese readability classification model using principal component analysis and genetic programming
AU - Lee, Yi Shian
AU - Tseng, Hou Chiang
AU - Chen, Ju Ling
AU - Peng, Chun Yi
AU - Chang, Tao Hsing
AU - Sung, Yao-Ting
PY - 2012/10/8
Y1 - 2012/10/8
AB - The studies of readability aim to measure the level of text difficulty. Although traditional formulae such as the Flesch-Kincaid formula can properly predict text readability, they are only effective for English text. Other formulae with very few features may result in inaccurate text classification. The study takes into account multiple linguistic features, and attempts to increase the level of accuracy in text classification by adopting a new model which integrates Principal Component Analysis (PCA) with Genetic Programming (GP). Empirical data are utilized to demonstrate the performance of the proposed model.
KW - Genetic programming
KW - Principal component analysis
KW - Readability
KW - Text analysis component
UR - http://www.scopus.com/inward/record.url?scp=84867026124&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867026124&partnerID=8YFLogxK
U2 - 10.1109/ICALT.2012.134
DO - 10.1109/ICALT.2012.134
M3 - Conference contribution
AN - SCOPUS:84867026124
SN - 9780769547022
T3 - Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012
SP - 164
EP - 166
BT - Proceedings of the 12th IEEE International Conference on Advanced Learning Technologies, ICALT 2012
ER -