TY - GEN
T1 - Chinese text summarization using a trainable summarizer and latent semantic analysis
AU - Yeh, Jen Yuan
AU - Ke, Hao Ren
AU - Yang, Wei Pang
PY - 2002/1/1
Y1 - 2002/1/1
N2 - This paper proposes two novel approaches for extracting important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It introduces three new ideas: 1) employing ranked position to emphasize the significance of sentence position, 2) reshaping word units to achieve higher accuracy of keyword importance, and 3) training a score function with a genetic algorithm to obtain a suitable combination of feature weights. The second approach combines latent semantic analysis and text relationship maps to interpret the conceptual structures of a document. Both approaches are applied to Chinese text summarization. The two approaches were evaluated on a corpus of 100 articles about politics from New Taiwan Weekly; at a compression ratio of 30%, they achieved average recalls of 52.0% and 45.6%, respectively.
AB - This paper proposes two novel approaches for extracting important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It introduces three new ideas: 1) employing ranked position to emphasize the significance of sentence position, 2) reshaping word units to achieve higher accuracy of keyword importance, and 3) training a score function with a genetic algorithm to obtain a suitable combination of feature weights. The second approach combines latent semantic analysis and text relationship maps to interpret the conceptual structures of a document. Both approaches are applied to Chinese text summarization. The two approaches were evaluated on a corpus of 100 articles about politics from New Taiwan Weekly; at a compression ratio of 30%, they achieved average recalls of 52.0% and 45.6%, respectively.
UR - http://www.scopus.com/inward/record.url?scp=84949183406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84949183406&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84949183406
SN - 3540002618
SN - 9783540002611
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 76
EP - 87
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer Verlag
T2 - 5th International Conference on Asian Digital Libraries, ICADL 2002
Y2 - 11 December 2002 through 14 December 2002
ER -