Chinese text summarization using a trainable summarizer and latent semantic analysis

Jen Yuan Yeh, Hao Ren Ke, Wei Pang Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

20 Citations (Scopus)

Abstract

In this paper, two novel approaches are proposed to extract important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It brings up three new ideas: 1) to employ ranked position to emphasize the significance of sentence position, 2) to reshape word unit to achieve higher accuracy of keyword importance, and 3) to train a score function by the genetic algorithm for obtaining a suitable combination of feature weights. The second approach combines the ideas of latent semantic analysis and text relationship maps to interpret conceptual structures of a document. Both approaches are applied to Chinese text summarization. The two approaches were evaluated by using a data corpus composed of 100 articles about politics from New Taiwan Weekly, and when the compression ratio was 30%, average recalls of 52.0% and 45.6% were achieved respectively.
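
The evaluation setting in the abstract (a 30% compression ratio, scored by recall) can be illustrated with a minimal sketch. It assumes the compression ratio is measured in sentences and that recall is the share of reference-summary sentences the system also selects; the abstract does not spell out either definition. The function names, sentence scores, and reference indices are hypothetical illustrations, not data or code from the paper.

```python
def extract_summary(scores, compression_ratio=0.3):
    """Pick the top-scoring sentences and return their indices in document order.

    Assumption: the summary length is compression_ratio * number of sentences,
    rounded to the nearest integer, with at least one sentence kept.
    """
    k = max(1, round(len(scores) * compression_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def recall(selected, reference):
    """Fraction of reference-summary sentences that were also selected."""
    reference = set(reference)
    if not reference:
        return 0.0
    return len(reference & set(selected)) / len(reference)


# Hypothetical per-sentence scores for a 10-sentence document.
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.3, 0.7, 0.05, 0.6, 0.15]
selected = extract_summary(scores, compression_ratio=0.3)

# Hypothetical human-chosen reference summary (sentence indices).
reference = [0, 3, 8]
print(selected, recall(selected, reference))
```

In this toy case the extractor keeps three of ten sentences and matches two of the three reference sentences. How each sentence score is produced (weighted features tuned by a genetic algorithm, or latent semantic analysis with a text relationship map) is left outside the sketch.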

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 76-87
Number of pages: 12
ISBN (Print): 3540002618, 9783540002611
Publication status: Published - 2002 Jan 1
Event: 5th International Conference on Asian Digital Libraries, ICADL 2002 - Singapore, Singapore
Duration: 2002 Dec 11 to 2002 Dec 14

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2555
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Asian Digital Libraries, ICADL 2002
Country: Singapore
City: Singapore
Period: 02/12/11 to 02/12/14

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Yeh, J. Y., Ke, H. R., & Yang, W. P. (2002). Chinese text summarization using a trainable summarizer and latent semantic analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 76-87). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2555). Springer Verlag.