Toward generic title generation for clustered documents

Yuen Hsien Tseng*, Chi Jen Lin, Hsiu Han Chen, Yu I. Lin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

24 Citations (Scopus)

Abstract

A cluster labeling algorithm that creates generic titles from external resources such as WordNet is proposed. Our method first extracts category-specific terms as cluster descriptors. These descriptors are then mapped to generic terms by a hypernym search algorithm. The proposed method has been evaluated on a patent document collection and a subset of the Reuters-21578 collection. Experimental results show that our method performs as anticipated. Real-case applications of these generic terms show promise in assisting humans in interpreting the clustered topics. Our method is general enough that it can easily be extended to use other hierarchical resources for adaptable label generation.
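The two steps named in the abstract, descriptor extraction and hypernym search, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `HYPERNYMS` map stands in for WordNet (which has multiple senses and multiple parents per term), the 2x2 correlation score is one common way to rank category-specific terms, and the rule of picking the ancestor that covers the most descriptors at the smallest total distance is a simplification, not the authors' hypernym search algorithm.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy hierarchy (child -> parent); a stand-in for WordNet.
HYPERNYMS = {
    "wheat": "cereal", "corn": "cereal", "rice": "cereal",
    "cereal": "food", "coffee": "beverage", "beverage": "food",
    "food": "entity", "gold": "metal", "metal": "entity",
}

def correlation(tp, fp, fn, tn):
    """Correlation between term presence and cluster membership (2x2 table)."""
    denom = sqrt((tp + fn) * (fp + tn) * (tp + fp) * (fn + tn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def cluster_descriptors(docs, labels, cluster, top_k=3):
    """Rank terms by correlation with `cluster`; docs are sets of terms."""
    vocab = set().union(*docs)
    scored = []
    for term in vocab:
        tp = sum(1 for d, l in zip(docs, labels) if l == cluster and term in d)
        fp = sum(1 for d, l in zip(docs, labels) if l != cluster and term in d)
        fn = sum(1 for d, l in zip(docs, labels) if l == cluster and term not in d)
        tn = len(docs) - tp - fp - fn
        scored.append((correlation(tp, fp, fn, tn), term))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_k]]

def ancestors(term):
    """Map each hypernym of `term` to its distance from `term`."""
    found, distance = {}, 0
    while term in HYPERNYMS:
        term = HYPERNYMS[term]
        distance += 1
        found[term] = distance
    return found

def generic_title(descriptors):
    """Pick the ancestor covering the most descriptors, preferring nearer ones."""
    coverage, total_distance = Counter(), Counter()
    for descriptor in descriptors:
        for ancestor, distance in ancestors(descriptor).items():
            coverage[ancestor] += 1
            total_distance[ancestor] += distance
    if not coverage:
        return None
    return min(coverage, key=lambda a: (-coverage[a], total_distance[a], a))

docs = [{"wheat", "export"}, {"corn", "wheat"}, {"rice", "corn"},
        {"gold", "mine"}, {"gold", "export"}]
labels = [0, 0, 0, 1, 1]
descriptors = cluster_descriptors(docs, labels, cluster=0)
print(descriptors, "->", generic_title(descriptors))
```

Preferring the smallest total distance keeps the label as specific as the descriptors allow: without it, a root term such as `entity` would cover every descriptor and win trivially.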

Original language: English
Title of host publication: Information Retrieval Technology - Third Asia Information Retrieval Symposium, AIRS 2006, Proceedings
Publisher: Springer Verlag
Pages: 145-157
Number of pages: 13
ISBN (Print): 3540457801, 9783540457800
Publication status: Published - 2006
Event: 3rd Asia Information Retrieval Symposium, AIRS 2006 - Singapore, Singapore
Duration: 2006 Oct 16 to 2006 Oct 18

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4182 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 3rd Asia Information Retrieval Symposium, AIRS 2006
Country/Territory: Singapore
City: Singapore
Period: 2006/10/16 to 2006/10/18

Keywords

  • Correlation coefficient
  • Hypernym search
  • WordNet

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
