Comparison of word and subword indexing techniques for Mandarin Chinese spoken document retrieval

Hsin Min Wang, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we investigate the use of words and subwords (both characters and syllables) in audio indexing for Mandarin Chinese spoken document retrieval. Two retrieval approaches are used: the well-known vector space model approach and the newly proposed HMM/N-gram-based approach. We focus on using an entire Chinese textual story (from a newspaper) as a query to retrieve Mandarin Chinese spoken documents (from news broadcasts). Experiments are based on the Topic Detection and Tracking Corpora.

Original language: English
Title of host publication: Advances in Multimedia Information Processing - PCM 2001 - 2nd IEEE Pacific Rim Conference on Multimedia, Proceedings
Editors: Heung-Yeung Shum, Mark Liao, Shih-Fu Chang
Publisher: Springer Verlag
Pages: 606-613
Number of pages: 8
ISBN (Print): 3540426809, 9783540426806
Publication status: Published - 2001
Externally published: Yes
Event: 2nd IEEE Pacific-Rim Conference on Multimedia, IEEE-PCM 2001 - Beijing, China
Duration: 2001 Oct 24 to 2001 Oct 26

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2195
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 2nd IEEE Pacific-Rim Conference on Multimedia, IEEE-PCM 2001
Country/Territory: China
City: Beijing
Period: 2001/10/24 to 2001/10/26

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
