A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents

Berlin Chen*, Hsin Min Wang, Lin Shan Lee

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

29 Citations (Scopus)

Abstract

In recent years, statistical modeling approaches have steadily gained in popularity in the field of information retrieval. This article presents an HMM/N-gram-based retrieval approach for Mandarin spoken documents. The underlying characteristics and the various structures of this approach were extensively investigated and analyzed. The retrieval capabilities were verified by tests with word- and syllable-level indexing features and comparisons to the conventional vector-space model approach. To further improve the discrimination capabilities of the HMMs, both the expectation-maximization (EM) and minimum classification error (MCE) training algorithms were introduced in training. Fusion of information via indexing word- and syllable-level features was also investigated. The spoken document retrieval experiments were performed on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3). Very encouraging retrieval performance was obtained.

Original language: English
Pages (from-to): 128-145
Number of pages: 18
Journal: ACM Transactions on Asian Language Information Processing
Volume: 3
Issue number: 2
Publication status: Published - Jun 2004

ASJC Scopus subject areas

  • Computer Science (all)
