TY - GEN
T1 - Topic modeling for spoken document retrieval using word- and syllable-level information
AU - Lin, Shih Hsiang
AU - Chen, Berlin
PY - 2009
Y1 - 2009
AB - Topic modeling for information retrieval (IR) has attracted significant attention and demonstrated good performance in a wide variety of tasks over the years. In this article, we first present a comprehensive comparison among various topic modeling approaches, including the so-called document topic models (DTM) and word topic models (WTM), for Chinese spoken document retrieval (SDR). Moreover, in order to lessen SDR performance degradation when using imperfect recognition transcripts, we also leverage different levels of indexing features for topic modeling, including words, syllable-level units and their combinations. All the experiments are performed on the TDT Chinese collection.
KW - Document topic models
KW - Information retrieval
KW - Speech recognition
KW - Spoken document retrieval
KW - Word topic models
UR - http://www.scopus.com/inward/record.url?scp=72249090206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72249090206&partnerID=8YFLogxK
DO - 10.1145/1631127.1631129
M3 - Conference contribution
AN - SCOPUS:72249090206
SN - 9781605587622
T3 - 3rd Workshop on Searching Spontaneous Conversational Speech, SSCS'09, Co-located with the 2009 ACM International Conference on Multimedia, MM'09
SP - 3
EP - 10
BT - 3rd Workshop on Searching Spontaneous Conversational Speech, SSCS'09, Co-located with the 2009 ACM International Conference on Multimedia, MM'09
T2 - 3rd Workshop on Searching Spontaneous Conversational Speech, SSCS'09, Co-located with the 2009 ACM International Conference on Multimedia, MM'09
Y2 - 19 October 2009 through 24 October 2009
ER -