TY - GEN
T1 - I-vector based language modeling for spoken document retrieval
AU - Chen, Kuan Yu
AU - Lee, Hung Shin
AU - Wang, Hsin Min
AU - Chen, Berlin
AU - Chen, Hsin His
PY - 2014
Y1 - 2014
N2 - Since more and more multimedia data associated with spoken documents have been made available to the public, spoken document retrieval (SDR) has become an important research subject in the past two decades. The i-vector based framework has been proposed and introduced to language identification (LID) and speaker recognition (SR) tasks recently. The major contribution of the i-vector framework is to reduce a series of acoustic feature vectors of a speech utterance to a low-dimensional vector representation, and then numbers of well-developed postprocessing techniques (such as probabilistic linear discriminative analysis, PLDA) can be readily and effectively used. However, to our best knowledge, there is no research up to date on applying the i-vector framework for SDR or information retrieval (IR). In this paper, we make a step forward to formulate an i-vector based language modeling (IVLM) framework for SDR. Furthermore, we evaluate the proposed IVLM framework with both inductive and transductive learning strategies. We also exploit multi-levels of index features, including word- and subword-level units, in concert with the proposed framework. The results of SDR experiments conducted on the TDT-2 (Topic Detection and Tracking) collection demonstrate the performance merits of our proposed framework when compared to several existing approaches.
AB - Since more and more multimedia data associated with spoken documents have been made available to the public, spoken document retrieval (SDR) has become an important research subject in the past two decades. The i-vector based framework has been proposed and introduced to language identification (LID) and speaker recognition (SR) tasks recently. The major contribution of the i-vector framework is to reduce a series of acoustic feature vectors of a speech utterance to a low-dimensional vector representation, and then numbers of well-developed postprocessing techniques (such as probabilistic linear discriminative analysis, PLDA) can be readily and effectively used. However, to our best knowledge, there is no research up to date on applying the i-vector framework for SDR or information retrieval (IR). In this paper, we make a step forward to formulate an i-vector based language modeling (IVLM) framework for SDR. Furthermore, we evaluate the proposed IVLM framework with both inductive and transductive learning strategies. We also exploit multi-levels of index features, including word- and subword-level units, in concert with the proposed framework. The results of SDR experiments conducted on the TDT-2 (Topic Detection and Tracking) collection demonstrate the performance merits of our proposed framework when compared to several existing approaches.
KW - Spoken document retrieval
KW - i-vector
KW - inductive
KW - language modeling
KW - transductive
UR - http://www.scopus.com/inward/record.url?scp=84905232829&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905232829&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2014.6854974
DO - 10.1109/ICASSP.2014.6854974
M3 - Conference contribution
AN - SCOPUS:84905232829
SN - 9781479928927
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 7083
EP - 7087
BT - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Y2 - 4 May 2014 through 9 May 2014
ER -