Statistical language model adaptation for Mandarin broadcast news transcription

Berlin Chen, Wen Hung Tsai, Jen Wei Kuo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

This paper investigates statistical language model adaptation for Mandarin broadcast news transcription. A topical mixture model is proposed to exploit long-span latent topical information for dynamic language model adaptation. Its underlying characteristics and different levels of model complexity are investigated extensively, and its performance is compared with conventional MAP-based adaptation approaches, which rely on short-span n-gram information. Speech recognition experiments were conducted on broadcast news collected in Taiwan. Initial results show very promising reductions in both perplexity and word error rate.
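
The abstract does not give the model's equations, so the following is only a rough sketch of one common way a topic mixture can adapt a language model, not the authors' formulation. It assumes a word probability of the form P(w | history) = sum over k of P(w | T_k) * P(T_k | history), linearly interpolated with a background probability, and it evaluates a sequence by perplexity. All function names, the interpolation weight `lam`, and the toy numbers are hypothetical.

```python
import math

def topic_mixture_prob(word, topic_word_probs, topic_weights):
    """Assumed form: P(w | history) = sum_k P(w | T_k) * P(T_k | history)."""
    return sum(weight * probs.get(word, 0.0)
               for probs, weight in zip(topic_word_probs, topic_weights))

def adapted_prob(word, background_prob, topic_word_probs, topic_weights, lam=0.5):
    """Linearly interpolate a background probability with the topic mixture.

    lam is a hypothetical interpolation weight, not a value from the paper.
    """
    mix = topic_mixture_prob(word, topic_word_probs, topic_weights)
    return lam * background_prob + (1.0 - lam) * mix

def perplexity(word_probs):
    """Perplexity of a sequence from its per-word probabilities (all > 0)."""
    avg_log_prob = sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(-avg_log_prob)

# Toy, made-up distributions: two topics and a background unigram model.
topics = [
    {"stock": 0.6, "rain": 0.1, "the": 0.3},
    {"stock": 0.1, "rain": 0.6, "the": 0.3},
]
weights = [0.9, 0.1]  # assumed P(T_k | history); the history looks "financial"
background = {"stock": 0.2, "rain": 0.2, "the": 0.6}

sentence = ["the", "stock"]
baseline = perplexity([background[w] for w in sentence])
adapted = perplexity([adapted_prob(w, background[w], topics, weights)
                      for w in sentence])
print(f"baseline perplexity: {baseline:.3f}, adapted perplexity: {adapted:.3f}")
```

In this toy setup the topic weights favour the first topic, so the adapted model assigns more probability to "stock" than the background model does. How the topic weights are estimated from the history is the substantive part of any such model and is left out here.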

Original language: English
Title of host publication: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings
Pages: 313-316
Number of pages: 4
Publication status: Published - 2004 Dec 1
Event: 2004 International Symposium on Chinese Spoken Language Processing - Hong Kong, China
Duration: 2004 Dec 15 to 2004 Dec 18

Publication series

Name: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

Fingerprint

Transcription
Speech recognition
Experiments

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Chen, B., Tsai, W. H., & Kuo, J. W. (2004). Statistical language model adaptation for Mandarin broadcast news transcription. In 2004 International Symposium on Chinese Spoken Language Processing - Proceedings (pp. 313-316). [L7.3] (2004 International Symposium on Chinese Spoken Language Processing - Proceedings).

@inproceedings{fb1c6062959344fdaa93d12a6624a149,
title = "Statistical language model adaptation for Mandarin broadcast news transcription",
abstract = "This paper investigates statistical language model adaptation for Mandarin broadcast news transcription. A topical mixture model was proposed to explore the long-span latent topical information for dynamic language model adaptation. The underlying characteristics and various kinds of model complexities were extensively investigated, while their performance was verified by comparison with the conventional MAP-based adaptation approaches, which are devoted to extracting the short-span n-gram information. The speech recognition experiments were conducted on the broadcast news collected in Taiwan. Very promising results in both perplexity and word error rate reductions were initially obtained.",
author = "Chen, Berlin and Tsai, {Wen Hung} and Kuo, {Jen Wei}",
year = "2004",
month = "12",
day = "1",
language = "English",
isbn = "0780386787",
series = "2004 International Symposium on Chinese Spoken Language Processing - Proceedings",
pages = "313--316",
booktitle = "2004 International Symposium on Chinese Spoken Language Processing - Proceedings",

}

TY - GEN

T1 - Statistical language model adaptation for Mandarin broadcast news transcription

AU - Chen, Berlin

AU - Tsai, Wen Hung

AU - Kuo, Jen Wei

PY - 2004/12/1

Y1 - 2004/12/1

AB - This paper investigates statistical language model adaptation for Mandarin broadcast news transcription. A topical mixture model was proposed to explore the long-span latent topical information for dynamic language model adaptation. The underlying characteristics and various kinds of model complexities were extensively investigated, while their performance was verified by comparison with the conventional MAP-based adaptation approaches, which are devoted to extracting the short-span n-gram information. The speech recognition experiments were conducted on the broadcast news collected in Taiwan. Very promising results in both perplexity and word error rate reductions were initially obtained.

UR - http://www.scopus.com/inward/record.url?scp=21444452767&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=21444452767&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:21444452767

SN - 0780386787

SN - 9780780386785

T3 - 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

SP - 313

EP - 316

BT - 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

ER -