Content-based language models for spoken document retrieval

Hsin Min Wang, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Spoken document retrieval (SDR) has been extensively studied in recent years because of its potential for navigating large multimedia collections. This paper presents a novel approach that applies content-based language models to spoken document retrieval. In an example task, retrieval of Mandarin broadcast news, the content-based language models were either trained on the automatic transcriptions of the spoken documents or adapted from the baseline language models using those transcriptions. The resulting models were used to produce more accurate recognition results and indexing terms for both the spoken documents and the speech queries. We report some interesting findings obtained in this research.
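
The abstract does not say how the baseline language models were adapted. As a minimal sketch only, the snippet below uses linear interpolation of unigram models, a common adaptation technique swapped in here for illustration and not confirmed as the authors' method. The function names, the toy corpora, and the weight `lam` are all hypothetical.

```python
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(baseline, content, lam=0.5):
    """Linear interpolation: lam * content + (1 - lam) * baseline."""
    return {w: lam * content[w] + (1.0 - lam) * baseline[w] for w in baseline}


# Toy data (hypothetical): a general-domain corpus and automatic
# transcriptions of the spoken documents to be indexed.
baseline_corpus = "the market opened higher today the weather is fine".split()
auto_transcripts = "typhoon warning issued typhoon approaches the coast".split()
vocab = sorted(set(baseline_corpus) | set(auto_transcripts))

baseline_lm = unigram_model(baseline_corpus, vocab)
content_lm = unigram_model(auto_transcripts, vocab)   # "trained" variant
adapted_lm = interpolate(baseline_lm, content_lm)     # "adapted" variant

# Words that occur in the transcriptions gain probability mass.
print(adapted_lm["typhoon"] > baseline_lm["typhoon"])
```

Under these assumptions, `content_lm` stands in for a model trained only on the automatic transcriptions and `adapted_lm` for a baseline model adapted with them. A real recognizer would use higher-order n-grams and a tuned interpolation weight.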

Original language: English
Title of host publication: Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000
Publisher: Association for Computing Machinery, Inc
Pages: 149-155
Number of pages: 7
ISBN (Electronic): 1581133006, 9781581133004
DOIs: https://doi.org/10.1145/355214.355236
Publication status: Published - 2000 Nov 1
Event: 5th International Workshop on Information Retrieval with Asian Languages, IRAL 2000 - Hong Kong, China
Duration: 2000 Sep 30 to 2000 Oct 1

Publication series

Name: Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000

Other

Other: 5th International Workshop on Information Retrieval with Asian Languages, IRAL 2000
Country: China
City: Hong Kong
Period: 00/9/30 to 00/10/1

Keywords

  • Content-based language models
  • Speech recognition
  • Spoken document retrieval (SDR)

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems

Cite this

Wang, H. M., & Chen, B. (2000). Content-based language models for spoken document retrieval. In Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000 (pp. 149-155). (Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000). Association for Computing Machinery, Inc. https://doi.org/10.1145/355214.355236

@inproceedings{d5abad5648c74677a31db2412d6f4d72,
title = "Content-based language models for spoken document retrieval",
keywords = "Content-based language models, Speech recognition, Spoken document retrieval (SDR)",
author = "Wang, {Hsin Min} and Chen, Berlin",
year = "2000",
month = "11",
day = "1",
doi = "10.1145/355214.355236",
language = "English",
series = "Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000",
publisher = "Association for Computing Machinery, Inc",
pages = "149--155",
booktitle = "Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000",

}

TY - GEN
T1 - Content-based language models for spoken document retrieval
AU - Wang, Hsin Min
AU - Chen, Berlin
PY - 2000/11/1
Y1 - 2000/11/1
AB - Spoken document retrieval (SDR) has been extensively studied in recent years because of its potential use in navigating large multimedia collections in the near future. This paper presents a novel concept of applying the content-based language models to spoken document retrieval. In an example task for retrieval of Mandarin broadcast news, the content-based language models either trained with the automatic transcriptions of the spoken documents or adapted from the baseline language models using the automatic transcriptions of the spoken documents were used to create the more accurate recognition results and indexing terms from both the spoken documents and the speech queries. We report on some interesting findings obtained in this research.
KW - Content-based language models
KW - Speech recognition
KW - Spoken document retrieval (SDR)
UR - http://www.scopus.com/inward/record.url?scp=85027119404&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027119404&partnerID=8YFLogxK
U2 - 10.1145/355214.355236
DO - 10.1145/355214.355236
M3 - Conference contribution
AN - SCOPUS:85027119404
T3 - Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000
SP - 149
EP - 155
BT - Proceedings of the 5th international Workshop on Information Retrieval with Asian Languages, IRAL 2000
PB - Association for Computing Machinery, Inc
ER -