This paper presents a speech retrieval system for Mandarin broadcast news. First, several data-driven, unsupervised approaches are integrated into the broadcast news transcription system to improve both the accuracy and the efficiency of speech recognition. Then, a multi-scale indexing paradigm for broadcast news retrieval is proposed, which exploits the special structural properties of the Chinese language and alleviates the problems caused by speech recognition errors. Finally, a speech-based multimedia information retrieval prototype is built, with a PDA as the platform and Mandarin broadcast news stories collected in Taiwan as the document collection. Very encouraging results are obtained.
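The idea behind multi-scale indexing can be illustrated with a minimal sketch. The abstract does not name the indexing units or the scoring function used in the paper, so everything below is an assumption made for illustration: the three scales (words, single characters, overlapping character bigrams), the helper names `char_ngrams`, `multi_scale_terms`, and `score`, and the weighted-overlap score itself. The sketch only shows why sub-word units can still match when recognition or word segmentation errors break the word-level match.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def multi_scale_terms(words):
    """Build index terms at three scales from a segmented transcript.

    The choice of scales is hypothetical, not taken from the paper.
    """
    text = "".join(words)
    return {
        "word": Counter(words),
        "char": Counter(char_ngrams(text, 1)),
        "char_bigram": Counter(char_ngrams(text, 2)),
    }


def score(query_terms, doc_terms, weights):
    """Weighted sum, over the scales in `weights`, of shared term counts."""
    total = 0.0
    for scale, weight in weights.items():
        # Multiset intersection keeps the minimum count of each shared term.
        overlap = query_terms[scale] & doc_terms[scale]
        total += weight * sum(overlap.values())
    return total


# A query and a transcript that contain the same characters but were
# segmented into words differently.
query = multi_scale_terms(["台北", "市長"])
document = multi_scale_terms(["台", "北市", "長"])

word_only = score(query, document, {"word": 1.0})
combined = score(query, document, {"word": 1.0, "char": 0.5, "char_bigram": 0.5})
```

In this toy case the word-level score is zero because no segmented word is shared, while the character and bigram scales still register the match, so the combined score is positive.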
|Number of pages||4|
|Publication status||Published - 2005 Dec 1|
|Event||9th European Conference on Speech Communication and Technology - Lisbon, Portugal (2005 Sep 4 → 2005 Sep 8)|