Generating phonetic cognates to handle named entities in English-Chinese cross-language spoken document retrieval

H. M. Meng, Wai Kit Lo, Berlin Chen, K. Tang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

83 Citations (Scopus)

Abstract

We have developed a technique for automatic transliteration of named entities for English-Chinese cross-language spoken document retrieval (CL-SDR). Our retrieval system integrates machine translation, speech recognition and information retrieval technologies. An English news story forms a textual query that is automatically translated into Chinese words, which are mapped into Mandarin syllables by pronunciation dictionary lookup. Mandarin radio news broadcasts form spoken documents that are indexed by word and syllable recognition. The information retrieval engine performs matching at both the word and syllable scales. The English queries contain many named entities that tend to be out-of-vocabulary words for machine translation and speech recognition, and are therefore omitted in retrieval. Yet names are often transliterated across languages and are generally important for retrieval. We present a technique that takes in a name spelling and automatically generates a phonetic cognate in terms of Chinese syllables to be used in retrieval. Experiments show consistent improvement in retrieval performance when named entities are included in this way.
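The flow described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the dictionary entries, the `NAME_COGNATES` table (standing in for the output of the transliteration step), the use of overlapping syllable bigrams as the syllable-scale unit, and the linear combination controlled by `syllable_weight`. It only shows how a name that translation would drop can still contribute to matching once it is expressed as Mandarin syllables.

```python
from collections import Counter

# Hypothetical toy pronunciation dictionary: Chinese word -> Mandarin syllables
# (pinyin without tones). Not data from the paper.
PRONUNCIATIONS = {
    "总统": ["zong", "tong"],
    "访问": ["fang", "wen"],
}

# Hypothetical phonetic cognates, standing in for the output of an automatic
# transliteration step applied to out-of-vocabulary names.
NAME_COGNATES = {
    "clinton": ["ke", "lin", "dun"],
}


def syllable_bigrams(syllables):
    """Return overlapping pairs of adjacent syllables."""
    return [tuple(syllables[i:i + 2]) for i in range(len(syllables) - 1)]


def build_query(translated_words, names):
    """Collect word-scale and syllable-scale query terms."""
    words = Counter(translated_words)
    bigrams = Counter()
    for word in translated_words:
        bigrams.update(syllable_bigrams(PRONUNCIATIONS.get(word, [])))
    for name in names:
        # Names contribute only at the syllable scale, through their cognates.
        bigrams.update(syllable_bigrams(NAME_COGNATES.get(name.lower(), [])))
    return words, bigrams


def overlap(query_terms, doc_terms):
    """Count term occurrences shared by the query and the document."""
    return sum(min(count, doc_terms[term]) for term, count in query_terms.items())


def score(query, doc_words, doc_syllables, syllable_weight=0.5):
    """Linearly combine word-scale and syllable-scale overlap (an assumed scheme)."""
    q_words, q_bigrams = query
    word_score = overlap(q_words, Counter(doc_words))
    syl_score = overlap(q_bigrams, Counter(syllable_bigrams(doc_syllables)))
    return (1 - syllable_weight) * word_score + syllable_weight * syl_score


# A spoken document as recognized word and syllable sequences (made up).
doc_words = ["总统", "访问"]
doc_syllables = ["ke", "lin", "dun", "zong", "tong", "fang", "wen"]

without_name = score(build_query(["总统", "访问"], []), doc_words, doc_syllables)
with_name = score(build_query(["总统", "访问"], ["Clinton"]), doc_words, doc_syllables)
```

In this toy setup `without_name` is 2.0 and `with_name` is 3.0: the name adds two matching syllable bigrams even though it never appears at the word scale. A real system would need a proper ranking function and a recognizer-produced index; the sketch makes no claim about how the authors implemented either.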

Original language: English
Title of host publication: 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 311-314
Number of pages: 4
ISBN (Electronic): 078037343X, 9780780373433
Publication status: Published - 2001
Externally published
Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Madonna di Campiglio, Italy
Duration: 9 Dec 2001 to 13 Dec 2001

Publication series

Name: 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Conference Proceedings

Other

Other: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001
Country/Territory: Italy
City: Madonna di Campiglio
Period: 2001/12/09 to 2001/12/13

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
