TY - GEN
T1 - Transliteration retrieval model for cross lingual information retrieval
AU - Jan, Ea Ee
AU - Lin, Shih Hsiang
AU - Chen, Berlin
PY - 2010
Y1 - 2010
AB - Transliteration performance from a source language to a target language lays the groundwork for proper-name Cross Lingual Information Retrieval (CLIR). Traditionally, this task is handled by two separate modules: transliteration and retrieval. Queries are first transliterated into the target language using one or multiple hypotheses, and retrieval is then carried out on the transliterated queries. Transliteration often yields a 30-50% error rate for the top-1 hypothesis, which leads to significant performance degradation in CLIR. We therefore propose a unified transliteration retrieval model that incorporates the transliteration similarity measurement into the relevance scoring function. In addition, we present an efficient and robust method for measuring the similarity of a given proper-name pair using Hidden Markov Model (HMM) based alignment and a Statistical Machine Translation (SMT) framework. Experiments on the NTCIR7 IR4QA task showed significant results for the proposed integrated method, which demonstrated greater flexibility and acceptance in transliteration.
KW - NTCIR
KW - cross lingual information retrieval (CLIR)
KW - retrieval model
KW - statistical machine translation (SMT)
KW - transliteration
UR - http://www.scopus.com/inward/record.url?scp=78650878733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650878733&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-17187-1_17
DO - 10.1007/978-3-642-17187-1_17
M3 - Conference contribution
AN - SCOPUS:78650878733
SN - 3642171869
SN - 9783642171864
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 183
EP - 192
BT - Information Retrieval Technology - 6th Asia Information Retrieval Societies Conference, AIRS 2010, Proceedings
T2 - 6th Asia Information Retrieval Societies Conference, AIRS 2010
Y2 - 1 December 2010 through 3 December 2010
ER -