TY - GEN
T1 - Using corpus-based linguistic approaches in sense prediction study
AU - Hong, Jia Fei
AU - Ker, Sue Jin
AU - Huang, Chu Ren
AU - Ahrens, Kathleen
PY - 2010
Y1 - 2010
N2 - In this study, we propose two corpus-based linguistic approaches to sense prediction. We concentrate on the character similarity clustering approach and the concept similarity clustering approach to predict the senses of non-assigned words, using corpora and tools such as the Chinese Gigaword Corpus and HowNet. We then evaluate their predictions against the sense divisions of Chinese Wordnet and Xiandai Hanyu Cidian. Using these corpora, we determine the clusters of our four target words, chi1 "eat", wan2 "play", huan4 "change" and shao1 "burn", in order to predict all of their possible senses and evaluate them. This evaluation will demonstrate the viability of the corpus-based approaches.
AB - In this study, we propose two corpus-based linguistic approaches to sense prediction. We concentrate on the character similarity clustering approach and the concept similarity clustering approach to predict the senses of non-assigned words, using corpora and tools such as the Chinese Gigaword Corpus and HowNet. We then evaluate their predictions against the sense divisions of Chinese Wordnet and Xiandai Hanyu Cidian. Using these corpora, we determine the clusters of our four target words, chi1 "eat", wan2 "play", huan4 "change" and shao1 "burn", in order to predict all of their possible senses and evaluate them. This evaluation will demonstrate the viability of the corpus-based approaches.
KW - Character similarity clustering
KW - Concept similarity clustering
KW - Corpus-based approach
KW - Evaluation
KW - Lexical ambiguity
KW - Sense prediction
UR - http://www.scopus.com/inward/record.url?scp=84863882698&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863882698&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84863882698
SN - 9784905166009
T3 - PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation
SP - 399
EP - 407
BT - PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation
T2 - 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24
Y2 - 4 November 2010 through 7 November 2010
ER -