Using corpus-based linguistic approaches in sense prediction study

Jia Fei Hong, Sue Jin Ker, Chu Ren Huang, Kathleen Ahrens

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this study, we propose two corpus-based linguistic approaches to sense prediction. We will concentrate on character similarity clustering and concept similarity clustering to predict the senses of non-assigned words, using corpora and tools such as the Chinese Gigaword Corpus and HowNet. We will then evaluate the predictions against the sense divisions of Chinese Wordnet and Xiandai Hanyu Cidian. Using these resources, we will determine the clusters of our four target words, chi1 "eat", wan2 "play", huan4 "change" and shao1 "burn", in order to predict all of their possible senses and evaluate them. This evaluation will demonstrate the viability of the corpus-based approaches.

Original language: English
Title of host publication: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation
Pages: 399-407
Number of pages: 9
Publication status: Published - 2010 Dec 1
Externally published: Yes
Event: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24 - Sendai, Japan
Duration: 2010 Nov 4 to 2010 Nov 7

Publication series

Name: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Conference

Conference: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24
Country: Japan
City: Sendai
Period: 10/11/4 to 10/11/7

Keywords

  • Character similarity clustering
  • Concept similarity clustering
  • Corpus-based approach
  • Evaluation
  • Lexical ambiguity
  • Sense prediction

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)

Cite this

Hong, J. F., Ker, S. J., Huang, C. R., & Ahrens, K. (2010). Using corpus-based linguistic approaches in sense prediction study. In PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation (pp. 399-407). (PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation).