Using corpus-based linguistic approaches in sense prediction study

Jia Fei Hong*, Sue Jin Ker, Chu Ren Huang, Kathleen Ahrens

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

In this study, we propose using two corpus-based linguistic approaches for sense prediction. We concentrate on the character similarity clustering approach and the concept similarity clustering approach to predict the senses of non-assigned words, using corpora and tools such as the Chinese Gigaword Corpus and HowNet. We then evaluate the predictions against the sense divisions of Chinese Wordnet and Xiandai Hanyu Cidian. Using these resources, we determine the clusters of our four target words, chi1 "eat", wan2 "play", huan4 "change" and shao1 "burn", in order to predict all of their possible senses and evaluate them. This evaluation will demonstrate the viability of the corpus-based approaches.

Original language: English
Title of host publication: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation
Pages: 399-407
Number of pages: 9
Publication status: Published - 2010
Event: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24 - Sendai, Japan
Duration: 4 Nov 2010 to 7 Nov 2010

Publication series

Name: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Conference

Conference: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24
Country/Territory: Japan
City: Sendai
Period: 4/11/10 to 7/11/10

Scopus Subject Areas

  • Language and Linguistics
  • Computer Science (miscellaneous)

User-Defined Keywords

  • Character similarity clustering
  • Concept similarity clustering
  • Corpus-based approach
  • Evaluation
  • Lexical ambiguity
  • Sense prediction
