CLAD: A corpus-derived Chinese Lexical Association Database

Shu Yen Lin, Hsueh Chih Chen, Tao Hsing Chang, Wei En Lee, Yao Ting Sung*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

12 Citations (Scopus)

Abstract

The application of word associations has become increasingly widespread. However, the association norms produced by traditional free association tests tend not to exceed 10,000 stimulus words, making the number of associated words too small to be representative of the overall language. In this study we used text corpora totaling over 400 million Chinese words, along with a multitude of association measures, to automatically construct a Chinese Lexical Association Database (CLAD) comprising the lexical association of over 80,000 words. Comparison of the CLAD with a database of traditional Chinese word association norms shows that word associations extracted from large text corpora are similar in strength to those elicited from free association tests but contain a much greater number of associative word pairs. Additionally, the relatively small numbers of participants involved in the creation of traditional norms result in relatively coarse scales of association measurement, whereas the differentiation of association strengths is greatly enhanced in the CLAD. The CLAD provides researchers with a great supplement to traditional word association norms. A query website at www.chinesereadability.net/LexicalAssociation/CLAD/ affords access to the database.

Original language: English
Pages (from-to): 2310-2336
Number of pages: 27
Journal: Behavior Research Methods
Volume: 51
Issue number: 5
Publication status: Published - 1 Oct 2019

ASJC Scopus subject areas

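  • Experimental and Cognitive Psychology
  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Psychology (miscellaneous)
  • General Psychology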
