Building a “Corpus of 7 Types Emotion Co-occurrences Words” of Chinese Emotional Words with Big Data Corpus

Ching Hui Chen, Yu Lin Chang, Yen Cheng Chen, Meng Ning Tsai, Yao Ting Sung, Shu Yen Lin, Shu Ling Cho, Tao Hsing Chang, Hsueh Chih Chen*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Past studies established emotion corpora through human ratings, an approach that costs considerable time and money yet yields a limited number of words. The Categorical Approach was also seldom used for building corpora, which may lead to study bias. Study 1 of the present work therefore adopted the Spreading Activation Model as its framework and used a big-data text corpus and word co-occurrences to build a corpus that covers more emotion categories and many more words. First, Study 1 selected words that clearly describe the meaning of, or effectively evoke the feeling of, each of seven emotion categories: Happiness, Surprise, Sadness, Anger, Disgust, Fear, and Love. It then calculated the average co-occurrence between the selected words and the words of the text corpus, separately for each of the seven emotion categories (measure: Baroni-Urbani; unit: chunk). This yielded category-wise average co-occurrences for 33,669 words, which represent the conceptual consonance between each word and each emotion. Study 2 investigated the practical use of the corpus built in Study 1, using the human-rated C-LIWC (Chinese Linguistic Inquiry and Word Count) dictionary as a comparison. Posts from the Happy Board, Sad Board, and Hate Board of the PTT Bulletin Board System were analyzed for emotion recognition. The “Corpus of 7 Types Emotion Co-occurrences Words” built in Study 1 achieved a higher correct rate than the human-rated C-LIWC dictionary, and the difference in correct rates between the two was significant. The present study thus provides a text corpus as material for emotion research, and the results support the potential of building corpora of emotional words with big-data measures.
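The abstract names the co-occurrence measure (Baroni-Urbani) and the counting unit (chunk) but gives neither the formula nor the chunking procedure. As a minimal sketch, assuming the standard Baroni-Urbani and Buser coefficient on binary presence/absence per chunk, S = (sqrt(a*d) + a) / (sqrt(a*d) + a + b + c), the category-wise averaging could look like the following. The function names, the seed-word lists, and the toy chunks (English tokens standing in for segmented Chinese words) are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def baroni_urbani(chunks, word_x, word_y):
    """Baroni-Urbani and Buser similarity of two words over a list of chunks.

    Each chunk is a set of tokens. Counts per chunk:
      a = both words present, b = only word_x, c = only word_y, d = neither.
    """
    a = b = c = d = 0
    for chunk in chunks:
        has_x = word_x in chunk
        has_y = word_y in chunk
        if has_x and has_y:
            a += 1
        elif has_x:
            b += 1
        elif has_y:
            c += 1
        else:
            d += 1
    root = sqrt(a * d)
    denom = root + a + b + c
    # If neither word ever occurs, the coefficient is undefined; return 0.0.
    return (root + a) / denom if denom else 0.0


def category_scores(chunks, target, seeds_by_emotion):
    """Average the target word's co-occurrence with each emotion's seed words."""
    return {
        emotion: sum(baroni_urbani(chunks, target, seed) for seed in seeds) / len(seeds)
        for emotion, seeds in seeds_by_emotion.items()
    }


# Hypothetical toy data: five chunks and one seed word per emotion.
toy_chunks = [
    {"happy", "smile"},
    {"happy", "smile", "gift"},
    {"sad", "cry"},
    {"sad", "cry", "rain"},
    {"smile"},
]
toy_seeds = {"Happiness": ["happy"], "Sadness": ["sad"]}

scores = category_scores(toy_chunks, "smile", toy_seeds)
print(scores)
```

In this toy setup "smile" shares two of five chunks with "happy" and none with "sad", so it scores higher on Happiness than on Sadness. A real run would need a Chinese word segmenter and the paper's own chunk definition, neither of which is specified in the abstract.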

Original language: English
Title of host publication: HCI in Business, Government and Organizations - 9th International Conference, HCIBGO 2022, Held as Part of the 24th HCI International Conference, HCII 2022, Proceedings
Editors: Fiona Fui-Hoon Nah, Keng Siau
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 163-181
Number of pages: 19
ISBN (Print): 9783031055430
Publication status: Published - 2022
Event: 9th International Conference on HCI in Business, Government and Organizations, HCIBGO 2022, Held as Part of the 24th HCI International Conference, HCII 2022 - Virtual, Online
Duration: 2022 Jun 26 to 2022 Jul 1

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13327 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on HCI in Business, Government and Organizations, HCIBGO 2022, Held as Part of the 24th HCI International Conference, HCII 2022
City: Virtual, Online
Period: 2022/06/26 to 2022/07/01

Keywords

  • Big data
  • Chinese
  • Co-occurrence
  • Corpus
  • Emotional words

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
