TY - GEN
T1 - Building a “Corpus of 7 Types Emotion Co-occurrences Words” of Chinese Emotional Words with Big Data Corpus
AU - Chen, Ching Hui
AU - Chang, Yu Lin
AU - Chen, Yen Cheng
AU - Tsai, Meng Ning
AU - Sung, Yao Ting
AU - Lin, Shu Yen
AU - Cho, Shu Ling
AU - Chang, Tao Hsing
AU - Chen, Hsueh Chih
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Past studies built emotion-word corpora from human ratings, which costs considerable time and money yet yields too few words; in addition, the Categorical Approach was seldom used to build such corpora, which may introduce bias. Study 1 therefore used the Spreading Activation Model as its framework and drew on a big-data text corpus and word co-occurrences to build a corpus covering more emotion categories and many more words. First, for each of seven emotions (Happiness, Surprise, Sadness, Anger, Disgust, Fear, and Love), Study 1 selected words that clearly describe the meaning of the emotion category or effectively evoke its feeling. It then calculated, for each emotion category, the average co-occurrence between the selected words and the text corpus (measure: Baroni-Urbani; unit: chunk). These averages were computed for 33,669 words and represent the conceptual consonance between each word and each emotion. Study 2 examined the practical use of the corpus built in Study 1, using the human-rated C-LIWC (Chinese Linguistic Inquiry and Word Count) dictionary for comparison. Posts from the Happy, Sad, and Hate boards of the PTT Bulletin Board System were analyzed for emotion recognition, and the “Corpus of 7 Types Emotion Co-occurrences Words” achieved a higher correct rate than the human-rated dictionary. The difference in correct rates between the two resources was significant. The present study thus provides a text corpus as material for emotion research, and the results support the potential of building corpora of emotional words with big-data methods.
KW - Big data
KW - Chinese
KW - Co-occurrence
KW - Corpus
KW - Emotional words
UR - http://www.scopus.com/inward/record.url?scp=85133294524&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133294524&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-05544-7_13
DO - 10.1007/978-3-031-05544-7_13
M3 - Conference contribution
AN - SCOPUS:85133294524
SN - 9783031055430
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 163
EP - 181
BT - HCI in Business, Government and Organizations - 9th International Conference, HCIBGO 2022, Held as Part of the 24th HCI International Conference, HCII 2022, Proceedings
A2 - Fui-Hoon Nah, Fiona
A2 - Siau, Keng
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th International Conference on HCI in Business, Government and Organizations, HCIBGO 2022 Held as Part of the 24th HCI International Conference, HCII 2022
Y2 - 26 June 2022 through 1 July 2022
ER -