The Corpus of Emotional Valences for 33,669 Chinese Words Based on Big Data

Chia Yueh Chang, Yen Cheng Chen, Meng Ning Tsai, Yao Ting Sung, Yu Lin Chang, Shu Yen Lin, Shu Ling Cho, Tao Hsing Chang, Hsueh Chih Chen*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Emotion theories are mainly classified into categorical and dimensional approaches. Given the importance of emotional words in emotion research, researchers have constructed a co-occurrence corpus of 7 types of emotion words using word co-occurrence and big data corpora. However, in addition to the categorical approach, the dimensional approach plays an important role in natural language processing; in particular, valence has an important influence on the study of emotion and language. In this study, the co-occurrence corpus of 7 types of emotion words constructed by Chen et al. [1] was expanded to create a corpus of emotional valences. Stepwise multiple regression analysis was then performed with the criterion variables and 15 predictor variables. The criterion variables were the emotional valences of 553 frequently occurring stimulus words included in the Chinese Word Association Norms [2]. The predictor variables were the emotion co-occurrence scores for 2 clusters (a cluster of literal emotion words and a cluster of metaphorical emotion words) across 7 types of emotions (happiness, love, surprise, sadness, anger, disgust, and fear), giving 14 scores, plus the virtue word co-occurrence score. The emotional words were the words common to both the co-occurrence corpus of 7 types of emotion words constructed by Chen et al. [1] and the Chinese Word Association Norms established by Hu et al. [2]. The results showed that the scores for literal happiness word co-occurrences, metaphorical happiness word co-occurrences, literal disgust word co-occurrences, literal fear word co-occurrences, and virtue word co-occurrences could predict the valence values of emotion words, with the multiple correlation coefficient of the multiple regression analysis reaching .729. Subsequently, the valence values of 33,669 words were established using the formula obtained from the multiple regression analysis of the 553 words.
Next, to test the cross-validity of the established valences, the correlation between the actual valence values and the predicted valence values was analyzed using the words in common with the norm established by Lee and Lee [3] for the emotionality ratings and free associations of 267 common Chinese words. The results showed that the correlation between the two was .755, indicating that the predicted values generated from the big data corpora and word co-occurrence had a degree of similarity with the manually determined values. Based on theories and tests, this study used the co-occurrence data of 7 emotions and virtue to construct the corpus of emotional valences for 33,669 Chinese words. The results showed that the combined use of big data corpora and word co-occurrence can effectively expand existing corpora that were established based on emotional categories, improve the efficiency of manual construction of corpora, and establish a larger corpus of emotional words.
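
The pipeline described in the abstract (fit a regression on rated words, apply the resulting formula to the full vocabulary, then correlate predictions with independent ratings) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the data are synthetic, the weights are made up, the predictor names are hypothetical labels for the five retained scores, and plain ordinary least squares stands in for the stepwise selection procedure. It is not the authors' implementation and does not reproduce the reported coefficients.

```python
import numpy as np

# Hypothetical labels for the five predictors the abstract reports as retained.
PREDICTORS = [
    "literal_happiness",
    "metaphorical_happiness",
    "literal_disgust",
    "literal_fear",
    "virtue",
]


def fit_ols(X, y):
    """Fit y ~ intercept + X by ordinary least squares (no stepwise selection)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the slopes


def predict(coef, X):
    """Apply the fitted regression formula to co-occurrence scores."""
    return coef[0] + X @ coef[1:]


def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(0)

# Synthetic stand-ins: made-up weights, random co-occurrence scores, noisy ratings.
assumed_weights = np.array([0.8, 0.5, -0.6, -0.7, 0.4])


def synthetic_ratings(X):
    return 5.0 + X @ assumed_weights + rng.normal(0.0, 0.3, len(X))


# Step 1: fit on the 553 rated stimulus words.
X_train = rng.random((553, len(PREDICTORS)))
y_train = synthetic_ratings(X_train)
coef = fit_ols(X_train, y_train)

# With an intercept, multiple R equals the correlation of fitted and actual values.
multiple_r = pearson_r(predict(coef, X_train), y_train)

# Step 2: apply the formula to every word in the corpus.
X_all = rng.random((33669, len(PREDICTORS)))
valence_all = predict(coef, X_all)

# Step 3: cross-validate against independent ratings. The size 267 is
# illustrative; the abstract does not say how many words overlapped.
X_check = rng.random((267, len(PREDICTORS)))
y_check = synthetic_ratings(X_check)
cross_r = pearson_r(predict(coef, X_check), y_check)

print(f"multiple R on training words: {multiple_r:.3f}")
print(f"cross-validation r: {cross_r:.3f}")
```

In this sketch the cross-validation step reuses the coefficients fitted on the training words without refitting, which is what makes the second correlation a check on generalization rather than on fit.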

Original language: English
Title of host publication: HCI in Business, Government and Organizations - 9th International Conference, HCIBGO 2022, Held as Part of the 24th HCI International Conference, HCII 2022, Proceedings
Editors: Fiona Fui-Hoon Nah, Keng Siau
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 141-152
Number of pages: 12
ISBN (Print): 9783031055430
Publication status: Published - 2022
Event: 9th International Conference on HCI in Business, Government and Organizations, HCIBGO 2022 Held as Part of the 24th HCI International Conference, HCII 2022 - Virtual, Online
Duration: 26 Jun 2022 to 1 Jul 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13327 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on HCI in Business, Government and Organizations, HCIBGO 2022 Held as Part of the 24th HCI International Conference, HCII 2022
City: Virtual, Online
Period: 2022/06/26 to 2022/07/01

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
