TY - GEN
T1 - The Corpus of Emotional Valences for 33,669 Chinese Words Based on Big Data
AU - Chang, Chia Yueh
AU - Chen, Yen Cheng
AU - Tsai, Meng Ning
AU - Sung, Yao Ting
AU - Chang, Yu Lin
AU - Lin, Shu Yen
AU - Cho, Shu Ling
AU - Chang, Tao Hsing
AU - Chen, Hsueh Chih
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Emotion theories are mainly classified as categorical or dimensional approaches. Given the importance of emotional words in emotion research, researchers have constructed a co-occurrence corpus of 7 types of emotion words through word co-occurrence and big data corpora. However, in addition to the categorical approach, the dimensional approach plays an important role in natural language processing. In particular, valence has an important influence on the study of emotion and language. In this study, the co-occurrence corpus of 7 types of emotion words constructed by Chen et al. [1] was expanded to create a corpus of emotional valences. Then, stepwise multiple regression analysis was performed with the criterion variables and 15 predictor variables. The criterion variables were the emotional valences of 553 frequently occurring stimulus words included in the Chinese Word Association Norms [2]. The predictor variables included the emotion co-occurrence scores for 2 clusters (a cluster of literal emotion words and a cluster of metaphorical emotion words) and 7 types of emotions (happiness, love, surprise, sadness, anger, disgust, and fear), together with the virtue word co-occurrence score; the emotional words were those common to both the co-occurrence corpus of 7 types of emotion words constructed by Chen et al. [1] and the Chinese Word Association Norms established by Hu et al. [2]. The results showed that the scores for literal happiness word co-occurrences, metaphorical happiness word co-occurrences, literal disgust word co-occurrences, literal fear word co-occurrences, and virtue word co-occurrences could predict the valence values of emotion words, with the multiple correlation coefficient of the regression reaching .729. Subsequently, the valence values of 33,669 words were established using the formula obtained from the multiple regression analysis of the 553 words.
Next, the correlation between the actual valence values and the predicted valence values was analyzed to test the cross-validity of the established valences using the common words in the norm established by Lee and Lee [3] for the emotionality ratings and free associations of 267 common Chinese words. The results showed that the correlation between the two was .755, indicating that the predicted values generated by the big data corpora and word co-occurrence had a degree of similarity with the manually determined values. Based on theories and tests, this study used the co-occurrence data of 7 emotions and virtue to construct the corpus of emotional valences for 33,669 Chinese words. The results showed that the combined use of big data corpora and word co-occurrence can effectively expand existing corpora that were established based on emotional categories, improve the efficiency of manual construction of corpora, and establish a larger corpus of emotional words.
KW - Big data
KW - Chinese
KW - Emotion
KW - Valence
KW - Word co-occurrence
UR - http://www.scopus.com/inward/record.url?scp=85133278646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133278646&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-05544-7_11
DO - 10.1007/978-3-031-05544-7_11
M3 - Conference contribution
AN - SCOPUS:85133278646
SN - 9783031055430
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 141
EP - 152
BT - HCI in Business, Government and Organizations - 9th International Conference, HCIBGO 2022, Held as Part of the 24th HCI International Conference, HCII 2022, Proceedings
A2 - Fui-Hoon Nah, Fiona
A2 - Siau, Keng
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th International Conference on HCI in Business, Government and Organizations, HCIBGO 2022 Held as Part of the 24th HCI International Conference, HCII 2022
Y2 - 26 June 2022 through 1 July 2022
ER -