TY - JOUR
T1 - CNERVis
T2 - a visual diagnosis tool for Chinese named entity recognition
AU - Lo, Pei Shan
AU - Wu, Jian Lin
AU - Deng, Syu Ting
AU - Wang, Ko Chih
N1 - Publisher Copyright: © 2021, The Visualization Society of Japan.
PY - 2022/6
Y1 - 2022/6
N2 - Named entity recognition (NER) is a crucial initial task that identifies both the spans and the types of named entities in order to extract specific information such as organizations, persons, locations, and times. Deep learning approaches that capture contextual features now achieve state-of-the-art performance on the NER task. However, the complex structure of deep learning models makes them black boxes and limits researchers’ ability to improve them. Unlike languages written in the Latin alphabet, Chinese (like other languages such as Korean and Japanese) does not have explicit word boundaries. Therefore, preliminary steps such as word segmentation (WS) and part-of-speech (POS) tagging are needed before the Chinese NER task. The correctness of these preliminary steps strongly influences the final NER prediction, which makes investigating the behavior of a Chinese NER model more complicated and challenging. In this paper, we present CNERVis, a visual analysis tool that allows users to interactively inspect the WS-POS-NER pipeline and understand how and why an NER prediction is made. CNERVis also allows users to load large amounts of test data and explore critical instances, facilitating the analysis of large datasets. The tool’s usability and effectiveness are demonstrated through case studies.
AB - Named entity recognition (NER) is a crucial initial task that identifies both the spans and the types of named entities in order to extract specific information such as organizations, persons, locations, and times. Deep learning approaches that capture contextual features now achieve state-of-the-art performance on the NER task. However, the complex structure of deep learning models makes them black boxes and limits researchers’ ability to improve them. Unlike languages written in the Latin alphabet, Chinese (like other languages such as Korean and Japanese) does not have explicit word boundaries. Therefore, preliminary steps such as word segmentation (WS) and part-of-speech (POS) tagging are needed before the Chinese NER task. The correctness of these preliminary steps strongly influences the final NER prediction, which makes investigating the behavior of a Chinese NER model more complicated and challenging. In this paper, we present CNERVis, a visual analysis tool that allows users to interactively inspect the WS-POS-NER pipeline and understand how and why an NER prediction is made. CNERVis also allows users to load large amounts of test data and explore critical instances, facilitating the analysis of large datasets. The tool’s usability and effectiveness are demonstrated through case studies.
KW - BiLSTM
KW - Chinese named entity recognition
KW - natural language processing
KW - sequence labeling
KW - visual analytics
UR - http://www.scopus.com/inward/record.url?scp=85119247774&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119247774&partnerID=8YFLogxK
U2 - 10.1007/s12650-021-00799-3
DO - 10.1007/s12650-021-00799-3
M3 - Article
AN - SCOPUS:85119247774
SN - 1343-8875
VL - 25
SP - 653
EP - 669
JO - Journal of Visualization
JF - Journal of Visualization
IS - 3
ER -