Building a confused character set for Chinese spell checking

Lung Hao Lee, Wun Syuan Wu, Jian Hong Li, Yu Chi Lin, Yuen Hsien Tseng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

In this paper, we describe the construction details of a confused character set for Chinese spell checking. The SIGHAN 2013-2015 bakeoff datasets are adopted to measure the performance of correct character suggestions. Our confusion set significantly outperforms the existing confusion set in candidate selection for automatic spelling checkers.

Original language: English
Title of host publication: ICCE 2019 - 27th International Conference on Computers in Education, Proceedings
Editors: Maiga Chang, Hyo-Jeong So, Lung-Hsiang Wong, Fu-Yun Yu, Ju-Ling Shih, Ivica Boticki, Ming-Puu Chen, Ali Dewan, Stian Haklev, Elizabeth Koh, Tomoko Kojiri, Kuo-Chen Li, Daner Sun, Yun Wen
Publisher: Asia-Pacific Society for Computers in Education
Pages: 703-705
Number of pages: 3
ISBN (Electronic): 9789869721431
Publication status: Published - 2019 Nov 19
Event: 27th International Conference on Computers in Education, ICCE 2019 - Kenting, Taiwan
Duration: 2019 Dec 2 to 2019 Dec 6

Publication series

Name: ICCE 2019 - 27th International Conference on Computers in Education, Proceedings
Volume: 1

Conference

Conference: 27th International Conference on Computers in Education, ICCE 2019
Country/Territory: Taiwan
City: Kenting
Period: 2019/12/02 to 2019/12/06

Keywords

  • Chinese spell checking
  • Confusion set
  • Pronunciation similarity
  • Shape similarity

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science Applications
  • Information Systems
  • Hardware and Architecture
  • Education
