A Semi-Supervised Learning Approach for Traditional Chinese Scene Text Detection

Chia Fu Yeh, Mei Chen Yeh

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

With the advancement of multimedia technology, information in the surrounding environment has become increasingly accessible. In particular, automatic scene text detection is essential for subsequent text recognition, understanding, and analysis. However, most existing methods are designed primarily for English, while methods for other languages are scarce. In this paper, we present a Traditional Chinese scene text detector, built upon a robust object detector trained with both labeled and unlabeled data via semi-supervised learning. Moreover, we expand the limited labeled data through data synthesis and a data augmentation method. We demonstrate the effectiveness of the proposed method through extensive experiments, and examine the design choices involved in developing a practical system that can instantly and accurately detect Traditional Chinese text in complex scenes.

Original language: English
Title of host publication: 2022 IEEE 24th International Workshop on Multimedia Signal Processing, MMSP 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665471893
Publication status: Published - 2022
Event: 24th IEEE International Workshop on Multimedia Signal Processing, MMSP 2022 - Shanghai, China
Duration: 2022 Sept 26 to 2022 Sept 28

Publication series

Name: 2022 IEEE 24th International Workshop on Multimedia Signal Processing, MMSP 2022

Conference

Conference: 24th IEEE International Workshop on Multimedia Signal Processing, MMSP 2022
Country/Territory: China
City: Shanghai
Period: 2022/09/26 to 2022/09/28

Keywords

  • Deep learning
  • object detection
  • scene text detection
  • semi-supervised learning

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Media Technology
