Error correction in a Chinese OCR test collection

Research output: Contribution to journal (Article)

Abstract

This article proposes a technique for correcting Chinese OCR errors to support retrieval of scanned documents. The technique is completely automatic (it uses no manually constructed lexicons or confusion resources) and identifies both keywords and confusable terms. Improved retrieval effectiveness is demonstrated in a single-term query experiment.

Original language: English
Pages (from-to): 429-430
Number of pages: 2
Journal: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
Publication status: Published - 2002
Externally published: Yes

Fingerprint

Optical character recognition
Error correction
Experiments
Test collections

Keywords

  • Chinese
  • Confusing pair
  • Error correction
  • Term clustering

ASJC Scopus subject areas

  • Management Information Systems
  • Hardware and Architecture

Cite this

@article{5b8f6d422e494c4fa5d3867f29cdd9ae,
title = "Error correction in a Chinese OCR test collection",
abstract = "This article proposes a technique for correcting Chinese OCR errors to support retrieval of scanned documents. The technique uses a completely automatic technique (no manually constructed lexicons or confusion resources) to identify both keywords and confusable terms. Improved retrieval effectiveness on a single term query experiment is demonstrated.",
keywords = "Chinese, Confusing pair, Error correction, Term clustering",
author = "Tseng, {Yuen Hsien}",
year = "2002",
language = "English",
pages = "429--430",
journal = "SIGIR Forum (ACM Special Interest Group on Information Retrieval)",
issn = "0163-5840",
publisher = "Association for Computing Machinery (ACM)",

}

TY  - JOUR
T1  - Error correction in a Chinese OCR test collection
AU  - Tseng, Yuen Hsien
PY  - 2002
Y1  - 2002
N2  - This article proposes a technique for correcting Chinese OCR errors to support retrieval of scanned documents. The technique uses a completely automatic technique (no manually constructed lexicons or confusion resources) to identify both keywords and confusable terms. Improved retrieval effectiveness on a single term query experiment is demonstrated.
AB  - This article proposes a technique for correcting Chinese OCR errors to support retrieval of scanned documents. The technique uses a completely automatic technique (no manually constructed lexicons or confusion resources) to identify both keywords and confusable terms. Improved retrieval effectiveness on a single term query experiment is demonstrated.
KW  - Chinese
KW  - Confusing pair
KW  - Error correction
KW  - Term clustering
UR  - http://www.scopus.com/inward/record.url?scp=0036992592&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0036992592&partnerID=8YFLogxK
M3  - Article
AN  - SCOPUS:0036992592
SP  - 429
EP  - 430
JO  - SIGIR Forum (ACM Special Interest Group on Information Retrieval)
JF  - SIGIR Forum (ACM Special Interest Group on Information Retrieval)
SN  - 0163-5840
ER  -