Patent · US Active

Detection of diacritics in OCR systems with assignment to the correct text line

US8977057B1 · kind B1 · utility

Cited by: 9
References: 3
Claims: 21
Family size: 0

Assignee

Inventor

Key dates

Filing date: Nov 9, 2012
Grant date: Mar 10, 2015
Priority date
Expiry date: Feb 6, 2033

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/293
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method of assigning diacritics in an electronic image using optical character recognition (OCR) are disclosed. In one example, the method comprises analyzing, by a computer system, the electronic image to generate a plurality of bounding blocks associated with text lines within the electronic image. The method further comprises establishing a plurality of bounding boxes for diacritics and base text within the electronic image. The method also comprises determining a distance from a diacritic to a nearest base text character and to a nearest text line. The method also comprises evaluating the base box distance and the nearest text line distance to assign the diacritic to a correct text line in the electronic image.
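
The distance evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the two distances are weighed, so the decision rule below (trust the nearest base character when it lies within a threshold, otherwise fall back to the nearest text line) is an assumption. The names `Box`, `box_distance`, `assign_diacritic`, and `max_base_gap` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in image coordinates (y grows downward)."""
    x0: float
    y0: float
    x1: float
    y1: float


def box_distance(a: Box, b: Box) -> float:
    """Euclidean gap between two boxes; 0.0 if they touch or overlap."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return hypot(dx, dy)


def assign_diacritic(diacritic, line_blocks, base_boxes, max_base_gap=10.0):
    """Return the id of the text line the diacritic is assigned to.

    line_blocks: dict mapping line id -> Box enclosing that text line.
    base_boxes:  list of (line id, Box) pairs, one per base character.
    max_base_gap: hypothetical threshold, not taken from the patent.
    """
    # Nearest text line block and its distance.
    line_id, line_dist = min(
        ((lid, box_distance(diacritic, blk)) for lid, blk in line_blocks.items()),
        key=lambda pair: pair[1],
    )
    # Nearest base character, the line it belongs to, and its distance.
    base_line, base_dist = min(
        ((lid, box_distance(diacritic, box)) for lid, box in base_boxes),
        key=lambda pair: pair[1],
    )
    if base_line == line_id:
        return line_id
    # Assumed tie-break: a close base character outweighs a closer line block.
    return base_line if base_dist <= max_base_gap else line_id


# An accent sitting between two lines, directly above a character of line 2.
lines = {1: Box(0, 0, 100, 20), 2: Box(0, 30, 100, 50)}
bases = [(1, Box(40, 5, 50, 20)), (2, Box(10, 30, 18, 50))]
print(assign_diacritic(Box(12, 22, 16, 25), lines, bases))  # prints 2
```

In the usage example the accent is vertically closer to line 1's block than to line 2's, but the nearest base character belongs to line 2, so the sketch assigns it there. This is the kind of case where line distance alone would pick the wrong line.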

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.