Method and system for preparing text images for optical-character recognition
US10430948B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 16, 2016 |
| Grant date | Oct 1, 2019 |
| Priority date | — |
| Expiry date | Aug 16, 2036 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The current document is directed to methods and systems that straighten the text lines of text-containing digital images. Initial processing of a text-containing image identifies the outline of a text-containing page. Next, aggregations of symbols, including words and word fragments, are identified within the outlined page image. The centroids and inclination angles of the symbol aggregations are determined, allowing each symbol aggregation to be circumscribed by a closest-fitting rectangle oriented in conformance with the inclination angle determined for the circumscribed symbol aggregation. A model is constructed for the text-line curvature within the text image based on the circumscribed symbol aggregations and is refined using additional information extracted from the text image. The model, essentially an inclination-angle map, allows for assigning local displacements to pixels within the page image, which are then used to straighten the text lines in the text image.
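The centroid, inclination-angle, and circumscribing-rectangle step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how these quantities are computed, so the approach below (second-order central moments of the pixel coordinates) is an assumption chosen for illustration, not the patented method. The function names `centroid_and_angle` and `oriented_rectangle` are hypothetical, and each symbol aggregation is assumed to arrive as a non-empty list of `(x, y)` pixel coordinates.

```python
import math


def centroid_and_angle(points):
    """Return ((cx, cy), theta) for a non-empty list of (x, y) pixel coordinates.

    theta is the orientation of the principal axis in radians, estimated
    from second-order central moments (an assumed technique, see lead-in).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return (cx, cy), theta


def oriented_rectangle(points):
    """Return (corners, theta): the tightest rectangle aligned with theta.

    The points are projected onto the principal axis and its normal; the
    extreme projections give the rectangle extents in that rotated frame.
    """
    (cx, cy), theta = centroid_and_angle(points)
    c, s = math.cos(theta), math.sin(theta)
    along = [(x - cx) * c + (y - cy) * s for x, y in points]
    across = [-(x - cx) * s + (y - cy) * c for x, y in points]
    extents = (
        (min(along), min(across)),
        (max(along), min(across)),
        (max(along), max(across)),
        (min(along), max(across)),
    )
    # Rotate each corner back from the axis-aligned frame to image coordinates.
    corners = [(cx + a * c - b * s, cy + a * s + b * c) for a, b in extents]
    return corners, theta
```

For pixels lying on a line of slope 1, such as `[(i, i) for i in range(10)]`, `centroid_and_angle` reports a centroid of `(4.5, 4.5)` and an angle of π/4. A real pipeline would likely feed the resulting per-aggregation angles into the inclination-angle map that the abstract mentions; that step is not sketched here.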
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.