Text script and orientation recognition
US8744171B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Mar 24, 2010 |
| Grant date | Jun 3, 2014 |
| Priority date | — |
| Expiry date | Jun 27, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F18/2431
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A page layout module receives a page image displaying text in an unknown script and unknown orientation, determines a text section in the received image, and transmits the text section to an orientation and script module. The orientation and script module comprises a training module, a classifier, and a recognition module. The training module trains the classifier to identify connected components, each of which includes a connected portion of one or more characters of text. The recognition module uses the trained classifier to identify a set of connected components in the received text section. The recognition module determines the likely orientation and script for each connected component and then uses that information to determine the orientation and script for the text section as a whole. The determined orientation and script for the text section are transmitted to the OCR module, which uses them to recognize the text in the text section.
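The flow described in the abstract (find connected components, guess an orientation and script for each, then combine the guesses for the whole text section) can be illustrated with a minimal Python sketch. This is not the patented method: the function names (`connected_components`, `aggregate_votes`, `recognize_section`), the 8-connectivity rule, the `classify` callable standing in for the trained classifier, and the confidence-weighted vote are all assumptions made for illustration, since the abstract does not say how components are extracted or how per-component results are combined.

```python
from collections import Counter, deque


def connected_components(image):
    """Return a list of components, each a list of (row, col) foreground pixels.

    `image` is a list of equal-length rows holding 0 (background) or 1 (ink).
    8-connectivity is an assumption of this sketch.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def aggregate_votes(component_guesses):
    """Combine per-component (orientation, script, confidence) guesses.

    Confidence-weighted voting is an assumption; returns the winning
    (orientation, script) pair, or None if there are no guesses.
    """
    totals = Counter()
    for orientation, script, confidence in component_guesses:
        totals[(orientation, script)] += confidence
    if not totals:
        return None
    return max(totals, key=totals.get)


def recognize_section(image, classify):
    """Classify every component with `classify` (a hypothetical stand-in for
    the trained classifier) and return the section-level decision."""
    guesses = [classify(pixels) for pixels in connected_components(image)]
    return aggregate_votes(guesses)
```

In use, `classify` would be whatever trained model maps one component's pixels to an `(orientation, script, confidence)` triple; the section-level result would then be handed to the OCR step. Summing confidences rather than counting raw votes lets a few clearly recognized components outweigh many ambiguous ones, but other combination rules would fit the abstract equally well.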
Source: USPTO / EPO open patent data (objective bibliographic data and citation counts).