Method for identifying and resolving erroneous characters output by an optical character recognition system
US5418864A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Jul 11, 1994 |
| Grant date | May 23, 1995 |
| Priority date | — |
| Expiry date | Jul 11, 2014 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F18/254
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A post-processing method for optical character recognition (OCR) that combines different OCR engines to identify and resolve characters, and attributes of those characters, that the engines recognize erroneously. The characters can originate from many different types of character environments. The OCR engine outputs are synchronized using synchronization heuristics in order to detect matches and mismatches between them. The mismatches are resolved using resolution heuristics and neural networks. The resolution heuristics and neural networks are based on observing many different conventional OCR engines in different character environments to find which engine correctly identifies a certain character having particular attributes. The results are encoded into the resolution heuristics and neural networks to create an optimal OCR post-processing solution.
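The two stages named in the abstract, synchronizing engine outputs and then resolving mismatches, can be illustrated with a minimal Python sketch. It is not the patented method. The alignment uses the standard library's `difflib.SequenceMatcher` as a stand-in for the synchronization heuristics, and a single hand-written rule (the hypothetical `TRUSTED_ENGINE` table) stands in for the resolution heuristics. The neural-network component is omitted entirely. All function names, the two-engine setup, and the sample strings are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical rule table: which engine to trust in which context.
# The choices are illustrative only and are not taken from the patent.
TRUSTED_ENGINE = {"numeric": "b", "text": "a"}


def synchronize(out_a, out_b):
    """Align two engine outputs into (is_match, span_a, span_b) segments."""
    matcher = SequenceMatcher(None, out_a, out_b, autojunk=False)
    segments = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        # "equal" opcodes are matches; replace/insert/delete are mismatches.
        segments.append((tag == "equal", out_a[a0:a1], out_b[b0:b1]))
    return segments


def context_of(prefix):
    """Toy attribute: is the mismatch preceded by a digit?"""
    return "numeric" if prefix and prefix[-1].isdigit() else "text"


def merge(out_a, out_b):
    """Synchronize two outputs, then resolve each mismatch with the toy rule."""
    merged = ""
    for is_match, span_a, span_b in synchronize(out_a, out_b):
        if is_match:
            merged += span_a
        else:
            engine = TRUSTED_ENGINE[context_of(merged)]
            merged += span_a if engine == "a" else span_b
    return merged


if __name__ == "__main__":
    # Engine A misreads a zero as "O"; engine B misreads "I" as "l".
    print(merge("Invoice 1O4", "lnvoice 104"))  # prints "Invoice 104"
```

In this toy example each engine makes one error, and the merged result recovers the intended string because the rule trusts engine B after a digit and engine A elsewhere. A system along the lines of the abstract would derive such rules from observed engine behavior across many character environments rather than hard-coding them.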
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.