Document image decoding using an integrated stochastic language model
US6678415B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | May 12, 2000 |
| Grant date | Jan 13, 2004 |
| Priority date | — |
| Expiry date | May 12, 2020 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A text recognition system represents the decoded message of a document image as a path through an image network. A method for integrating a language model into the network selectively expands the network to accommodate the language model only for certain ones of the paths in the network, effectively managing the memory storage requirements and computational complexities of integrating the language model efficiently into the network. The language model generates probability distributions indicating the probability of a certain character occurring in a string, given one or more previous characters in the string. Selectively expanding the image network is achieved by initially using upper bounds on the language model probabilities on the branches of an unexpanded image network. A best path search operation is then performed to determine an estimated best path through the image network using these upper bound scores. After decoding, only the nodes on the estimated best path are expanded with new nodes and with branches incoming to the new nodes that accommodate new language model scores reflecting actual character histories in place of the upper bound scores. Decoding and selectively e…
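The decode-then-expand loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a toy setting that the abstract does not specify: a fixed-length lattice with one character per position, a bigram (one-character history) language model, and log-domain scores. The names `decode`, `best_path`, `upper_bounds`, `START`, and the example tables `image` and `lm` are all hypothetical. A node with history `None` is unexpanded and scores its outgoing branches with the upper bound. A node with a known history uses the actual language model score.

```python
import math

START = "^"  # hypothetical start-of-string symbol (assumption, not from the patent)

def upper_bounds(lm, alphabet):
    """For each character, the largest log-probability over all histories."""
    return {c: max(lm[h][c] for h in lm) for c in alphabet}

def best_path(image, lm, ub, expanded):
    """Viterbi search over the partially expanded network.

    A node is (position, history). History None marks an unexpanded node,
    whose outgoing branches carry upper-bound language model scores.
    """
    T = len(image)
    scores = {(0, START): 0.0}
    back = {}
    for t in range(T):
        layer = [n for n in scores if n[0] == t]
        for node in layer:
            h = node[1]
            for c, img in image[t].items():
                lm_score = ub[c] if h is None else lm[h][c]
                # Route to the history-specific node only if it has been created.
                dest = (t + 1, c) if (t + 1, c) in expanded else (t + 1, None)
                s = scores[node] + img + lm_score
                if dest not in scores or s > scores[dest]:
                    scores[dest] = s
                    back[dest] = (node, c)
    end = max((n for n in scores if n[0] == T), key=scores.get)
    path, node = [], end
    while node in back:
        prev, c = back[node]
        path.append((prev, c))
        node = prev
    path.reverse()
    return path, scores[end]

def decode(image, lm):
    """Alternate best-path search and selective expansion until the best
    path is scored entirely with actual language model probabilities."""
    alphabet = sorted({c for col in image for c in col})
    ub = upper_bounds(lm, alphabet)
    expanded = set()
    while True:
        path, score = best_path(image, lm, ub, expanded)
        # Unexpanded nodes on the best path, paired with their actual history.
        pending = [(t, path[t - 1][1]) for t in range(1, len(path))
                   if path[t][0][1] is None]
        if not pending:
            return "".join(c for _, c in path), score
        expanded.update(pending)

def logs(d):
    return {k: math.log(v) for k, v in d.items()}

# Toy per-position image match scores and bigram model (made-up numbers).
image = [logs({"a": 0.6, "b": 0.4}),
         logs({"a": 0.55, "b": 0.45}),
         logs({"a": 0.4, "b": 0.6})]
lm = {START: logs({"a": 0.6, "b": 0.4}),
      "a": logs({"a": 0.1, "b": 0.9}),
      "b": logs({"a": 0.7, "b": 0.3})}

text, score = decode(image, lm)
print(text, score)
```

Because every upper-bound score is at least as large as the actual score it stands in for, a best path that contains no unexpanded nodes cannot be beaten by any other path, so the loop can stop there. In this toy run only two history-specific nodes are created before the search settles, rather than one node per position and history.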
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.