Document image processing apparatus
US8160402B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 10, 2008 |
| Grant date | Apr 17, 2012 |
| Priority date | — |
| Expiry date | Feb 17, 2031 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06V30/287
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
An image of a character string composed of M pieces of characters is clipped from a document image, and the image is divided character by character, and image features of each character image are extracted. On the basis of the image features, N (N>1, integer) pieces of character images in descending order of degree of similarity are selected as candidate characters from a character image feature dictionary which stores the image features of character image in units of character, and the first index matrix of M×N cells is prepared. A candidate character string composed of a plurality of candidate characters constituting the first column of the first index matrix, is subjected to a lexical analysis according to a predetermined language model, whereby a second index matrix adjusted into a character string which makes sense is prepared to be utilized for searching.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.