Method for matching text images and documents using character shape codes
US5438628A · kind A · utility
Assignees
Inventors
Key dates
| Filing date | Mar 31, 1994 |
| Grant date | Aug 1, 1995 |
| Priority date | — |
| Expiry date | Mar 31, 2014 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/242
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A first method for exact and inexact matching of documents stored in a document database includes the step of converting the documents in the database to a compacted tokenized form. A search string or search document is then converted to the compact tokenized form and compared to determine if the test string occurs in the documents of the database or whether the documents in the database correspond to the test document. A second method for inexact matching of a test document to the documents in the database includes generating sets of one or more floating point values for each document in the database and for the test document. The sets of floating point numbers for the database are then compared to the set for the test document to determine a degree of matching. A threshold value is established and each document in the database which generates a matching value closer to the test document than the threshold is considered to be an inexact match of the test document.
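The two methods in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the shape classes (`A` for ascenders, capitals and digits, `x` for x-height letters, `g` for descenders, `i` and `j` kept as themselves), the function names, the choice of class frequencies as the floating point values, and the Euclidean distance used as the matching value. It also works on plain text, whereas the patent derives shape codes from text images.

```python
import math
from collections import Counter

# Hypothetical shape classes; the patent's actual code set may differ.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")
CLASSES = "Axgij"


def shape_code(text: str) -> str:
    """Convert text to a compact tokenized form, one shape code per character."""
    out = []
    for ch in text:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            out.append("A")
        elif ch in DESCENDERS:
            out.append("g")
        elif ch in ("i", "j"):
            out.append(ch)
        elif ch.islower():
            out.append("x")
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)


def exact_match(query: str, documents: dict[str, str]) -> list[str]:
    """First method: find documents whose tokenized form contains the tokenized query."""
    token = shape_code(query)
    return [name for name, body in documents.items() if token in shape_code(body)]


def feature_vector(text: str) -> list[float]:
    """Second method, step 1: a set of floating point values per document.

    Here the values are the relative frequencies of each shape class
    (an assumed feature, chosen only for illustration).
    """
    codes = [c for c in shape_code(text) if c in CLASSES]
    total = len(codes) or 1  # avoid division by zero on empty input
    counts = Counter(codes)
    return [counts[c] / total for c in CLASSES]


def inexact_matches(test_doc: str, documents: dict[str, str], threshold: float) -> list[str]:
    """Second method, step 2: keep documents whose distance to the test document is below the threshold."""
    target = feature_vector(test_doc)
    return [
        name
        for name, body in documents.items()
        if math.dist(target, feature_vector(body)) < threshold
    ]


docs = {"a": "The quick brown fox", "b": "lazy dog sleeps"}
print(shape_code("The dog"))
print(exact_match("quick", docs))
print(inexact_matches("The quick brown fox", docs, 0.1))
```

Because many characters share one shape code, different words can collapse to the same token (in this sketch, "cat" and "sat" both become `xxA`), so a match in tokenized form is a candidate match rather than proof that the original strings are identical.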
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.