Patent · US · Expired

Method for matching text images and documents using character shape codes

US5438628A · kind A · utility

Cited by: 26
References: 1
Claims: 6
Family size: 0

Assignees

Inventors

Key dates

Filing date: Mar 31, 1994
Grant date: Aug 1, 1995
Priority date
Expiry date: Mar 31, 2014

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/242
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A first method for exact and inexact matching of documents stored in a document database includes the step of converting the documents in the database to a compact tokenized form. A search string or search document is then converted to the same compact tokenized form and compared to determine whether the search string occurs in the documents of the database or whether the documents in the database correspond to the search document. A second method for inexact matching of a test document to the documents in the database includes generating sets of one or more floating point values for each document in the database and for the test document. The sets of floating point values for the database are then compared to the set for the test document to determine a degree of matching. A threshold value is established, and each document in the database which generates a matching value closer to the test document than the threshold is considered to be an inexact match of the test document.
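
The two methods in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, because the abstract does not define it: the shape classes (ascender, x-height, descender, dotted), the choice of relative code frequencies as the floating point values, the Euclidean distance, and the hypothetical function names `shape_code`, `find_exact`, `features`, and `find_inexact`. The sketch also works on plain character strings, whereas the patent derives shape codes from text images.

```python
import math

# Hypothetical shape classes (an assumption; the abstract does not list them).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")


def shape_code(text):
    """Convert text to a compact tokenized form: one shape code per character."""
    codes = []
    for ch in text:
        if ch.isspace():
            codes.append(" ")   # keep word boundaries
        elif ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            codes.append("A")   # tall characters
        elif ch in DESCENDERS:
            codes.append("g")   # characters that drop below the baseline
        elif ch in "ij":
            codes.append("i")   # dotted characters
        elif ch.isalpha():
            codes.append("x")   # x-height characters
        else:
            codes.append(".")   # punctuation and anything else
    return "".join(codes)


def find_exact(query, database):
    """First method: names of documents whose tokens contain the tokenized query."""
    q = shape_code(query)
    return [name for name, tokens in database.items() if q in tokens]


def features(tokens):
    """One set of floating point values per document: relative code frequencies."""
    alphabet = "Axgi"
    total = sum(tokens.count(c) for c in alphabet) or 1
    return [tokens.count(c) / total for c in alphabet]


def find_inexact(test_doc, database, threshold):
    """Second method: documents whose feature distance is below the threshold."""
    test_vec = features(shape_code(test_doc))
    return [
        name
        for name, tokens in database.items()
        if math.dist(test_vec, features(tokens)) < threshold
    ]


# The database stores only the tokenized form of each document.
documents = {"fox": "The quick brown fox", "hello": "Hello world"}
database = {name: shape_code(text) for name, text in documents.items()}

print(find_exact("quick", database))                        # ['fox']
print(find_inexact("The quick brown fox", database, 0.05))  # ['fox']
```

Because many different words share a shape code sequence (for example, "quick" becomes `gxixA`), a hit from `find_exact` in this sketch means the shapes agree, not necessarily the characters. Raising the threshold in `find_inexact` admits more distant documents as inexact matches.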

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.