Document fingerprints using block encoding of text
US8838657B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 7, 2012 |
| Grant date | Sep 16, 2014 |
| Priority date | — |
| Expiry date | Mar 12, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/93
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods and apparatus for document encoding using block encoding of text are disclosed. A computing device is configured to detect, within a digitized image object, a plurality of element groups, where each group comprises one or more text image elements and is separated from other groups by at least one delimiter. The device generates a numerical representation of the groups, comprising a plurality of numerical values, where a particular value corresponding to a particular group is determined based at least in part on a combined size of text image elements of the particular group. The device stores at least a subset of the numerical representation as a fingerprint representing text contents of the digitized image object.
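The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how text image elements are detected, which size measure is combined, or which subset of values is stored, so the sketch makes assumptions: each element is a glyph bounding box on a single text line (the hypothetical `Element` type), a delimiter is a horizontal gap of at least `min_gap` pixels, the combined size is the sum of element widths, and the stored subset is the first `keep` values.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical text image element: one glyph box on a text line."""
    x: int      # left edge, in pixels
    width: int  # horizontal extent, in pixels


def group_elements(elements: Sequence[Element], min_gap: int) -> List[List[Element]]:
    """Split elements into groups wherever the gap between boxes reaches min_gap.

    Assumption: a wide horizontal gap plays the role of the delimiter
    (for example, inter-word whitespace).
    """
    groups: List[List[Element]] = []
    for el in sorted(elements, key=lambda e: e.x):
        if groups:
            prev = groups[-1][-1]
            gap = el.x - (prev.x + prev.width)
            if gap < min_gap:
                # Still inside the current group.
                groups[-1].append(el)
                continue
        # First element, or a delimiter-sized gap: start a new group.
        groups.append([el])
    return groups


def fingerprint(
    elements: Sequence[Element],
    min_gap: int,
    keep: Optional[int] = None,
) -> Tuple[int, ...]:
    """Return one value per group: the combined width of its elements.

    Assumption: 'keep' selects the leading values as the stored subset;
    None keeps the full numerical representation.
    """
    values = [sum(e.width for e in g) for g in group_elements(elements, min_gap)]
    return tuple(values if keep is None else values[:keep])
```

Because each value depends only on measured element sizes and not on recognized characters, a fingerprint of this kind could in principle be computed without running OCR, and two scans of the same text would be compared by matching their value sequences. How tolerant that comparison should be is outside what the abstract states.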
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.