Document fingerprinting with asymmetric selection of anchor points
US8359472B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 25, 2010 |
| Grant date | Jan 22, 2013 |
| Priority date | — |
| Expiry date | Apr 3, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/9014
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
One embodiment relates to a computer-implemented process for generating document fingerprints. A document is normalized to create a normalized text string. A first hash function with a sliding hash window is applied to the normalized text string to generate an array of hash values. Candidate anchoring points are selected by applying a first filter to the array of hash values. The anchoring points are chosen by applying a second filter to the candidate anchoring points. Finally, a second hash function is applied to substrings located at the chosen anchoring points to generate hash values for use as fingerprints for the document. Other embodiments and aspects are also disclosed.
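The five steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not specify the normalization rules, the hash functions, or either filter. The polynomial rolling hash, the modulo-based first filter, the minimum-gap second filter, the use of SHA-1 as the second hash, and all function names and parameters (`window`, `divisor`, `min_gap`, `span`) are hypothetical choices made only for illustration.

```python
import hashlib
import re


def normalize(text):
    # Assumed normalization: lowercase, keep only ASCII letters and digits.
    return re.sub(r"[^a-z0-9]", "", text.lower())


def window_hashes(s, window, base=257, mod=(1 << 61) - 1):
    # First hash function (assumed): polynomial rolling hash computed for
    # every length-`window` substring of s, one hash per start position.
    if len(s) < window:
        return []
    high = pow(base, window - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in s[:window]:
        h = (h * base + ord(ch)) % mod
    out = [h]
    for i in range(window, len(s)):
        h = ((h - ord(s[i - window]) * high) * base + ord(s[i])) % mod
        out.append(h)
    return out


def first_filter(hashes, divisor):
    # Assumed first filter: a position is a candidate anchor when its
    # window hash is divisible by `divisor`.
    return [i for i, h in enumerate(hashes) if h % divisor == 0]


def second_filter(candidates, min_gap):
    # Assumed second filter: scan left to right and keep a candidate only
    # if it lies at least `min_gap` positions after the last kept anchor.
    chosen = []
    for pos in candidates:
        if not chosen or pos - chosen[-1] >= min_gap:
            chosen.append(pos)
    return chosen


def fingerprints(text, window=8, divisor=4, min_gap=16, span=32):
    s = normalize(text)
    hashes = window_hashes(s, window)
    anchors = second_filter(first_filter(hashes, divisor), min_gap)
    # Second hash function (assumed SHA-1) over the substring at each anchor;
    # anchors too close to the end to yield a full substring are skipped.
    return [
        hashlib.sha1(s[a:a + span].encode("utf-8")).hexdigest()
        for a in anchors
        if a + span <= len(s)
    ]
```

Because anchors are selected from the content of the normalized text rather than from fixed offsets, two documents that share a long passage will tend to select the same anchors inside that passage and therefore produce some identical fingerprints, even if the surrounding text differs.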
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.