Systems and methods for efficient data searching, storage and reduction
US10649854B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 1, 2016 |
| Grant date | May 12, 2020 |
| Priority date | — |
| Expiry date | Dec 4, 2038 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99953
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A computer-implemented method, according to one embodiment includes, for each repository data chunk in repository data that comprises a plurality of the repository data chunks, generating a corresponding set of repository distinguishing characteristics (RDCs). Each set of RDCs is generated by: applying a hash function to the respective input data chunk or repository data chunk to generate a plurality of hashes, each hash comprising a hash value and a hash position within the data chunk, applying a first function to the plurality of generated hashes to identify a first subset of hashes distributed across the data chunk, applying a second function to the hash positions of the hashes of the first subset to identify a second subset of the plurality of generated hashes, and defining the second subset of hashes as the set of RDCs.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.