Deduplication using nearest neighbor cluster
US11029871B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 15, 2019 |
| Grant date | Jun 8, 2021 |
| Priority date | — |
| Expiry date | May 15, 2039 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L9/0894
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Disclosed are techniques for data deduplication, including methods, systems, or computer products for reducing data redundancy in a data storage system. Before a data block is written, a cluster of nearest neighbors, created using a locality sensitive hashing algorithm, is searched to determine whether the data block has already been stored in the data storage system. In alternate embodiments, the nearest neighbor clusters could be created using one or more of the following algorithms: a k-means clustering algorithm, a k-medoids clustering algorithm, a mean shift algorithm, a generalized method of moment (GMM) algorithm, or a density-based spatial clustering of applications with noise (DBSCAN) algorithm.
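The check-before-write idea in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not say how blocks are turned into features or which locality sensitive hash family is used, so the byte-frequency histogram, the random-hyperplane signature, and the names `LshDedupStore`, `num_bits`, and `write` are all assumptions made for illustration. Each signature bucket stands in for one nearest-neighbor cluster, and only that bucket is searched before a block is stored.

```python
import random
from collections import defaultdict


class LshDedupStore:
    """Toy block store that searches one LSH bucket before each write.

    Hypothetical illustration only; feature choice and hash family are assumptions.
    """

    def __init__(self, num_bits=16, seed=0):
        rng = random.Random(seed)
        # One random hyperplane per signature bit, over a 256-bin byte histogram.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(num_bits)
        ]
        self.buckets = defaultdict(list)  # signature -> ids of blocks in that cluster
        self.blocks = []                  # block id -> stored bytes

    def _signature(self, block: bytes) -> int:
        # Feature vector (assumed): how often each byte value occurs in the block.
        hist = [0] * 256
        for value in block:
            hist[value] += 1
        # Random-hyperplane LSH: one bit per plane, set by the sign of the dot product.
        sig = 0
        for plane in self.planes:
            dot = sum(w * c for w, c in zip(plane, hist))
            sig = (sig << 1) | (1 if dot >= 0 else 0)
        return sig

    def write(self, block: bytes):
        """Return (block_id, was_duplicate); store the block only if it is new."""
        sig = self._signature(block)
        # Search only the cluster the block hashes into, not the whole store.
        for block_id in self.buckets[sig]:
            if self.blocks[block_id] == block:
                return block_id, True
        self.blocks.append(block)
        block_id = len(self.blocks) - 1
        self.buckets[sig].append(block_id)
        return block_id, False


store = LshDedupStore()
first = bytes([1, 2, 3]) * 100
other = bytes([9, 8, 7]) * 100
print(store.write(first))  # (0, False): new block, stored
print(store.write(other))  # (1, False): different content, stored
print(store.write(first))  # (0, True): found in its cluster, not stored again
```

Identical blocks always yield the same histogram and therefore the same signature, so an exact duplicate is always found in its own bucket, while the final byte comparison guards against distinct blocks that happen to share a signature. A production system would likely compare digests rather than raw bytes and might probe neighboring buckets as well; those choices are outside what the abstract states.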
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.