Patent · US Active

Deduplication using nearest neighbor cluster

US11029871B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 15, 2019
Grant date: Jun 8, 2021
Priority date:
Expiry date: May 15, 2039

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04L9/0894
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Disclosed are techniques for data deduplication, which include methods, systems, or computer products for reducing data redundancy in a data storage system comprising searching a cluster of nearest neighbors, wherein the cluster has been created using a locality sensitive hashing algorithm, to determine if a data block has been stored in the data storage system prior to writing the data block. In alternate embodiments, the nearest neighbor clusters could be created using one or more of the following algorithms: a k-means clustering algorithm, a k-medoids clustering algorithm, a mean shift algorithm, a generalized method of moment (GMM) algorithm, or a density based spatial clustering of applications with noise (DBSCAN) algorithm.
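
The write path described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which locality sensitive hashing scheme, block features, or match test are used. The sketch assumes random-hyperplane (sign-based) LSH over a byte-histogram feature vector and a SHA-256 digest comparison inside the matching cluster. The names `LshDedupStore`, `byte_histogram`, `NUM_PLANES`, and `NUM_FEATURES` are hypothetical.

```python
import hashlib
import random
from collections import defaultdict

NUM_PLANES = 16      # assumed signature width in bits (not from the patent)
NUM_FEATURES = 256   # one feature per possible byte value


def byte_histogram(block: bytes) -> list:
    """Hypothetical feature vector: how often each byte value occurs."""
    counts = [0] * NUM_FEATURES
    for b in block:
        counts[b] += 1
    return counts


class LshDedupStore:
    """Toy block store that groups blocks into clusters by LSH signature."""

    def __init__(self, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Random hyperplanes: similar feature vectors tend to fall on the
        # same side of each plane and therefore share a signature.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(NUM_FEATURES)]
            for _ in range(NUM_PLANES)
        ]
        # LSH signature -> {SHA-256 digest: block bytes}
        self.clusters = defaultdict(dict)

    def signature(self, block: bytes) -> int:
        """Pack the sign of each hyperplane projection into an integer."""
        features = byte_histogram(block)
        sig = 0
        for plane in self.planes:
            dot = sum(p * f for p, f in zip(plane, features))
            sig = (sig << 1) | (1 if dot >= 0 else 0)
        return sig

    def write(self, block: bytes) -> bool:
        """Search the block's cluster before writing.

        Returns True if the block was newly stored, False if an identical
        block was already present in the cluster.
        """
        cluster = self.clusters[self.signature(block)]
        digest = hashlib.sha256(block).hexdigest()
        if digest in cluster:
            return False  # duplicate found, skip the write
        cluster[digest] = block
        return True


store = LshDedupStore()
first = store.write(b"hello world" * 100)    # new block, stored
second = store.write(b"hello world" * 100)   # same bytes, deduplicated
third = store.write(b"a different block")    # new block, stored
stored_blocks = sum(len(c) for c in store.clusters.values())
```

Because the signature is deterministic, identical blocks always land in the same cluster, so the duplicate check only has to inspect that one cluster rather than the whole store. This sketch looks only for exact duplicates within a single bucket; a fuller nearest-neighbor search might also probe adjacent buckets or compare by a similarity threshold.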

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.