Patent · US Active

Data compression using dictionaries

US11620263B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 17, 2021
Grant date: Apr 4, 2023
Priority date
Expiry date: Aug 17, 2041

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H03M7/3077
  • WIPO field: Basic communication processes
  • WIPO sector: Electrical engineering

Abstract

Data units of a dataset may be compressed by clustering the data units into clusters, selecting a reference unit for each unit cluster, and compressing data units of each unit cluster using the reference unit of the unit cluster as a dictionary. The computational efficiency of the clustering algorithm may be improved by not applying it to the data units themselves, but rather to hash values of the data units, where the hash values have a much smaller size than the data units. The hash function may be a locality-sensitive hash (LSH) function. The reference unit of a cluster may be determined in any of a variety of ways, for example, by selecting a centroid or exemplar of the cluster. Clusters, including their reference values, may be indexed in a cluster index (e.g., a Faiss index), which may be searched to assign subsequently added or modified data units to clusters.
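
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (simhash, cluster_by_hash, compress_with_reference, decompress_with_reference, max_distance) is hypothetical, the toy SimHash stands in for whatever LSH function is actually used, a greedy linear scan over cluster hashes stands in for both the clustering algorithm and the cluster index, and zlib's preset-dictionary feature stands in for the dictionary compressor. The first unit placed in a cluster is treated as its exemplar.

```python
import hashlib
import zlib

HASH_BITS = 64  # assumption: a 64-bit hash per data unit


def simhash(unit: bytes, shingle: int = 4) -> int:
    """Toy locality-sensitive hash: similar byte strings tend to
    produce hashes that differ in few bits."""
    counts = [0] * HASH_BITS
    for i in range(max(1, len(unit) - shingle + 1)):
        digest = hashlib.blake2b(unit[i:i + shingle], digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(HASH_BITS):
            counts[b] += 1 if (h >> b) & 1 else -1
    return sum(1 << b for b in range(HASH_BITS) if counts[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")


def cluster_by_hash(units, max_distance: int = 16):
    """Greedy clustering that compares hash values only, never the units.
    Returns lists of unit indices; the first index in each list is the
    cluster's exemplar (reference unit)."""
    clusters = []  # (exemplar_hash, [member indices])
    for idx, unit in enumerate(units):
        h = simhash(unit)
        for ref_hash, members in clusters:
            if hamming(h, ref_hash) <= max_distance:
                members.append(idx)
                break
        else:
            clusters.append((h, [idx]))  # no close cluster: start a new one
    return [members for _, members in clusters]


def compress_with_reference(unit: bytes, reference: bytes) -> bytes:
    """Compress a unit using the reference unit as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=reference)
    return comp.compress(unit) + comp.flush()


def decompress_with_reference(blob: bytes, reference: bytes) -> bytes:
    """Inverse of compress_with_reference; needs the same reference."""
    decomp = zlib.decompressobj(zdict=reference)
    return decomp.decompress(blob) + decomp.flush()
```

To use the sketch, call cluster_by_hash on a list of byte strings, take the first index of each returned cluster as the reference unit, and pass that unit as the reference argument when compressing the remaining members. The reference unit itself must be stored separately (uncompressed or conventionally compressed), since decompression requires it.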

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.