Patent · US Active

Searching in multilevel clustered vector-based data

US11449704B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 16, 2020
Grant date: Sep 20, 2022
Priority date
Expiry date: Oct 18, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V10/771
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A multilevel clustered data set for multidimensional vectors is created by defining a plurality of clusters based on each of the signed dimensions of the vectors, each dimension functioning as an axis. Vectors are assigned to clusters by measuring the cosine similarity between each vector and each axis. Sub-clusters are defined as ranges of cosine similarity values within a cluster, and each vector is assigned to the appropriate range based on its cosine similarity with the axis of the cluster. Searching for a match to a new vector is achieved efficiently, in near-constant time, by measuring the cosine similarity of the new vector with each axis to identify the closest cluster, reusing that cosine similarity to determine which sub-cluster corresponds to the appropriate range of values, and then comparing each vector within the sub-cluster until a match is found or ruled out.
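
The two-level lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `locate`, `AxisIndex`, `n_bins`, and `threshold`, the use of equal-width similarity ranges over [0, 1], and the match criterion (cosine similarity at or above a threshold) are all assumptions made for illustration; the abstract does not specify them. The sketch relies on the fact that the cosine similarity between a vector v and the signed axis ±e_i is ±v_i / ||v||, so the closest signed axis is the component with the largest magnitude.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def locate(vec, n_bins):
    """Return the (axis, sign, bin) key for a non-zero vector.

    Level 1: the closest signed axis is the component of largest magnitude.
    Level 2: the same similarity value is reused to pick a range (bin).
    """
    norm = math.hypot(*vec)
    axis = max(range(len(vec)), key=lambda i: abs(vec[i]))
    sign = 1 if vec[axis] >= 0 else -1
    sim = abs(vec[axis]) / norm  # similarity to the winning axis, in (0, 1]
    # Equal-width ranges are an assumption of this sketch.
    bin_ = min(int(sim * n_bins), n_bins - 1)
    return axis, sign, bin_


class AxisIndex:
    """Hypothetical index: cluster per signed axis, sub-cluster per range."""

    def __init__(self, n_bins=10):
        self.n_bins = n_bins
        self.buckets = defaultdict(list)

    def add(self, vec):
        self.buckets[locate(vec, self.n_bins)].append(vec)

    def search(self, query, threshold=0.99):
        """Scan only the query's sub-cluster; return the first match or None."""
        key = locate(query, self.n_bins)
        for candidate in self.buckets.get(key, []):
            if cosine(query, candidate) >= threshold:
                return candidate
        return None


index = AxisIndex(n_bins=10)
for v in ([0.9, 0.1, 0.0], [-0.2, -0.95, 0.1], [0.1, 0.2, 0.97]):
    index.add(v)
match = index.search([0.91, 0.1, 0.01])
```

Locating the sub-cluster costs one pass over the query's components and does not depend on how many vectors are stored; only the final scan depends on the size of the sub-cluster. Because this sketch inspects a single sub-cluster, a near match whose similarity falls just across a range boundary would be missed; how such cases are handled is not stated in the abstract.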

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.