Patent · US Active

String similarity based weighted min-hashing

US12299410B2 · kind B2 · utility

Cited by: 0
References: 12
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 30, 2022
Grant date: May 13, 2025
Priority date
Expiry date: Nov 3, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/2458
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer-implemented method for generating hash values to determine string similarity is disclosed. The computer-implemented method includes converting a first text string of a first data set into a first set of shingles. The computer-implemented method further includes determining a weight associated with each shingle in the first set of shingles based, at least in part, on a particular record field associated with the shingle. The computer-implemented method further includes generating, based on a hash function, a hash value for each shingle in the first set of shingles. The computer-implemented method further includes reducing the hash value generated for each shingle in the first set of shingles based, at least in part, on the weight associated with the shingle.
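
The steps named in the abstract (shingling, per-field weighting, hashing, weight-based reduction) can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the abstract does not specify: character 3-gram shingles, the hypothetical FIELD_WEIGHTS table, a seeded SHA-256 hash, and "reducing" a hash value by dividing it by the shingle's weight so that shingles from heavier fields are more likely to supply the minimum. The function names are illustrative only.

```python
import hashlib

# Hypothetical per-field weights (assumption): name matters more than city.
FIELD_WEIGHTS = {"name": 4.0, "city": 1.0}


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased string."""
    text = text.lower()
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def hash_value(shingle, seed):
    """Deterministic 64-bit hash of a shingle under a given integer seed."""
    digest = hashlib.sha256(f"{seed}:{shingle}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def weighted_signature(record, num_hashes=64, k=3):
    """Build a min-hash signature from a non-empty record of field -> text.

    Each shingle takes the weight of the field it came from; if it occurs
    in several fields, the largest weight is kept (assumption).
    """
    weighted = {}
    for field, text in record.items():
        w = FIELD_WEIGHTS.get(field, 1.0)
        for s in shingles(text, k):
            weighted[s] = max(weighted.get(s, 0.0), w)

    signature = []
    for seed in range(num_hashes):
        # Assumed reduction step: divide each hash value by the shingle's
        # weight, then keep the minimum reduced value for this seed.
        signature.append(
            min(hash_value(s, seed) / w for s, w in weighted.items())
        )
    return signature


def signature_agreement(sig_a, sig_b):
    """Fraction of signature positions at which two signatures match."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

Under these assumptions, two records that share shingles in a heavily weighted field tend to agree at more signature positions than records that share shingles only in a lightly weighted field, because the reduced hash values of heavy shingles win the minimum more often. The agreement fraction is a heuristic similarity score here, not a claim about the exact estimator used in the patent.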

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.