String similarity based weighted min-hashing
US12299410B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 30, 2022 |
| Grant date | May 13, 2025 |
| Priority date | — |
| Expiry date | Nov 3, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/2458
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computer-implemented method for generating hash values to determine string similarity is disclosed. The computer-implemented method includes converting a first text string of a first data set into a first set of shingles. The computer-implemented method further includes determining a weight associated with each shingle in the first set of shingles based, at least in part, on a particular record field associated with a shingle. The computer-implemented method further includes generating, based on a hash function, a hash value for each shingle in the first set of shingles. The computer-implemented method further includes reducing the hash value generated for each shingle in the first set of shingles based, at least in part, on the weight associated with the shingle.
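The four steps in the abstract (shingling, per-field weighting, hashing, weight-based reduction) can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the hash value is reduced, so dividing the hash by the weight is an assumption made here, as are the field names, the weight values, the shingle length `k`, the choice of BLAKE2b, and every function name.

```python
import hashlib

# Hypothetical per-field weights; the abstract gives no concrete values.
FIELD_WEIGHTS = {"name": 4.0, "address": 1.0}


def shingles(text, k=3):
    """Return the set of overlapping k-character shingles of text."""
    text = text.lower()
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def hash64(shingle, seed):
    """Deterministic 64-bit hash of a shingle under a given seed."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def weighted_signature(record, num_hashes=64, k=3):
    """Build a min-hash signature for a record (field name -> text).

    Assumption: "reducing" a hash value means dividing it by the shingle's
    weight, so shingles from heavily weighted fields are more likely to
    supply the minimum for each seed.
    """
    weighted = {}
    for field, text in record.items():
        w = FIELD_WEIGHTS.get(field, 1.0)
        for s in shingles(text, k):
            # A shingle seen in several fields keeps its largest weight.
            weighted[s] = max(w, weighted.get(s, 0.0))
    return [
        min(hash64(s, seed) / w for s, w in weighted.items())
        for seed in range(num_hashes)
    ]


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions on which two records agree."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Under these assumptions, two records with near-identical `name` fields tend to agree on more signature positions than two records that match only on `address`, because the reduced hashes of name shingles are smaller and win the minimum more often.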
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.