Similarity matching systems and methods for record linkage
US11182395B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 15, 2018 |
| Grant date | Nov 23, 2021 |
| Priority date | — |
| Expiry date | Sep 21, 2040 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N5/022
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A given query entity of a query database and a set of reference entities from a master database are accessed; each entity accessed corresponds to an entry in a respective database, which is mapped to a set of words that are decomposed into tokens. For each reference entity, a closest token is identified therein for each token of the given query entity, via a given string metric. A number of closest tokens are thus respectively associated with highest scores of similarity between tokens of the query entity and tokens of each reference entity. An entity similarity score is computed based on said highest scores. A reference entity of the master database is identified, which is closest to said given query entity, based on the entity similarity score. Records of the given query entity are linked to records of the master database, based on the closest reference entity identified.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.