Methods and systems for matching records and normalizing names
US8190538B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 30, 2009 |
| Grant date | May 29, 2012 |
| Priority date | — |
| Expiry date | Nov 23, 2030 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/90344
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods and systems are provided for normalizing strings and for matching records. In one implementation, a string is tokenized into components. Sequences of tags are generated by assigning tags to the components. A sequence of states is determined based on the sequences of tags. A normalized string is generated by normalizing the sequence of the states. A key record including key fields is extracted from a first data source. A candidate record including candidate fields is extracted from a second data source. A numerical record including numerical fields is computed by comparing the key fields and the candidate fields using comparison functions. Matching functions determined by an additive logistic regression method are applied to the numerical fields. Whether the key record and the candidate record are a match is determined based on a sum of results of the matching functions.
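The record-matching half of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The field names, the comparison functions (`string_similarity`, `exact_match`), and every threshold and weight are hypothetical, chosen only for illustration. In additive logistic regression the matching functions would be learned from labeled record pairs; here they are hand-picked single-field threshold functions, and a positive sum is treated as a match.

```python
from difflib import SequenceMatcher

# Hypothetical comparison functions: each maps a (key field, candidate field)
# pair to a number.
def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def exact_match(a: str, b: str) -> float:
    """1.0 if the two fields are identical, otherwise 0.0."""
    return 1.0 if a == b else 0.0

# Assumed schema: which comparison function is applied to which field.
COMPARATORS = {
    "name": string_similarity,
    "city": string_similarity,
    "zip": exact_match,
}

def numerical_record(key: dict, candidate: dict) -> dict:
    """Compare key fields with candidate fields to get numerical fields."""
    return {field: compare(key[field], candidate[field])
            for field, compare in COMPARATORS.items()}

def stump(field: str, threshold: float, low: float, high: float):
    """Build a matching function that looks at one numerical field."""
    return lambda record: high if record[field] >= threshold else low

# Stand-ins for learned matching functions; the numbers are made up, not fitted.
MATCHING_FUNCTIONS = [
    stump("name", 0.8, -1.5, 2.0),
    stump("city", 0.7, -0.5, 0.8),
    stump("zip", 0.5, -1.0, 1.2),
]

def match_score(key: dict, candidate: dict) -> float:
    """Sum the results of all matching functions on the numerical record."""
    record = numerical_record(key, candidate)
    return sum(f(record) for f in MATCHING_FUNCTIONS)

def is_match(key: dict, candidate: dict) -> bool:
    """Declare a match when the summed score is positive (assumed cutoff)."""
    return match_score(key, candidate) > 0.0

if __name__ == "__main__":
    key = {"name": "John Smith", "city": "Boston", "zip": "02110"}
    same = {"name": "john smith", "city": "Boston", "zip": "02110"}
    other = {"name": "Alice Wong", "city": "Denver", "zip": "80202"}
    print(is_match(key, same), is_match(key, other))
```

Keeping each matching function tied to a single numerical field makes the summed score easy to inspect: the contribution of every field comparison to the final decision can be read off directly.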
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.