Real-time detection of duplicate data records
US11341547B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 19, 2019 |
| Grant date | May 24, 2022 |
| Priority date | — |
| Expiry date | Aug 30, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Disclosed are various embodiments for real-time detection of duplicate data records. A duplicate detection application generates a set of clusters from a set of data records by grouping each data record in the set of data records according to similarity to respective centroid data records of the set of clusters. The duplicate detection application determines whether a particular data record has a potential duplicate in the set of data records by first comparing the particular data record to the respective centroid data records to identify a most similar cluster in the set of clusters. The duplicate detection application then compares the particular data record to each data record in the most similar cluster.
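The two-stage lookup described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how similarity is measured, how centroids are chosen, or what thresholds apply. The sketch assumes token-level Jaccard similarity, treats the first record of each cluster as its centroid, and uses hypothetical names throughout (`similarity`, `Cluster`, `build_clusters`, `find_potential_duplicates`, `join_threshold`, `duplicate_threshold`).

```python
from dataclasses import dataclass, field


def similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word tokens (an assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class Cluster:
    centroid: str                              # representative record
    members: list[str] = field(default_factory=list)


def build_clusters(records: list[str], join_threshold: float = 0.3) -> list[Cluster]:
    """Group each record with the cluster whose centroid it most resembles.

    Assumption: a record that is not similar enough to any existing
    centroid starts a new cluster and becomes that cluster's centroid.
    """
    clusters: list[Cluster] = []
    for record in records:
        best = max(clusters, key=lambda c: similarity(record, c.centroid), default=None)
        if best is not None and similarity(record, best.centroid) >= join_threshold:
            best.members.append(record)
        else:
            clusters.append(Cluster(centroid=record, members=[record]))
    return clusters


def find_potential_duplicates(
    record: str, clusters: list[Cluster], duplicate_threshold: float = 0.6
) -> list[str]:
    """Stage 1: pick the most similar cluster by comparing against centroids only.
    Stage 2: compare the record against every member of that one cluster."""
    if not clusters:
        return []
    nearest = max(clusters, key=lambda c: similarity(record, c.centroid))
    return [m for m in nearest.members if similarity(record, m) >= duplicate_threshold]


# Illustrative data only; not taken from the patent.
records = [
    "acme anvil 10 kg steel",
    "acme anvil 10 kg steel black",
    "blue cotton shirt large",
    "blue cotton shirt medium",
]
clusters = build_clusters(records)
print(find_potential_duplicates("acme steel anvil 10 kg", clusters))
```

Under these assumptions, a new record is compared against one centroid per cluster and then only against the members of the chosen cluster, rather than against every stored record. That reduction in comparisons is what makes a real-time check plausible.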
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.