Patent · US Active

Systems and methods for automatic clustering and canonical designation of related data in various data structures

US12038933B2 · kind B2 · utility

Cited by: 1
References: 64
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 30, 2023
Grant date: Jul 16, 2024
Priority date
Expiry date: May 30, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F18/23
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Computer-implemented systems and methods are disclosed for automatically clustering and canonically identifying related data in various data structures. Data structures may include a plurality of records, wherein each record is associated with a respective entity. In accordance with some embodiments, the systems and methods further comprise identifying clusters of records associated with a respective entity by grouping the records into pairs, analyzing the respective pairs to determine a probability that both members of the pair relate to a common entity, and identifying a cluster of overlapping pairs to generate a collection of records relating to a common entity. Clusters may further be analyzed to determine canonical names or other properties for the respective entities by analyzing record fields and identifying similarities.
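
Illustrative sketch (not taken from the patent text): the pipeline the abstract describes can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: the record layout (a dict with a "name" field), the functions pair_score, cluster_records and canonical_name, the 0.8 threshold, the use of string similarity from difflib as a stand-in for the pair probability, and the choice of the most frequent name as the canonical one. The patent may use entirely different scoring and selection methods.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def pair_score(a, b):
    """Stand-in for the pair probability: name similarity in [0, 1] (assumption)."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def cluster_records(records, threshold=0.8):
    """Link pairs scoring at or above the (hypothetical) threshold,
    then merge overlapping pairs into clusters with union-find."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Group records into pairs and keep the pairs that look like one entity.
    for i, j in combinations(range(len(records)), 2):
        if pair_score(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    # Records sharing a root belong to the same cluster.
    clusters = {}
    for i, record in enumerate(records):
        clusters.setdefault(find(i), []).append(record)
    return list(clusters.values())


def canonical_name(cluster):
    """Pick the most frequent name in the cluster (one possible heuristic)."""
    return Counter(r["name"] for r in cluster).most_common(1)[0][0]


if __name__ == "__main__":
    sample = [
        {"name": "Acme Corp"},
        {"name": "Acme Corp"},
        {"name": "Acme Corp."},
        {"name": "Globex Inc"},
    ]
    for cluster in cluster_records(sample):
        print(canonical_name(cluster), len(cluster))
```

Comparing every pair is quadratic in the number of records, so a sketch like this only suits small inputs; larger data sets would normally restrict which pairs are compared before scoring.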

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.