Patent · US Active

Deduplicating records received from multiple data sources

US12182088B2 · kind B2 · utility

Cited by: 1
References: 5
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 15, 2023
Grant date: Dec 31, 2024
Priority date
Expiry date: Sep 15, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/2379
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method includes generating a plurality of pages from a plurality of records received from a plurality of data sources. Deduplication of the plurality of pages is facilitated based on a plurality of page metadata of the plurality of pages by performing the following for each given page of the plurality of pages. A filtered set of potentially-intersecting pages is identified for the given page as a proper subset of the plurality of pages stored in a page storage system based on first comparison parameters, and an intersecting set of pages that include a row number intersection with the given page is identified as a proper subset of the filtered set of potentially-intersecting pages based on second comparison parameters. Records with row numbers included in row number intersections with other pages in the intersecting set of pages are removed from each page.
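
The two-stage narrowing described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not say what the first and second comparison parameters are, so the sketch assumes a cheap metadata check (overlapping row-number ranges) for the first stage and an exact row-number set intersection for the second. The names Page, row_range, may_intersect, and deduplicate are hypothetical, as is the choice to deduplicate a new page against pages that are already stored.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: int
    # Maps row number -> record payload (hypothetical layout).
    records: dict = field(default_factory=dict)


def row_range(page):
    """Assumed page metadata: (lowest, highest) row number, or None if empty."""
    if not page.records:
        return None
    return min(page.records), max(page.records)


def may_intersect(a, b):
    """First comparison (assumed): do the row-number ranges overlap?"""
    ra, rb = row_range(a), row_range(b)
    if ra is None or rb is None:
        return False
    return ra[0] <= rb[1] and rb[0] <= ra[1]


def deduplicate(page, stored_pages):
    """Return a copy of `page` without rows already present in stored pages."""
    # Stage 1: filtered set of potentially-intersecting pages (metadata only).
    filtered = [p for p in stored_pages if may_intersect(page, p)]
    # Stage 2: pages that share at least one row number with `page`.
    intersecting = [
        p for p in filtered if page.records.keys() & p.records.keys()
    ]
    # Collect every intersecting row number and drop those records.
    duplicates = set()
    for p in intersecting:
        duplicates |= page.records.keys() & p.records.keys()
    kept = {row: rec for row, rec in page.records.items()
            if row not in duplicates}
    return Page(page.page_id, kept)


if __name__ == "__main__":
    stored = [
        Page(1, {1: "a", 2: "b", 3: "c"}),
        Page(2, {10: "x", 11: "y"}),
        Page(3, {2: "b", 8: "h"}),   # range overlaps, but no shared rows
    ]
    incoming = Page(4, {3: "c", 4: "d", 5: "e"})
    print(sorted(deduplicate(incoming, stored).records))  # [4, 5]
```

In this example the range check discards page 2 without touching its rows, page 3 passes the range check but fails the exact intersection, and only page 1 contributes a duplicate row (row 3).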

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.