Patent · US Active

Duplicate filtering in a data processing environment

US8484171B2 · kind B2 · utility

Cited by: 0
References: 5
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 2, 2012
Grant date: Jul 9, 2013
Priority date
Expiry date: Apr 2, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/24568
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A data processing method is provided. The method comprises collecting a stream of data records received from one or more data sources connected in a communications network; dividing the stream of data records into sets of data records for parallel processing by a plurality of concurrently running tasks, wherein a first task loads a persistent index associated with a first set of data records into memory to generate an in-memory version of the first persistent index for the first set of data records; and identifying duplicate and non-duplicate data records in the first set of data records, based on searching the in-memory version of the first persistent index.
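
The steps named in the abstract (divide the stream into sets, have each concurrent task load its set's persistent index into memory, then classify records by searching that in-memory index) can be illustrated with a minimal Python sketch. This is not the patented implementation. The record layout (a dict with a "key" field), the JSON file used as the persistent index, the CRC32-based partitioning, the thread pool, and the names partition, filter_set and deduplicate are all assumptions made for illustration.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def partition(records, num_sets):
    """Divide the stream into sets; equal keys always land in the same set."""
    sets = [[] for _ in range(num_sets)]
    for record in records:
        # CRC32 is stable across runs, unlike Python's salted built-in hash().
        slot = zlib.crc32(record["key"].encode("utf-8")) % num_sets
        sets[slot].append(record)
    return sets


def filter_set(index_path, records):
    """One task: load the persistent index into memory, then classify records."""
    index_path = Path(index_path)
    # In-memory version of the persistent index: a set of previously seen keys.
    if index_path.exists():
        seen = set(json.loads(index_path.read_text()))
    else:
        seen = set()
    unique, duplicates = [], []
    for record in records:
        if record["key"] in seen:
            duplicates.append(record)
        else:
            seen.add(record["key"])
            unique.append(record)
    # Write the updated index back so later batches see these keys (assumed step).
    index_path.write_text(json.dumps(sorted(seen)))
    return unique, duplicates


def deduplicate(records, index_dir, num_sets=4):
    """Run one task per set concurrently and merge the per-set results."""
    sets = partition(records, num_sets)
    paths = [Path(index_dir) / f"index_{i}.json" for i in range(num_sets)]
    unique, duplicates = [], []
    with ThreadPoolExecutor(max_workers=num_sets) as pool:
        for set_unique, set_duplicates in pool.map(filter_set, paths, sets):
            unique.extend(set_unique)
            duplicates.extend(set_duplicates)
    return unique, duplicates
```

Because each key maps to exactly one set, each task touches only its own index file, so the tasks need no locking between them in this sketch. Calling deduplicate twice with the same index_dir reports records from the second batch as duplicates when their keys appeared in the first.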

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.