Large scale data join service within a service provider network
US10467191B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 27, 2016 |
| Grant date | Nov 5, 2019 |
| Priority date | — |
| Expiry date | Jun 21, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/2456
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Technologies are disclosed for providing a large scale data join service within a service provider network. A data set includes first and second sets of files that correspond to each other. Each file includes a first identifier (ID) and a second ID. The first set of files is partitioned based at least in part upon the first ID into a plurality of first subsets of files and the second set of files is partitioned based at least in part upon the first ID into a plurality of second subsets of files. Files within a first group of the plurality of first subsets and files within a second group of the plurality of second subsets are encoded into first and second bitsets, respectively, based at least in part upon the second IDs. An exclusive-or operation is performed on the first and second bitsets to find discrepancies between the data files.
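The steps described in the abstract (partition both sets on the first ID, encode each subset into a bitset keyed on the second ID, then XOR the bitsets) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes, for illustration only, that each file reduces to a `(first_id, second_id)` pair of small non-negative integers, that second IDs are unique within a partition, and that partitioning is a simple modulo on the first ID. All function names and the `num_partitions` parameter are hypothetical.

```python
from collections import defaultdict


def partition(records, num_partitions):
    """Group (first_id, second_id) records into subsets keyed on first_id.

    Assumption: modulo partitioning; the abstract only says the split is
    "based at least in part upon the first ID".
    """
    parts = defaultdict(list)
    for first_id, second_id in records:
        parts[first_id % num_partitions].append((first_id, second_id))
    return parts


def encode_bitset(records):
    """Set bit number `second_id` for each record.

    A Python int serves as an arbitrary-length bitset.
    """
    bits = 0
    for _, second_id in records:
        bits |= 1 << second_id
    return bits


def set_bits(bits):
    """Return the positions of the 1-bits in ascending order."""
    positions = []
    pos = 0
    while bits:
        if bits & 1:
            positions.append(pos)
        bits >>= 1
        pos += 1
    return positions


def find_discrepancies(first_set, second_set, num_partitions=4):
    """XOR the per-partition bitsets; surviving bits mark second IDs
    that appear in exactly one of the two sets."""
    first_parts = partition(first_set, num_partitions)
    second_parts = partition(second_set, num_partitions)
    result = {}
    for key in sorted(set(first_parts) | set(second_parts)):
        diff = (encode_bitset(first_parts.get(key, []))
                ^ encode_bitset(second_parts.get(key, [])))
        if diff:
            result[key] = set_bits(diff)
    return result


# Hypothetical data: second ID 5 is missing from the second set,
# and second ID 9 is missing from the first set.
first = [(1, 3), (1, 5), (2, 7), (6, 2)]
second = [(1, 3), (2, 7), (2, 9), (6, 2)]
print(find_discrepancies(first, second))  # {1: [5], 2: [9]}
```

Because XOR cancels every bit that is set in both bitsets, only the mismatched second IDs remain, and each partition can be compared independently of the others.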
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.