Datasets profiling tools, methods, and systems
US10318388B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 20, 2014 |
| Grant date | Jun 11, 2019 |
| Priority date | — |
| Expiry date | Nov 8, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/2291
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A dataset profiling tool configured to identify unique and non-unique column combinations in a dataset which includes a plurality of tuples, the tool including: an inserts handler module configured to: receive one or more new tuples for insertion into the dataset, receive one or more minimal uniques and one or more maximal non-uniques for the dataset, identify and group, for each minimal unique, any tuples of the dataset and any of the one or more new tuples which contain duplicate values in the column combinations of the minimal unique, to form grouped tuples which are grouped according to the minimal unique to which the tuples relate, validate the grouped tuples to identify supersets of the minimal uniques for which duplicate values were identified, to generate a new set of one or more minimal uniques and one or more maximal non-uniques, and output the new set of one or more updated minimal uniques and one or more maximal non-uniques.
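The insert-handling flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation; it is a minimal illustration under stated assumptions: rows are plain Python tuples, columns are referred to by index, and the function names (`duplicate_groups`, `handle_inserts`, and so on) are hypothetical. For brevity the sketch takes only the existing minimal uniques as input and re-derives the maximal non-uniques by brute-force enumeration, which is only practical for a handful of columns.

```python
from collections import defaultdict
from itertools import combinations


def project(row, cols):
    """Values of `row` restricted to the column indices in `cols`."""
    return tuple(row[c] for c in sorted(cols))


def duplicate_groups(old_rows, new_rows, cols):
    """Group old and new rows that share values on `cols`.

    Only groups that contain at least one new row and more than one
    row overall are returned, since only those can break uniqueness.
    """
    groups = defaultdict(list)
    touched = set()
    for row in old_rows:
        groups[project(row, cols)].append(row)
    for row in new_rows:
        key = project(row, cols)
        groups[key].append(row)
        touched.add(key)
    return [groups[k] for k in touched if len(groups[k]) > 1]


def is_unique_within(groups, cols):
    """True if no group contains two rows that agree on `cols`."""
    return all(len({project(r, cols) for r in g}) == len(g) for g in groups)


def handle_inserts(old_rows, new_rows, minimal_uniques, n_cols):
    """Return (new minimal uniques, new maximal non-uniques)."""
    all_cols = frozenset(range(n_cols))
    candidates = set()
    for mu in map(frozenset, minimal_uniques):
        groups = duplicate_groups(old_rows, new_rows, mu)
        # Validate supersets of the minimal unique level by level,
        # checking only the grouped rows rather than the whole dataset.
        frontier = {mu}
        while frontier:
            next_frontier = set()
            for cols in frontier:
                if is_unique_within(groups, cols):
                    candidates.add(cols)
                else:
                    next_frontier.update(cols | {c} for c in all_cols - cols)
            frontier = next_frontier
    # Keep only candidates that have no unique proper subset.
    new_mucs = {c for c in candidates if not any(o < c for o in candidates)}
    # Assumption: brute-force derivation of maximal non-uniques as the
    # maximal column sets that contain no minimal unique.
    non_uniques = [
        frozenset(s)
        for k in range(n_cols + 1)
        for s in combinations(range(n_cols), k)
        if not any(u <= frozenset(s) for u in new_mucs)
    ]
    new_mnucs = {s for s in non_uniques if not any(s < t for t in non_uniques)}
    return new_mucs, new_mnucs


# Hypothetical data: columns are (id, name, city).
old_rows = [(1, "ann", "NY"), (2, "bob", "LA"), (3, "ann", "LA")]
new_rows = [(2, "cat", "SF")]  # repeats id 2, so column {0} stops being unique
mucs, mnucs = handle_inserts(old_rows, new_rows, [{0}, {1, 2}], n_cols=3)
```

Checking supersets only against the grouped rows is sound because any two rows that agree on a superset of a minimal unique must also agree on the minimal unique itself, so they already sit in the same group.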
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.