System and method for fast identification of variable roles during initial data exploration
US8918410B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 21, 2013 |
| Grant date | Dec 23, 2014 |
| Priority date | — |
| Expiry date | Jun 11, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06Q10/06
- WIPO field: IT methods for management
- WIPO sector: Electrical engineering
Abstract
Systems and methods are provided for identifying data variable roles during initial data exploration. In one example, a computer-implemented method of determining a role for a data variable is disclosed. The method comprises identifying to a plurality of data nodes a set of data records containing data values assigned to each data node, a maximum number of levels to record in a sorted data structure at the data nodes, and the data node responsible for each of a plurality of variables. The method further comprises receiving for each variable from the data node responsible for the variable a plurality of unique data values for the variable, a count for each of the unique data values and an overflow count for the variable, wherein the number of unique data values does not exceed the maximum number of levels. A role for a variable can be determined based upon the unique data values, counts and overflow count for the variable.
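The per-variable summary described in the abstract (a bounded, sorted set of unique values, a count for each, and an overflow count) can be illustrated with a minimal single-process sketch. It is not the patented implementation. The function names `summarize_variable` and `guess_role`, the eviction policy (keep the smallest values and move the largest into the overflow count when the structure is full), and the role labels are all assumptions made for illustration; the abstract does not specify them, and the distribution of work across data nodes is omitted.

```python
from bisect import insort


def summarize_variable(values, max_levels):
    """Summarize one variable as (levels, counts, overflow).

    levels   -- sorted list of at most max_levels unique values
    counts   -- dict mapping each kept level to its record count
    overflow -- number of records whose value is not among the kept levels

    Assumed policy (not taken from the patent): keep the smallest values.
    """
    levels = []
    counts = {}
    overflow = 0
    for value in values:
        if value in counts:
            counts[value] += 1
        elif len(levels) < max_levels:
            insort(levels, value)  # keep the list sorted on insert
            counts[value] = 1
        elif levels and value < levels[-1]:
            # Structure is full: move the largest level into the overflow count.
            evicted = levels.pop()
            overflow += counts.pop(evicted)
            insort(levels, value)
            counts[value] = 1
        else:
            overflow += 1
    return levels, counts, overflow


def guess_role(levels, counts, overflow):
    """Hypothetical heuristic mapping a summary to a role label."""
    if not levels:
        return "unknown"
    if overflow > 0:
        # More distinct values than the structure can hold.
        numeric = all(isinstance(v, (int, float)) for v in levels)
        return "interval" if numeric else "id"
    if len(levels) == 1:
        return "unary"
    if len(levels) == 2:
        return "binary"
    return "nominal"


if __name__ == "__main__":
    levels, counts, overflow = summarize_variable([3, 1, 2, 1, 5, 4, 1], 3)
    print(levels, counts, overflow)
    print(guess_role(levels, counts, overflow))
```

With this assumed policy the kept counts are exact regardless of record order, because an evicted value is always larger than every level that remains and so can never re-enter the structure.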
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.