Patent · US Active

System and method for fast identification of variable roles during initial data exploration

US8918410B2 · kind B2 · utility

Cited by: 2
References: 0
Claims: 94
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 21, 2013
Grant date: Dec 23, 2014
Priority date
Expiry date: Jun 11, 2033

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06Q10/06
  • WIPO field: IT methods for management
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are provided for identifying data variable roles during initial data exploration. In one example, a computer-implemented method of determining a role for a data variable is disclosed. The method comprises identifying to a plurality of data nodes a set of data records containing data values assigned to each data node, a maximum number of levels to record in a sorted data structure at the data nodes, and the data node responsible for each of a plurality of variables. The method further comprises receiving, for each variable, from the data node responsible for the variable, a plurality of unique data values for the variable, a count for each of the unique data values, and an overflow count for the variable, wherein the number of unique data values does not exceed the maximum number of levels. A role for a variable can be determined based upon the unique data values, counts, and overflow count for the variable.
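
As an illustration only, the per-variable summary described in the abstract (unique values, a count per value, and an overflow count, capped at a maximum number of levels) might be sketched in Python as follows. The function names summarize_variable and guess_role, the first-seen policy for choosing which values are kept as levels, and the role labels are all assumptions made for this sketch. The abstract does not state the patented role rules, and the distribution of work across data nodes is omitted.

```python
def summarize_variable(values, max_levels):
    """Summarize one variable as (level_counts, overflow_count).

    Hypothetical sketch: the first `max_levels` distinct values seen are
    kept as levels; every value outside that set adds to the overflow count.
    """
    levels = {}
    overflow = 0
    for value in values:
        if value in levels:
            levels[value] += 1
        elif len(levels) < max_levels:
            levels[value] = 1
        else:
            overflow += 1
    # Stand-in for the "sorted data structure": order levels by value.
    # Assumes the values of one variable are mutually comparable.
    return dict(sorted(levels.items())), overflow


def guess_role(levels, overflow):
    """Map a summary to a role label (illustrative rules, not the patent's)."""
    if overflow > 0:
        return "high-cardinality"  # more distinct values than max_levels
    if len(levels) <= 1:
        return "constant"
    if len(levels) == 2:
        return "binary"
    return "categorical"


gender_levels, gender_overflow = summarize_variable(["M", "F", "M", "F", "F"], 3)
id_levels, id_overflow = summarize_variable([1, 2, 3, 4, 5], 3)
gender_role = guess_role(gender_levels, gender_overflow)
id_role = guess_role(id_levels, id_overflow)
```

In this sketch the five-row gender column fits within three levels and has no overflow, while the five distinct identifiers exceed the cap, so two of them land in the overflow count. A nonzero overflow count is the signal that a variable has more distinct values than the configured maximum.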

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.