Patent · US Active

Automatic discovery of relevant data in massive datasets

US9558245B1 · kind B1 · utility

Cited by: 10
References: 4
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 7, 2015
Grant date: Jan 31, 2017
Priority date
Expiry date: Dec 7, 2035

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/285
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An approach for discovering relevant data in massive datasets. Compare datasets, each including compare key fields and compare data fields, are received together with a core dataset including one or more target data fields and core fields. The compare datasets are categorized into a direct related dataset pool and an indirect related dataset pool based on the correlation strength of the target data field(s) with matching compare and core fields. The direct related dataset pool and the core dataset are transformed into reduction datasets based on a statistical measure of the values of the target data fields, shared key fields and compare data fields. Target correlations of the reduction datasets are created based on the reduction compare data fields and the target data fields. A statistical relationship strength between the core dataset and the direct related dataset pool is created based on a statistical mean of the target correlations, and a relevancy data store is created.
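
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It makes several simplifying assumptions that the abstract does not state: a compare dataset counts as "direct" when it shares the key field with the core dataset, the statistical measure used for reduction is the mean per key value, the target correlation is the absolute Pearson coefficient, and the relevancy data store is a plain dictionary. All function names, field names and sample data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def reduce_by_key(rows, key, field):
    """Reduction step (assumed): collapse rows to the mean of `field` per key value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[field])
    return {k: mean(v) for k, v in groups.items()}


def relationship_strength(core_rows, compare_rows, key, target, compare_fields):
    """Mean of the absolute target correlations over all compare data fields."""
    core_reduced = reduce_by_key(core_rows, key, target)
    correlations = []
    for field in compare_fields:
        compare_reduced = reduce_by_key(compare_rows, key, field)
        shared = sorted(set(core_reduced) & set(compare_reduced))
        if len(shared) < 2:
            continue  # too few shared key values to correlate
        xs = [compare_reduced[k] for k in shared]
        ys = [core_reduced[k] for k in shared]
        correlations.append(abs(pearson(xs, ys)))
    return mean(correlations) if correlations else 0.0


def build_relevancy_store(core_rows, compare_datasets, key, target):
    """Sort compare datasets into direct/indirect pools and score the direct ones."""
    store = {"direct": {}, "indirect": []}
    for name, rows in compare_datasets.items():
        if rows and key in rows[0]:  # assumed test for a "direct" relation
            fields = [f for f in rows[0] if f != key]
            store["direct"][name] = relationship_strength(
                core_rows, rows, key, target, fields
            )
        else:
            store["indirect"].append(name)
    return store


# Hypothetical sample data: the core dataset holds the target field "sales".
core = [
    {"region": "N", "sales": 10},
    {"region": "N", "sales": 14},
    {"region": "S", "sales": 20},
    {"region": "E", "sales": 30},
]
compare = {
    "weather": [
        {"region": "N", "temp": 5},
        {"region": "S", "temp": 9},
        {"region": "E", "temp": 14},
    ],
    "prices": [{"sku": "a", "price": 3}],  # no shared key field
}
store = build_relevancy_store(core, compare, key="region", target="sales")
print(store)
```

In this sample the reduced sales per region are an exact linear function of the temperature, so "weather" lands in the direct pool with a strength close to 1, while "prices" has no shared key field and is placed in the indirect pool without a score.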

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.