Patent · US Active

Sampling for preprocessing big data based on features of transformation results

US10459942B1 · kind B1 · utility

Cited by: 2
References: 1
Claims: 26
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 29, 2016
Grant date: Oct 29, 2019
Priority date
Expiry date: Apr 30, 2037

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/26
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system determines samples of datasets that are typically processed by big data analysis systems. The samples are used to develop and test transformations that preprocess the datasets in preparation for analysis by big data systems. The system receives one or more transform operations and input datasets for the transform operations. The system determines samples associated with the transform operations. According to one sampling strategy, the system determines samples that return at least a threshold number of records in the result set obtained by applying a transformation. According to another sampling strategy, the system receives criteria describing the result of the transform operations and determines sample sets that generate result sets satisfying the criteria as a result of applying the transform operations.
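
The two sampling strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the patented method, which the abstract does not detail. It assumes a simple approach in which a shuffled dataset is sampled in growing prefixes until the transformed result is acceptable. All names here (`sample_until`, `sample_with_min_results`, `step`, `seed`, and the example records) are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, List, Optional

Record = dict
Transform = Callable[[List[Record]], List[Record]]
Criteria = Callable[[List[Record]], bool]


def sample_until(
    dataset: List[Record],
    transform: Transform,
    accept: Criteria,
    step: int = 10,
    seed: int = 0,
) -> Optional[List[Record]]:
    """Criteria-based strategy (assumed approach): grow a random sample until
    the result set produced by `transform` satisfies `accept`.

    Returns the sample, or None if even the full dataset fails the criteria.
    """
    rng = random.Random(seed)      # seeded so the sketch is reproducible
    shuffled = list(dataset)       # copy; the input dataset is left untouched
    rng.shuffle(shuffled)
    # Try prefixes of size step, 2*step, ... up to the whole dataset.
    for size in range(step, len(shuffled) + step, step):
        sample = shuffled[:size]
        if accept(transform(sample)):
            return sample
    return None


def sample_with_min_results(
    dataset: List[Record],
    transform: Transform,
    threshold: int,
    **kwargs,
) -> Optional[List[Record]]:
    """Threshold strategy: the criterion is a minimum result-set size."""
    return sample_until(
        dataset, transform, lambda result: len(result) >= threshold, **kwargs
    )


# Hypothetical input: 100 records, a quarter of them with country "US".
records = [{"id": i, "country": "US" if i % 4 == 0 else "DE"} for i in range(100)]


def keep_us(rows: List[Record]) -> List[Record]:
    """Example transform operation: a filter that keeps only US records."""
    return [r for r in rows if r["country"] == "US"]


# A sample whose filtered result set holds at least 5 records.
sample = sample_with_min_results(records, keep_us, threshold=5)
```

In this sketch the threshold strategy is a special case of the criteria strategy, so one search loop serves both. A selective filter such as `keep_us` shows why sampling on the result matters: a small random sample of the input can produce an empty or near-empty result set, which is of little use for developing and testing the transformation.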

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.