Record profiling for dataset sampling
US10846298B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 28, 2016 |
| Grant date | Nov 24, 2020 |
| Priority date | — |
| Expiry date | Jun 16, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/285
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for generating a smaller dataset from a larger dataset, each dataset holding a plurality of records, includes profiling the larger dataset to identify a plurality of patterns, each of which is descriptive of one or more records held in the larger dataset. A plurality of slots of the smaller dataset is filled with records held in the larger dataset. Multiple records held in the larger dataset are individually retrieved, and for each retrieved record it is determined whether to place the retrieved record into a slot of the smaller dataset and evict a record already occupying that slot, or not place the retrieved record into the smaller dataset. This determination is based on a pattern of the retrieved record and a representation status of the pattern in the smaller dataset.
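The flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The character-class profiler `pattern_of`, the function name `build_sample`, and the eviction policy (admit a record only if its pattern is absent from the sample, and evict from the most-represented pattern only if that pattern keeps at least one record) are assumptions chosen for illustration; the abstract states only that the decision depends on the record's pattern and that pattern's representation status in the smaller dataset.

```python
from collections import Counter


def pattern_of(record: str) -> str:
    """Hypothetical profiler: letters become 'A', digits become '9', other characters are kept."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch for ch in record
    )


def build_sample(records, num_slots):
    """Fill num_slots slots, then visit the remaining records one at a time."""
    # Step 1: fill the slots of the smaller dataset with the first records.
    slots = list(records[:num_slots])
    if not slots:
        return slots
    # Representation status: how many slots each pattern currently occupies.
    counts = Counter(pattern_of(r) for r in slots)

    # Step 2: retrieve each remaining record individually and decide its fate.
    for record in records[num_slots:]:
        pattern = pattern_of(record)
        if counts[pattern] > 0:
            continue  # assumed policy: pattern already represented, do not place

        # Assumed policy: evict from the most-represented pattern,
        # but only if that pattern would still hold at least one slot.
        victim_pattern, victim_count = counts.most_common(1)[0]
        if victim_count < 2:
            continue

        victim_index = next(
            i for i, r in enumerate(slots) if pattern_of(r) == victim_pattern
        )
        slots[victim_index] = record
        counts[victim_pattern] -= 1
        counts[pattern] += 1
    return slots
```

With the input `["alice", "bob", "carol", "12345", "a-1", "dave"]` and three slots, this sketch replaces `"alice"` with `"12345"` because the five-letter pattern occupies two slots while the five-digit pattern occupies none; the later records are skipped because no pattern then holds more than one slot.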
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.