Patent · US Active

Apparatus and method for sampling large data sets in a distributed data storage system

US10866874B1 · kind B1 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 27, 2019
Grant date: Dec 15, 2020
Priority date
Expiry date: Jun 27, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F3/067
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system includes a distributed data storage system disseminated across worker machines connected by a network. A distributed data storage management module has instructions executed by a processor to utilize data block identifiers to track data block accesses to the distributed data storage system. A sampling module with instructions executed by the processor receives a new sample request from a client machine connected to the network. Initial data block samples are gathered from the distributed data storage system during a first time period. A revised sample request is received from the client machine during the first time period. The initial data block samples are gathered. New data block samples are collected from the distributed data storage system. The initial data block samples and the new data block samples are combined to form cumulative data block sample results. The cumulative data block sample results are supplied to the client machine.
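The flow in the abstract (gather initial samples, receive a revised request, keep what was already gathered, collect only the additional blocks, return the combined result) can be illustrated with a minimal single-process sketch. This is not the patented implementation: the `SamplingSession` class, its `gather` method, the `blk-N` identifiers, and the in-memory dictionary standing in for the distributed data storage system are all hypothetical names chosen for illustration, and uniform random block selection is an assumption the abstract does not specify.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SamplingSession:
    """Hypothetical sketch: accumulates block samples across a revised request."""

    # Stand-in for the distributed store: block identifier -> records in that block.
    blocks: dict
    # Seeded generator so the sketch is repeatable (an assumption, not from the patent).
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    # Block identifiers already accessed, with the data gathered from each.
    sampled: dict = field(default_factory=dict)

    def gather(self, target_blocks: int) -> dict:
        """Sample until `target_blocks` blocks are held, reusing earlier samples."""
        # Only blocks not yet accessed are candidates for new samples.
        remaining = [b for b in self.blocks if b not in self.sampled]
        need = max(0, min(target_blocks - len(self.sampled), len(remaining)))
        for block_id in self.rng.sample(remaining, need):
            self.sampled[block_id] = self.blocks[block_id]
        # Return a copy of the cumulative results (initial plus new samples).
        return dict(self.sampled)


# Ten toy blocks of three records each.
storage = {f"blk-{i}": [i * 10 + j for j in range(3)] for i in range(10)}
session = SamplingSession(storage)

initial = session.gather(3)     # new sample request
cumulative = session.gather(6)  # revised request: keeps the first 3, adds 3 more
```

Tracking samples by block identifier is what lets the revised request avoid re-reading blocks that were already accessed; only the shortfall between the new target and the existing sample count is fetched.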

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.