Patent · US Expired

Random sampling of rows in a parallel processing database system

US6564221B1 · kind B1 · utility

Cited by: 20
References: 3
Claims: 66
Family size: 0

Assignee

Inventor

Key dates

Filing date: Dec 8, 1999
Grant date: May 13, 2003
Priority date
Expiry date: Dec 8, 2019

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, apparatus, and article of manufacture for random sampling of rows stored in a table, wherein the table has a plurality of partitions. A row count is determined for each of the partitions of the table and a total number of rows in the table is determined from the row count for each of the partitions of the table. A proportional allocation of a sample size is computed for each of the partitions based on the row count and the total number of rows. A sample set of rows of the sample size is retrieved from the table, wherein each of the partitions of the table contributes its proportional allocation of rows to the sample set of rows. Preferably, the computer system is a parallel processing database system, wherein each of its processing units manages a partition of the table, and some of the above steps can be performed in parallel by the processing units.
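
The steps in the abstract (per-partition row counts, a table total, a proportional allocation, then per-partition retrieval) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names `proportional_allocation` and `sample_table` are hypothetical, partitions are modeled as in-memory lists, and the largest-remainder rounding rule is an assumption, since the abstract does not say how fractional allocations are resolved.

```python
import random


def proportional_allocation(row_counts, sample_size):
    """Split sample_size across partitions in proportion to their row counts.

    Rounding uses the largest-remainder rule (an assumption, not taken
    from the abstract) so that the allocations sum exactly to sample_size.
    """
    total = sum(row_counts)  # total rows in the table
    if not 0 <= sample_size <= total:
        raise ValueError("sample_size must be between 0 and the total row count")
    if total == 0:
        return [0] * len(row_counts)

    # Integer arithmetic avoids floating-point rounding surprises.
    floors = [sample_size * c // total for c in row_counts]
    remainders = [sample_size * c % total for c in row_counts]

    # Hand out the leftover rows to the partitions with the largest remainders.
    leftover = sample_size - sum(floors)
    by_remainder = sorted(range(len(row_counts)), key=lambda i: -remainders[i])
    for i in by_remainder[:leftover]:
        floors[i] += 1
    return floors


def sample_table(partitions, sample_size, rng=None):
    """Draw sample_size rows, each partition contributing its allocation."""
    rng = rng or random.Random()
    counts = [len(p) for p in partitions]  # row count per partition
    allocation = proportional_allocation(counts, sample_size)
    sample = []
    for rows, k in zip(partitions, allocation):
        # In a parallel system each processing unit could run this step
        # on its own partition; here it is a plain sequential loop.
        sample.extend(rng.sample(rows, k))
    return sample


print(proportional_allocation([50, 30, 20], 10))  # [5, 3, 2]
```

Because each partition samples only from its own rows, the per-partition step needs no coordination beyond learning its allocation, which is what makes the approach a natural fit for the parallel execution the abstract describes.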

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.