Patent · US Expired

Efficient sampling of a relational database

US6993516B2 · kind B2 · utility

Cited by: 11
References: 14
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 26, 2002
Grant date: Jan 31, 2006
Priority date
Expiry date: Jun 1, 2024

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99936
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system, method and computer readable medium for sampling data from a relational database are disclosed, where an information processing system chooses rows from a table in a relational database for sampling, wherein data values are arranged into rows, rows are arranged into pages, and pages are arranged into tables. Pages are chosen for sampling according to a probability P and rows in a selected page are chosen for sampling according to a probability R, so that the overall probability of choosing a row for sampling is Q=PR. The probabilities P and R are based on the desired precision of estimates computed from a sample, as well as processing speed. The probabilities P and R are further based on either catalog statistics of the relational database or a pilot sample of rows from the relational database.
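
The two-level scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the table is modeled as a plain list of pages (each a list of rows), the function name `bilevel_sample` and the values of P and R are hypothetical, and the sketch does not show how P and R would be derived from precision targets, catalog statistics, or a pilot sample. The scale-up estimate at the end is the usual divide-by-inclusion-probability estimator, added here only for illustration.

```python
import random


def bilevel_sample(pages, p, r, seed=None):
    """Sample rows from `pages` (a list of pages, each a list of rows).

    Each page is kept with probability p; each row on a kept page is
    kept with probability r, so any given row is kept with q = p * r.
    Hypothetical helper for illustration only.
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= r <= 1.0):
        raise ValueError("p and r must lie in [0, 1]")
    rng = random.Random(seed)
    sample = []
    for page in pages:
        # Page-level coin flip: a rejected page is never read row by row.
        if rng.random() >= p:
            continue
        for row in page:
            # Row-level coin flip within a selected page.
            if rng.random() < r:
                sample.append(row)
    return sample


# Toy table (assumed layout): 1,000 pages of 100 integer rows each.
table = [[page_no * 100 + i for i in range(100)] for page_no in range(1000)]

P, R = 0.1, 0.5          # assumed values, chosen arbitrarily
Q = P * R                # overall per-row inclusion probability

rows = bilevel_sample(table, P, R, seed=42)

# Scale the sample sum by 1/Q to estimate the table-wide sum.
estimated_sum = sum(rows) / Q
true_sum = sum(sum(page) for page in table)
print(len(rows), estimated_sum, true_sum)
```

For a fixed Q, a smaller P with a larger R reads fewer pages and so runs faster, but it concentrates the sample in fewer pages, which tends to hurt precision when values on the same page are similar. This is the speed/precision trade-off that the abstract says drives the choice of P and R.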

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.