Efficient sampling of a relational database
US6993516B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 26, 2002 |
| Grant date | Jan 31, 2006 |
| Priority date | — |
| Expiry date | Jun 1, 2024 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99936
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A system, method and computer readable medium for sampling data from a relational database are disclosed, where an information processing system chooses rows from a table in a relational database for sampling, wherein data values are arranged into rows, rows are arranged into pages, and pages are arranged into tables. Pages are chosen for sampling according to a probability P and rows in a selected page are chosen for sampling according to a probability R, so that the overall probability of choosing a row for sampling is Q=PR. The probabilities P and R are based on the desired precision of estimates computed from a sample, as well as processing speed. The probabilities P and R are further based on either catalog statistics of the relational database or a pilot sample of rows from the relational database.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.