Patent · US Expired

Detecting correlation from data

US7647293B2 · kind B2 · utility

Cited by: 39
References: 6
Claims: 13
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 10, 2004
Grant date: Jan 12, 2010
Priority date
Expiry date: Sep 19, 2024

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99932
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method of discovering dependencies between relational database column pairs and application of discoveries to query optimization is provided. For each candidate column pair remaining after simultaneously generating column pairs, pruning pairs not satisfying specified heuristic constraints, and eliminating pairs with trivial instances of correlation, a random sample of data values is collected. A candidate column pair is tested for the existence of a soft functional dependency (FD), and if a dependency is not found, statistically tested for correlation using a robust chi-squared statistic. Column pairs for which either a soft FD or a statistical correlation exists are prioritized for recommendation to a query optimizer, based on any of: strength of dependency, degree of correlation, or adjustment factor; statistics for recommended column pairs are tracked to improve selectivity estimates. Additionally, a dependency graph representing correlations and dependencies as edges and column pairs as nodes is provided.
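
The two-stage test described in the abstract (soft FD first, chi-squared second) might be sketched as follows. This is a minimal illustration, not the patented method. The function names, the strength ratio (distinct values of the first column divided by distinct value pairs), the 0.95 threshold, and the plain Pearson chi-squared statistic are all assumptions made here; the abstract specifies a "robust" chi-squared statistic without giving its form, and it does not define how FD strength is measured. The default critical value 3.841 is the 5% point for one degree of freedom, so a caller with larger contingency tables would need to supply a value matching (rows - 1) × (columns - 1) degrees of freedom.

```python
from collections import Counter


def soft_fd_strength(pairs):
    """Assumed strength measure: |distinct a| / |distinct (a, b)|.

    A value of 1.0 means every a maps to exactly one b in the sample.
    """
    distinct_a = {a for a, _ in pairs}
    distinct_ab = set(pairs)
    return len(distinct_a) / len(distinct_ab)


def chi_squared(pairs):
    """Plain Pearson chi-squared statistic over the sample's contingency table."""
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (observed.get((a, b), 0) - expected) ** 2 / expected
    return stat


def classify_pair(pairs, fd_threshold=0.95, chi2_critical=3.841):
    """Test a sampled column pair: soft FD first, then correlation.

    Both thresholds are hypothetical defaults chosen for illustration.
    Only the direction a -> b is checked for a soft FD.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("sample must contain at least one (a, b) pair")
    if soft_fd_strength(pairs) >= fd_threshold:
        return "soft_fd"
    if chi_squared(pairs) > chi2_critical:
        return "correlated"
    return "independent"


# Usage on small hand-made samples (illustrative data only).
zip_state = [("94301", "CA"), ("94301", "CA"), ("10001", "NY"),
             ("10001", "NY"), ("60601", "IL")]
skewed = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40
uniform = [(0, 0)] * 25 + [(0, 1)] * 25 + [(1, 0)] * 25 + [(1, 1)] * 25

print(classify_pair(zip_state))
print(classify_pair(skewed))
print(classify_pair(uniform))
```

In this sketch the FD check runs first because it is cheaper and a soft FD makes the correlation test redundant, which mirrors the ordering the abstract gives. A fuller version would also test the reverse direction (b determining a) and rank the surviving pairs before recommending them to the optimizer.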

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.