Patent · US Expired

Method and apparatus for determining distinct cardinality dual hash bitmaps

US5802521A · kind A · utility

29Cited by
3References
23Claims
0Family size

Assignee

Inventors

Key dates

Filing dateOct 7, 1996
Grant dateSep 1, 1998
Priority date
Expiry dateOct 7, 2016

Classification

  • Technology area (CPC Y)Emerging Cross-Sectional Technologies
  • CPC primaryY10S707/99943
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method and apparatus for determining distinct cardinality of a data sample using dual hash bitmaps. Two different bitmaps determine the distinct cardinality of the data sample (e.g., a column of data within a data table of a database). A small sized bitmap, M*sqrt(C*K), is used where M is a programmable value that reduces collision error, C is the size of the column ("data sample size"), and K is a key density value. Once selected, both M and K are constant. Sample entry values are hashed by a hash function and a modulo function determines an entry into the first bitmap. Based on the bitmap's bit entries, a first counter is updated, or not, to maintain a first distinct cardinality value. A large bitmap is used having a size, M*C, but only a small fraction is actually used, M*sqrt(C*K). Only hashed column entries falling inside the fraction are processed as above to maintain a second counter. At the end of the data sample entry processing, the second counter is extrapolated to the large bitmap size. Expected collision error compensation is computed and compensated for the first and second counters. Distribution error is computed for the second counter and added with its compensate…

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.