Optimized full-spectrum loglog-based cardinality estimation
US10853362B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 18, 2016 |
| Grant date | Dec 1, 2020 |
| Priority date | — |
| Expiry date | May 15, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/2255
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods are disclosed for optimizing full-spectrum cardinality approximations on big data using an optimized LogLog counting technique. A multiset of objects, each corresponding to one of a plurality of objects associated with a resource, is obtained. A compound data object is populated at least in part with data derived from hash values generated for each object in the obtained multiset. The populated compound data object is processed with a full-spectrum harmonic mean estimation operation that can accurately determine a cardinality estimate for the obtained multiset using fewer resources and less time than traditional techniques. The determination is made without employing linear counting or bias correction operations at low or high cardinalities. An estimated number of unique objects in the obtained multiset is determined as a result of the processing and subsequently provided for display or further manipulation.
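The flow described in the abstract (hash each object, populate a register array, then apply a harmonic-mean estimate) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not give the full-spectrum correction term, so the code below uses the standard HyperLogLog raw estimator as a stand-in. The function names, the choice of SHA-1 as the hash, and the precision parameter `p` are all assumptions made for illustration.

```python
import hashlib


def populate_registers(items, p=12):
    """Fill 2**p registers with the largest rank seen per bucket.

    Assumption: SHA-1 truncated to 64 bits stands in for the hash function;
    the abstract does not name one.
    """
    m = 1 << p
    width = 64 - p                      # bits left after the bucket index
    registers = [0] * m
    for item in items:
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")   # 64-bit hash value
        idx = h >> width                        # top p bits pick the register
        rest = h & ((1 << width) - 1)           # remaining bits
        # Rank = position of the leftmost 1-bit in `rest` (width + 1 if none).
        rank = width - rest.bit_length() + 1
        if rank > registers[idx]:
            registers[idx] = rank
    return registers


def raw_harmonic_mean_estimate(registers):
    """Standard HyperLogLog raw estimate (no linear counting, no bias table).

    This is the textbook estimator, not the full-spectrum operation the
    patent claims; it is known to overestimate at low cardinalities.
    """
    m = len(registers)
    alpha = 0.7213 / (1 + 1.079 / m)    # standard constant for m >= 128
    return alpha * m * m / sum(2.0 ** -r for r in registers)


if __name__ == "__main__":
    # A multiset: every object appears three times.
    multiset = [f"user-{i}" for i in range(100_000)] * 3
    regs = populate_registers(multiset)
    print(round(raw_harmonic_mean_estimate(regs)))
```

Because each register keeps only a maximum, repeated objects leave the register array unchanged, which is why the estimate tracks the number of unique objects rather than the multiset size. With 4,096 registers the typical relative error of this estimator at mid-range cardinalities is on the order of a few percent.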
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.