Patent · US Active

Efficient data infrastructure for high dimensional data analysis

US7870114B2 · kind B2 · utility

Cited by: 7
References: 65
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 15, 2007
Grant date: Jan 11, 2011
Priority date
Expiry date: Jul 4, 2028

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/283
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Described is a technology by which high dimensional source data corresponding to rows of records with identifiers, and columns comprising dimensions of data values, are processed into a file model for efficient access. An inverted index corresponding to any dimension is built by mapping data from raw dimension values to mapped values based on mapping entries in a dimension table. The record identifiers are arranged into subgroups based on their mapped value; a count and/or an offset may be maintained for locating each of the subgroups. The raw values for a dimension are maintained within a raw value file. For sparse data, the raw value file may be compressed, e.g., by excluding nulls and associating a record identifier with each non-null. A data manager provides access to data in the data files, such as by offering various functions, using caching for efficiency.
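The structures named in the abstract can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the function names (build_inverted_index, lookup, compress_sparse), the use of Python lists and dicts in place of on-disk files, and the choice to skip null values when building the index are all assumptions made for illustration only.

```python
from collections import defaultdict

def build_inverted_index(records, dimension_table):
    """Group record identifiers by mapped dimension value.

    records: list of (record_id, raw_value) pairs for one dimension.
    dimension_table: dict mapping each raw value to a mapped value.
    Returns (ids, directory), where ids is a flat list of record ids
    arranged in subgroups and directory maps each mapped value to the
    (offset, count) locating its subgroup inside ids.
    """
    groups = defaultdict(list)
    for record_id, raw in records:
        if raw is None:  # assumption: nulls are left out of the index
            continue
        groups[dimension_table[raw]].append(record_id)

    ids, directory = [], {}
    for mapped in sorted(groups):
        members = groups[mapped]
        directory[mapped] = (len(ids), len(members))  # (offset, count)
        ids.extend(members)
    return ids, directory

def lookup(ids, directory, mapped):
    """Return the record ids whose mapped value equals `mapped`."""
    offset, count = directory.get(mapped, (0, 0))
    return ids[offset:offset + count]

def compress_sparse(records):
    """Sparse raw value storage: keep only (record_id, value) for non-nulls."""
    return [(record_id, raw) for record_id, raw in records if raw is not None]

# Hypothetical sample data for a single "color" dimension.
records = [(1, "red"), (2, None), (3, "blue"), (4, "red"), (5, None)]
dimension_table = {"red": 0, "blue": 1}

ids, directory = build_inverted_index(records, dimension_table)
print(ids)                        # [1, 4, 3]
print(directory)                  # {0: (0, 2), 1: (2, 1)}
print(lookup(ids, directory, 0))  # [1, 4]
print(compress_sparse(records))   # [(1, 'red'), (3, 'blue'), (4, 'red')]
```

Storing an offset and count per mapped value lets a query read one contiguous subgroup of record identifiers instead of scanning every row, which is the access pattern the abstract describes.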

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.