Patent · US Active

Parallel processing of data sets

US8868470B2 · kind B2 · utility

3Cited by
0References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateNov 9, 2010
Grant dateOct 21, 2014
Priority date
Expiry dateAug 17, 2032

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06F9/5061
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.