Parallel processing of data sets
US8868470B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 9, 2010 |
| Grant date | Oct 21, 2014 |
| Priority date | — |
| Expiry date | Aug 17, 2032 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F9/5061
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.
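The partition, local-count, and aggregate flow described in the abstract can be illustrated with a minimal sketch. This is not the patented method: it omits the Gibbs sampling and variational updates entirely, uses a thread pool as a stand-in for multiple processors or GPUs, and assumes each document is already a list of (word, topic) assignments. The function names `partition`, `local_counts`, and `global_counts` and the round-robin split are hypothetical choices made for illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(docs, num_parts):
    """Split documents round-robin into num_parts partitions (assumed scheme)."""
    return [docs[i::num_parts] for i in range(num_parts)]


def local_counts(part):
    """Count (word, topic) assignments within a single partition."""
    counts = Counter()
    for doc in part:
        for word, topic in doc:
            counts[(word, topic)] += 1
    return counts


def global_counts(docs, num_parts=2):
    """Compute local counts per partition concurrently, then sum them."""
    parts = partition(docs, num_parts)
    # Threads stand in for the separate processors named in the abstract.
    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        per_partition = list(pool.map(local_counts, parts))
    total = Counter()
    for counts in per_partition:
        total.update(counts)  # Counter.update adds counts together
    return total


# Toy data: each document is a list of (word, topic) assignments.
docs = [
    [("gpu", 0), ("kernel", 0), ("topic", 1)],
    [("topic", 1), ("gpu", 0)],
    [("model", 1), ("topic", 1)],
]
result = global_counts(docs, num_parts=2)
```

Because the aggregation step is a plain sum, the global count does not depend on how the documents were partitioned; in a real sampler the partitioning scheme would matter for load balance and for limiting conflicts between processors.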
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.