Patent · US Active

Scalable probabilistic latent semantic analysis

US7844449B2 · kind B2 · utility

Cited by: 40
References: 13
Claims: 10
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 30, 2006
Grant date: Nov 30, 2010
Priority date:
Expiry date: Mar 21, 2029

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/30
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of on the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
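The two passes described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not say how objects are clustered or how the first-pass results are extended, so the sketch assumes that group assignments are supplied by some external clustering step, that the first pass is standard EM for asymmetric PLSA on per-group word counts, and that the second pass is a "folding-in" EM that keeps P(word | class) fixed and starts each object from its group's P(class | group). All function and parameter names (`plsa`, `first_pass`, `second_pass`, `groups`, `top_k`) are hypothetical.

```python
import random

def normalize(v):
    """Scale a non-negative vector to sum to 1 (uniform if it sums to 0)."""
    s = sum(v)
    return [x / s for x in v] if s > 0 else [1.0 / len(v)] * len(v)

def plsa(counts, n_topics, n_iter=50, seed=0):
    """EM for asymmetric PLSA on a row-by-word count matrix.
    Returns (P(class | row), P(word | class))."""
    rng = random.Random(seed)
    n_rows, n_words = len(counts), len(counts[0])
    p_z_r = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_rows)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_z_r = [[0.0] * n_topics for _ in range(n_rows)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for r, row in enumerate(counts):
            for w, n in enumerate(row):
                if n == 0:
                    continue  # sparse data: skip unobserved pairs
                # E-step: posterior over latent classes for this (row, word)
                post = normalize([p_z_r[r][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # M-step accumulators, weighted by the observed count
                for z in range(n_topics):
                    new_z_r[r][z] += n * post[z]
                    new_w_z[z][w] += n * post[z]
        p_z_r = [normalize(v) for v in new_z_r]
        p_w_z = [normalize(v) for v in new_w_z]
    return p_z_r, p_w_z

def first_pass(counts, groups, n_groups, n_topics, n_iter=50):
    """Pass 1: sum object counts per group, then train PLSA on the groups.
    `groups[i]` is the (assumed, externally computed) group id of object i."""
    n_words = len(counts[0])
    group_counts = [[0] * n_words for _ in range(n_groups)]
    for row, g in zip(counts, groups):
        for w, n in enumerate(row):
            group_counts[g][w] += n
    return plsa(group_counts, n_topics, n_iter=n_iter)

def second_pass(counts, groups, p_z_g, p_w_z, n_iter=20, top_k=1):
    """Pass 2 (assumed folding-in): estimate P(class | object) with
    P(word | class) held fixed, starting from the object's group."""
    n_topics = len(p_w_z)
    results = []
    for row, g in zip(counts, groups):
        p_z_d = list(p_z_g[g])
        for _ in range(n_iter):
            acc = [0.0] * n_topics
            for w, n in enumerate(row):
                if n == 0:
                    continue
                post = normalize([p_z_d[z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                for z in range(n_topics):
                    acc[z] += n * post[z]
            p_z_d = normalize(acc)
        # Most likely latent classes for this object
        top = sorted(range(n_topics), key=lambda z: -p_z_d[z])[:top_k]
        results.append((p_z_d, top))
    return results

# Toy usage: four objects, four words, two groups, two latent classes.
toy_counts = [[4, 2, 0, 0], [3, 3, 0, 0], [0, 0, 5, 1], [0, 0, 2, 4]]
toy_groups = [0, 0, 1, 1]
p_z_g, p_w_z = first_pass(toy_counts, toy_groups, n_groups=2, n_topics=2)
for p_z_d, top in second_pass(toy_counts, toy_groups, p_z_g, p_w_z):
    print([round(p, 3) for p in p_z_d], top)
```

In this sketch the expensive EM loop runs over groups rather than individual objects, which is where the storage and computation savings claimed in the abstract would come from when there are far fewer groups than objects.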

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.