Patent · US Active

Stochastic document clustering using rare features

US9256669B2 · kind B2 · utility

Cited by: 1
References: 1
Claims: 17
Family size: 0

Assignee

Inventor

Key dates

Filing date: Nov 15, 2013
Grant date: Feb 9, 2016
Priority date
Expiry date: May 22, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/358
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems, methods, and apparatus for clustering resources using rare features are provided. For example, an environment includes an extraction module, an index module, and a cluster module. The extraction module identifies a set of resources and extracts a plurality of features from the resources. The plurality of features may be rare features. The index module identifies and generates a rare features index. The cluster module identifies at least two resources that share rare features, creates one or more clusters based on the identified at least two resources, and associates resources that share similar features with the one or more clusters. Resources that do not share similar features are not associated with the one or more clusters. Identifying at least two resources that share rare features is based at least upon a threshold.
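
The three-module flow in the abstract (extract features, index the rare ones, cluster resources that share enough of them) can be illustrated with a minimal Python sketch. It is not the patented method. The function names, the word-level features, the document-frequency definition of "rare" (`max_doc_freq`), and the shared-feature threshold (`min_shared`) are all assumptions made for illustration, and the sketch is deterministic, so it does not attempt the stochastic aspect named in the title.

```python
from collections import defaultdict
from itertools import combinations


def extract_features(text):
    """Extraction step. Assumption: a feature is a lowercase word."""
    return set(text.lower().split())


def build_rare_index(resources, max_doc_freq=2):
    """Index step: map each rare feature to the resources containing it.

    Assumption: a feature is "rare" if it occurs in at least 2 and at
    most max_doc_freq resources (features seen once cannot be shared).
    """
    postings = defaultdict(set)
    for rid, text in resources.items():
        for feature in extract_features(text):
            postings[feature].add(rid)
    return {f: ids for f, ids in postings.items()
            if 2 <= len(ids) <= max_doc_freq}


def cluster(resources, max_doc_freq=2, min_shared=2):
    """Cluster step: join resources sharing >= min_shared rare features."""
    index = build_rare_index(resources, max_doc_freq)

    # Count rare features shared by each pair of resources.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    # Merge pairs that meet the threshold, using a small union-find.
    parent = {rid: rid for rid in resources}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        if count >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for rid in resources:
        groups[find(rid)].add(rid)
    # Resources that met the threshold with no other resource stay unclustered.
    return [g for g in groups.values() if len(g) >= 2]


# Hypothetical toy data, not taken from the patent.
resources = {
    "a": "the zebra quokka sleeps",
    "b": "the zebra quokka eats",
    "c": "the heron eats",
    "d": "the otter sleeps",
}
clusters = cluster(resources)
```

In this toy data, "a" and "b" share two rare words ("zebra" and "quokka") and form a cluster. "c" and "d" each share only one rare word with another resource, so they fall below the threshold and are not associated with any cluster. The common word "the" is excluded from the index because it appears in every resource.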

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.