Patent · US Active

Stochastic document clustering using rare features

US9754023B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Feb 8, 2016
Grant date: Sep 5, 2017
Priority date
Expiry date: Feb 8, 2036

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/358
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems, methods, and apparatus for clustering resources using rare features are provided. For example, an environment includes an extraction module, an index module, and a cluster module. The extraction module identifies a set of resources and extracts a plurality of features from the resources. The plurality of features may be rare features. The index module identifies and generates a rare features index. The cluster module identifies at least two resources that share rare features, creates one or more clusters based on the identified at least two resources, and associates resources that share similar features with the one or more clusters. Resources that do not share similar features are not associated with the one or more clusters. Identifying at least two resources that share rare features is based at least upon a threshold.
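
The three-module flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: every function name and parameter is hypothetical, features are assumed to be whitespace tokens, a feature is assumed to be "rare" when it appears in at most `max_doc_freq` resources, and the threshold is assumed to be a minimum count of shared rare features (`min_shared`). Merging qualifying pairs with union-find is an illustrative choice, and the stochastic aspect named in the title is not modeled.

```python
from collections import defaultdict
from itertools import combinations


def extract_features(text):
    """Extraction step (assumed): lowercase whitespace tokens as features."""
    return set(text.lower().split())


def build_rare_index(resources, max_doc_freq=2):
    """Index step: map each rare feature to the ids of resources containing it.

    Assumed definition: a feature is rare if it occurs in at most
    max_doc_freq resources. Features found in a single resource are
    dropped because they cannot be shared.
    """
    index = defaultdict(set)
    for rid, text in resources.items():
        for feature in extract_features(text):
            index[feature].add(rid)
    return {f: ids for f, ids in index.items() if 2 <= len(ids) <= max_doc_freq}


def cluster_resources(resources, max_doc_freq=2, min_shared=2):
    """Cluster step: group resources that share at least min_shared rare features."""
    rare_index = build_rare_index(resources, max_doc_freq)

    # Count shared rare features for every pair of resources.
    shared = defaultdict(int)
    for ids in rare_index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    # Union-find: merge pairs that meet the threshold (illustrative choice).
    parent = {rid: rid for rid in resources}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        if count >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for rid in resources:
        groups[find(rid)].add(rid)

    # Resources that share too few rare features stay out of every cluster.
    return [group for group in groups.values() if len(group) >= 2]


if __name__ == "__main__":
    docs = {
        "a": "the report on zephyr quokka sightings",
        "b": "the zephyr quokka field report",
        "c": "the report on city weather",
        "d": "annual budget summary",
    }
    print(cluster_resources(docs))
```

In this made-up corpus, "the" and "report" occur in three resources and so are not rare under the assumed cutoff; "a" and "b" share two rare features ("zephyr", "quokka") and form a cluster, while "c" shares only one rare feature with "a" and "d" shares none, so both remain unclustered.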

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.