Stochastic document clustering using rare features
US9754023B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Feb 8, 2016 |
| Grant date | Sep 5, 2017 |
| Priority date | — |
| Expiry date | Feb 8, 2036 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/358
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methods, and apparatus for clustering resources using rare features are provided. For example, an environment includes an extraction module, an index module, and a cluster module. The extraction module identifies a set of resources and extracts a plurality of features from the resources. The plurality of features may be rare features. The index module identifies and generates a rare features index. The cluster module identifies at least two resources that share rare features, creates one or more clusters based on the identified at least two resources, and associates resources that share similar features with the one or more clusters. Resources that do not share similar features are not associated with the one or more clusters. Identifying at least two resources that share rare features is based at least upon a threshold.
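The three modules in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the whitespace tokenizer, the definition of "rare" as a document-frequency cap (`max_doc_freq`), and the pairwise threshold (`min_shared`) are all assumptions made for illustration. The sketch is also deterministic and does not attempt any stochastic step suggested by the title.

```python
from collections import defaultdict
from itertools import combinations


def extract_features(text):
    """Hypothetical extraction step: lowercase whitespace tokens as features."""
    return set(text.lower().split())


def build_rare_index(resources, max_doc_freq=2):
    """Map each rare feature to the ids of the resources that contain it.

    Assumed definition: a feature is rare if it occurs in at most
    max_doc_freq resources. Features found in only one resource are
    dropped because they cannot link two resources.
    """
    postings = defaultdict(set)
    for rid, text in resources.items():
        for feature in extract_features(text):
            postings[feature].add(rid)
    return {f: ids for f, ids in postings.items() if 2 <= len(ids) <= max_doc_freq}


def cluster_by_rare_features(resources, max_doc_freq=2, min_shared=2):
    """Group resources that share at least min_shared rare features."""
    index = build_rare_index(resources, max_doc_freq)

    # Count the rare features shared by each pair of resources.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    # Union-find merges pairs that meet the threshold into clusters.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        if count >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for rid in parent:
        groups[find(rid)].add(rid)
    # Resources that never met the threshold are absent from every cluster.
    return sorted(sorted(group) for group in groups.values())


resources = {
    "d1": "server log zymurgy quokka error",
    "d2": "server report zymurgy quokka error",
    "d3": "server log report error timeout",
    "d4": "server log report error",
}
clusters = cluster_by_rare_features(resources)
```

In this toy input only `d1` and `d2` share two rare tokens, so they form the single cluster, while `d3` and `d4` overlap only on common tokens and stay unassociated. Raising `min_shared` tightens the threshold, and raising `max_doc_freq` widens what counts as rare.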
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.