Classification of clustered documents based on similarity scores
US8543576B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | May 23, 2012 |
| Grant date | Sep 24, 2013 |
| Priority date | — |
| Expiry date | May 23, 2032 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/353
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Among other disclosed subject matter, a computer-implemented method includes receiving a set of clusters of documents and calculating a similarity score for each cluster, wherein the similarity score is based at least in part on features included in the documents in the cluster and indicates a measure of similarity of the documents in the cluster. Each cluster whose similarity score is greater than a first threshold is identified as satisfying a quality assurance requirement. Each cluster whose similarity score is less than a second threshold is identified as failing the quality assurance requirement. For each cluster whose similarity score is less than or equal to the first threshold and greater than or equal to the second threshold, at least a subset of the documents in the cluster is reviewed to determine whether the cluster satisfies the quality assurance requirement.
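The three-way decision described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how the similarity score is computed, so the sketch assumes the mean pairwise Jaccard similarity over each document's feature set. The function names (`jaccard`, `cluster_similarity`, `triage`) and the default threshold values 0.8 and 0.3 are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two feature sets (assumed metric, not from the patent)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_similarity(docs: List[Set[str]]) -> float:
    """Mean pairwise similarity of the documents in one cluster.

    A cluster with fewer than two documents has no pairs and is scored 1.0
    (an assumption made for this sketch).
    """
    pairs = list(combinations(docs, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def triage(
    clusters: Dict[str, List[Set[str]]],
    first_threshold: float = 0.8,   # hypothetical value
    second_threshold: float = 0.3,  # hypothetical value
) -> Dict[str, str]:
    """Label each cluster 'pass', 'fail', or 'review' by its similarity score."""
    labels = {}
    for name, docs in clusters.items():
        score = cluster_similarity(docs)
        if score > first_threshold:
            labels[name] = "pass"      # satisfies the quality assurance requirement
        elif score < second_threshold:
            labels[name] = "fail"      # fails the quality assurance requirement
        else:
            labels[name] = "review"    # between thresholds: review a subset of documents
    return labels
```

Calling `triage` with a dictionary that maps cluster names to lists of feature sets returns one label per cluster. Only the clusters labeled `review` would need their documents inspected, which is the point of using two thresholds instead of one.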
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.