Patent · US Active

Classification of clustered documents based on similarity scores

US8543576B1 · kind B1 · utility

Cited by: 13
References: 0
Claims: 24
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 23, 2012
Grant date: Sep 24, 2013
Priority date
Expiry date: May 23, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/353
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Among other disclosed subject matter, a computer-implemented method includes receiving a set of clusters of documents and calculating a similarity score for each cluster, wherein the similarity score is based at least in part on features included in the documents in the cluster and indicates a measure of similarity of the documents in the cluster. Each cluster with a similarity score greater than a first threshold is identified as satisfying a quality assurance requirement. Each cluster with a similarity score less than a second threshold is identified as failing the quality assurance requirement. For each cluster with a similarity score less than or equal to the first threshold and greater than or equal to the second threshold, at least a subset of the documents in the cluster is reviewed to determine whether the cluster satisfies the quality assurance requirement.
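
The three-way decision described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how the similarity score is computed, so the sketch assumes each document is a set of features and scores a cluster by its mean pairwise Jaccard similarity. The names `jaccard`, `cluster_similarity`, and `triage`, and the threshold values 0.8 and 0.3, are hypothetical choices made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (an assumed measure, not from the patent)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_similarity(docs):
    """Mean pairwise Jaccard similarity across the documents of one cluster."""
    pairs = list(combinations(docs, 2))
    if not pairs:
        return 1.0  # assumption: a cluster with fewer than two documents is trivially similar
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def triage(clusters, first_threshold=0.8, second_threshold=0.3):
    """Label each cluster 'pass', 'fail', or 'review' from its similarity score.

    Threshold defaults are hypothetical; the abstract only requires two thresholds.
    """
    result = {}
    for name, docs in clusters.items():
        score = cluster_similarity(docs)
        if score > first_threshold:
            result[name] = "pass"    # satisfies the quality assurance requirement
        elif score < second_threshold:
            result[name] = "fail"    # fails the quality assurance requirement
        else:
            result[name] = "review"  # between thresholds (inclusive): review a subset of documents
    return result


clusters = {
    "a": [{"x", "y"}, {"x", "y"}],  # identical feature sets
    "b": [{"x"}, {"y"}],            # disjoint feature sets
    "c": [{"x", "y"}, {"x", "z"}],  # partial overlap
}
labels = triage(clusters)
```

In this sketch only clusters labeled `review` would need their documents examined, which reflects the abstract's point that clusters scoring clearly above or below the thresholds are decided from the score alone.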

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.