Patent · US Active

Efficient document clustering

US8200670B1 · kind B1 · utility

Cited by: 16
References: 0
Claims: 32
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 31, 2008
Grant date: Jun 12, 2012
Priority date
Expiry date: Aug 15, 2029

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/355
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer program products, for clustering documents. A plurality of documents are identified from a set of documents, where the identified documents have the same top N terms by term frequency score for an integer N. A pattern string that is satisfied by at least a subset of the identified documents is identified. A document cluster is formed from at least the subset of the identified documents.
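The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions the abstract does not state: the term frequency score is taken to be a raw word count, "same top N terms" is read as the same unordered set of terms, and the pattern string is taken to be a wildcard over a document identifier (everything up to the last "/", followed by "*"). The names top_terms, candidate_pattern, and cluster_documents are hypothetical.

```python
from collections import Counter, defaultdict
from fnmatch import fnmatchcase


def top_terms(text, n):
    """Return the n most frequent terms (assumption: score = raw count)."""
    counts = Counter(text.lower().split())
    # Sort by descending count, then alphabetically, so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Assumption: "same top N terms" means the same unordered set.
    return frozenset(term for term, _ in ranked[:n])


def candidate_pattern(doc_id):
    """Hypothetical pattern: the identifier up to its last '/', then '*'."""
    head, sep, _ = doc_id.rpartition("/")
    return head + sep + "*"


def cluster_documents(docs, n):
    """docs maps identifier -> text; returns a list of (pattern, member ids)."""
    # Step 1: identify documents that share the same top-n terms.
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        groups[top_terms(text, n)].append(doc_id)

    clusters = []
    for ids in groups.values():
        if len(ids) < 2:
            continue
        # Step 2: pick the pattern string satisfied by the most documents.
        pattern, _ = Counter(candidate_pattern(i) for i in ids).most_common(1)[0]
        # Step 3: form the cluster from the subset that satisfies the pattern.
        members = sorted(i for i in ids if fnmatchcase(i, pattern))
        if len(members) >= 2:
            clusters.append((pattern, members))
    return clusters
```

In this sketch a document that shares the top terms but does not satisfy the chosen pattern is left out of the cluster, which mirrors the abstract's "at least a subset" wording. Identifiers containing wildcard characters such as "?" or "[" would need escaping before being used with fnmatchcase.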

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.