Patent · US Active

System and method for clustering unstructured documents

US7809727B2 · kind B2 · utility

Cited by: 13
References: 34
Claims: 16
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 24, 2007
Grant date: Oct 5, 2010
Priority date
Expiry date: Jan 12, 2029

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99945
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for clustering unstructured documents is provided. Documents having terms with frequencies of occurrence that satisfy upper and lower edge conditions are selected. Concepts are generated for the selected documents. The selected documents are grouped into clusters of the documents. A weight for each of the clusters is evaluated. A similarity value is determined from the frequencies of occurrence for at least one of the terms from the concepts and the cluster weights for each selected document. Each selected document is assigned into one such cluster based on the similarity value of the selected document.
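
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not define the edge conditions, the concepts, the cluster weights, or the similarity measure, so every concrete choice below is an assumption made for illustration. The edge conditions are taken to be bounds on how many documents a term appears in, a concept is a document's term frequencies restricted to the terms that pass, a cluster weight is the accumulated term frequencies of the cluster's members, and similarity is cosine similarity. The names cluster_documents, lower, upper, and threshold are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Count whitespace-separated, lower-cased terms in one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight mappings."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_documents(docs, lower=2, upper=3, threshold=0.5):
    """Return {document index: cluster index} for the selected documents."""
    freqs = [term_frequencies(d) for d in docs]

    # Number of documents in which each term occurs.
    doc_freq = Counter()
    for f in freqs:
        doc_freq.update(f.keys())

    # ASSUMPTION: the edge conditions are bounds on document frequency.
    kept = {t for t, n in doc_freq.items() if lower <= n <= upper}

    # ASSUMPTION: a "concept" is the document's frequencies over kept terms.
    # Documents with no kept terms are not selected.
    concepts = {}
    for i, f in enumerate(freqs):
        concept = {t: n for t, n in f.items() if t in kept}
        if concept:
            concepts[i] = concept

    # ASSUMPTION: a cluster's weight is the summed frequencies of its members.
    cluster_weights = []
    assignment = {}
    for i, concept in concepts.items():
        sims = [cosine(concept, w) for w in cluster_weights]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is None or sims[best] < threshold:
            # No existing cluster is similar enough, so start a new one.
            cluster_weights.append(Counter())
            best = len(cluster_weights) - 1
        cluster_weights[best].update(concept)
        assignment[i] = best
    return assignment
```

Calling cluster_documents on a list of strings returns a mapping from each selected document's index to a cluster index; documents whose terms all fall outside the assumed bounds are left out of the mapping. The single greedy pass keeps the sketch short, and a fuller implementation would follow whatever the patent's claims actually specify.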

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.