Patent · US · Expired

Computer method and apparatus for clustering documents and automatic generation of cluster keywords

US5857179A · kind A · utility

Cited by: 291
References: 6
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 9, 1996
Grant date: Jan 5, 1999
Priority date
Expiry date: Sep 9, 2016

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99932
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer method and apparatus determines keywords of documents. An initial document-by-term matrix is formed, each document being represented by a respective M-dimensional vector, where M represents the number of terms or words in a predetermined domain of documents. The dimensionality of the initial matrix is reduced to form resultant vectors of the documents. The resultant vectors are then clustered such that correlated documents are grouped into respective clusters. For each cluster, the terms having the greatest impact on the documents in that cluster are identified. The identified terms represent keywords of each document in that cluster. Further, the identified terms form a cluster summary indicative of the documents in that cluster.
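The four steps in the abstract (matrix, reduction, clustering, keyword selection) can be illustrated with a minimal pure-Python sketch. It is not the patented method: the abstract does not name a reduction technique, a clustering algorithm, or a definition of term "impact". The sketch assumes power iteration with deflation for the reduction, k-means with farthest-point seeding for the clustering, and summed term counts within a cluster as the impact score. The sample documents and all function names are hypothetical.

```python
import math

def build_matrix(docs):
    """Document-by-term count matrix: one M-dimensional row per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    col = {w: j for j, w in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0.0] * len(vocab)
        for w in d.lower().split():
            row[col[w]] += 1.0
        rows.append(row)
    return rows, vocab

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_directions(rows, k, iters=100):
    """Assumed reduction step: top-k eigenvectors of A^T A by power iteration."""
    m = len(rows[0])
    dirs = []
    for j in range(k):
        # Fixed, non-random start vector so the run is reproducible.
        v = [1.0 + 0.1 * ((i + j) % 3) for i in range(m)]
        for _ in range(iters):
            for u in dirs:  # deflation: drop directions already found
                p = dot(v, u)
                v = [x - p * y for x, y in zip(v, u)]
            av = [dot(r, v) for r in rows]  # A v
            v = [sum(w * r[t] for w, r in zip(av, rows)) for t in range(m)]  # A^T (A v)
            norm = math.sqrt(dot(v, v))
            v = [x / norm for x in v]
        dirs.append(v)
    return dirs

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Assumed clustering step: k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return labels

def cluster_keywords(rows, vocab, labels, top_n=3):
    """Assumed impact score: total count of each term over the cluster's documents."""
    keywords = {}
    for c in sorted(set(labels)):
        totals = [sum(r[t] for r, l in zip(rows, labels) if l == c)
                  for t in range(len(vocab))]
        ranked = sorted(range(len(vocab)), key=lambda t: -totals[t])  # stable on ties
        keywords[c] = [vocab[t] for t in ranked[:top_n]]
    return keywords

# Hypothetical toy corpus: two finance documents, two football documents.
docs = [
    "stock market trading prices",
    "market prices stock shares",
    "football match goal team",
    "team goal football league",
]
rows, vocab = build_matrix(docs)
dirs = top_directions(rows, k=2)
reduced = [[dot(r, u) for u in dirs] for r in rows]  # resultant 2-D vectors
labels = kmeans(reduced, k=2)
keywords = cluster_keywords(rows, vocab, labels)
print(labels, keywords)
```

On this toy corpus the reduced vectors of the two finance documents nearly coincide, as do those of the two football documents, so the clusters separate cleanly. Each cluster's top terms then serve as both per-document keywords and a cluster summary, in the sense the abstract describes.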

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.