Patent · US Active

System and method for clustering documents

US8046363B2 · kind B2 · utility

Cited by: 303
References: 12
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 10, 2007
Grant date: Oct 25, 2011
Priority date
Expiry date: Dec 13, 2027

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/937
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Provided are a system and method of clustering documents. The system includes a document database (DB), a document feature writing unit, a document retrieving unit, a clustering unit, and a cluster DB. The document DB stores documents. The document feature writing unit extracts attribute information from the documents stored in the document DB and writes an index for each document on the basis of that attribute information. The document retrieving unit uses the indexes to retrieve documents that include a query input by a user. The clustering unit includes a representative vector calculator, which calculates feature vectors and a representative vector of the retrieved documents, and a similarity calculator, which calculates similarities between the documents using the feature vectors and the representative vector. The cluster DB stores the documents clustered by the clustering unit.
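
The pipeline in the abstract (index, retrieve, compute vectors, compare) can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how feature vectors, the representative vector, or similarities are computed, so the sketch assumes raw term counts as feature vectors, the mean of those vectors as the representative vector, and cosine similarity. All function names, the "near"/"far" grouping, and the threshold value are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for "attribute information".
    return text.lower().split()


def build_index(docs):
    """Inverted index mapping each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


def feature_vector(text):
    # Assumption: raw term counts; the abstract does not specify a weighting.
    return Counter(tokenize(text))


def representative_vector(vectors):
    """Assumption: the representative vector is the mean of the feature vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: count / len(vectors) for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, query, threshold=0.8):
    """Split retrieved documents by similarity to the representative vector.

    The two-group split and the threshold are illustrative assumptions only.
    """
    hits = sorted(retrieve(build_index(docs), query))
    clusters = {"near": [], "far": []}
    if not hits:
        return clusters
    vectors = {doc_id: feature_vector(docs[doc_id]) for doc_id in hits}
    rep = representative_vector(list(vectors.values()))
    for doc_id in hits:
        key = "near" if cosine(vectors[doc_id], rep) >= threshold else "far"
        clusters[key].append(doc_id)
    return clusters
```

Calling `cluster(docs, "apple")` on a dict of `{doc_id: text}` returns the retrieved ids grouped by how close each document lies to the mean vector. A fuller implementation would presumably persist these groups, as the cluster DB does in the abstract.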

Source: USPTO / EPO open patent data (bibliographic data and citation counts).