Patent · US Expired

Method and apparatus for almost-constant-time clustering of arbitrary corpus subsets

US6038557A · kind A · utility

Cited by: 6
References: 4
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Jan 26, 1998
Grant date: Mar 14, 2000
Priority date
Expiry date: Jan 26, 2018

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99935
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and apparatus for almost-constant-time re-clustering of corpus subsets, with a customizable time/precision tradeoff, is usable in a basic browsing method such as Scatter/Gather to successfully partition a large document collection into clusters of related documents. The user is first presented with a clustering of the entire corpus into metadocuments, from which the worst metadocument is selected and replaced with its "children". Children containing no documents of interest are pruned, and the remaining metadocuments are further expanded until a predetermined number of children metadocuments is obtained. The resulting metadocuments are then reclustered. The process is repeated until the user obtains the desired degree of specificity.
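
The expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `MetaDoc`, `expand_frontier`, `interest`, `target`, and `badness` are hypothetical, and the abstract does not say how the "worst" metadocument is chosen, so the sketch assumes it is the one covering the most documents. The final reclustering of the resulting metadocuments is left out.

```python
from dataclasses import dataclass, field


@dataclass
class MetaDoc:
    """Hypothetical node in a precomputed cluster hierarchy."""
    name: str
    docs: frozenset                     # ids of all documents under this node
    children: list = field(default_factory=list)


def expand_frontier(roots, interest, target, badness):
    """Expand metadocuments until `target` of them remain or none can be split.

    `interest` is the set of document ids in the subset being browsed.
    `badness` scores a metadocument; the highest score is expanded first
    (an assumption, since the abstract leaves the criterion open).
    """
    # Start from the top-level clustering, pruning anything irrelevant.
    frontier = [m for m in roots if m.docs & interest]
    while len(frontier) < target:
        expandable = [m for m in frontier if m.children]
        if not expandable:
            break                       # only leaves remain
        worst = max(expandable, key=badness)
        # Replace the worst metadocument with its children ...
        frontier = [m for m in frontier if m is not worst]
        # ... and prune children holding no documents of interest.
        frontier.extend(c for c in worst.children if c.docs & interest)
    return frontier


# Toy hierarchy: two top-level metadocuments, each with two children.
a = MetaDoc("A", frozenset({1, 2, 3, 4}),
            [MetaDoc("A1", frozenset({1, 2})), MetaDoc("A2", frozenset({3, 4}))])
b = MetaDoc("B", frozenset({5, 6}),
            [MetaDoc("B1", frozenset({5})), MetaDoc("B2", frozenset({6}))])

result = expand_frontier([a, b], interest={1, 2, 5}, target=3,
                         badness=lambda m: len(m.docs))
print([m.name for m in result])         # ['A1', 'B1']
```

In this toy run the loop stops short of `target` because pruning leaves only leaf metadocuments. The work done depends on the number of metadocuments expanded rather than on the size of the corpus, and under these assumptions `target` is the parameter that trades time against precision.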

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.