Method and apparatus for almost-constant-time clustering of arbitrary corpus subsets
US6038557A · kind A · utility
Assignee
Inventor
Key dates
| Filing date | Jan 26, 1998 |
| Grant date | Mar 14, 2000 |
| Priority date | — |
| Expiry date | Jan 26, 2018 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99935
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and apparatus for almost-constant-time re-clustering of corpus subsets, with a customizable time/precision tradeoff, is usable in a basic browsing method such as Scatter/Gather to successfully partition a large document collection into clusters of related documents. The user is first presented with a clustering of the entire corpus into metadocuments, from which the worst metadocument is selected and replaced with its "children". Children containing no documents of interest are pruned, and the remaining metadocuments are further expanded until a predetermined number of children metadocuments is obtained. The resulting metadocuments are then reclustered. The process is repeated until the user obtains the desired degree of specificity.
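The expand-and-prune loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Meta` class, the `expand` function, and the toy hierarchy are hypothetical names introduced here, and the abstract does not say how the "worst" metadocument is chosen, so the sketch assumes it is the one covering the most documents. The final reclustering step is left out.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Meta:
    """Hypothetical metadocument: a node in a precomputed cluster hierarchy."""
    name: str
    docs: frozenset                      # ids of the documents under this node
    children: list = field(default_factory=list)

def expand(selected, interesting, target, badness):
    """Replace the worst metadocument with its children until `target` remain.

    `badness` is an assumed scoring function; the abstract does not define it.
    """
    frontier = list(selected)
    while len(frontier) < target:
        expandable = [m for m in frontier if m.children]
        if not expandable:               # only leaves left, nothing to expand
            break
        worst = max(expandable, key=badness)
        frontier.remove(worst)
        # Prune children that contain no documents of interest.
        frontier.extend(c for c in worst.children if c.docs & interesting)
    return frontier

# Toy hierarchy (made up for illustration): two top-level metadocuments.
a1 = Meta("A1", frozenset({1, 2}))
a2 = Meta("A2", frozenset({3, 4}))
b1 = Meta("B1", frozenset({5}))
b2 = Meta("B2", frozenset({6}))
a = Meta("A", frozenset({1, 2, 3, 4}), [a1, a2])
b = Meta("B", frozenset({5, 6}), [b1, b2])

result = expand([a, b], interesting={1, 2, 5}, target=3,
                badness=lambda m: len(m.docs))
names = [m.name for m in result]
```

In this toy run the loop stops early because only leaves remain after pruning, so fewer than `target` metadocuments come back. Because the work is bounded by `target` rather than by the size of the subset, a loop of this shape fits the "almost-constant-time" framing, with `target` acting as the time/precision knob. In the method described, the resulting metadocuments would then be passed to a reclustering step.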
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.