Patent · US · Expired

Taxonomy generation for document collections

US6446061B1 · kind B1 · utility

Cited by: 378
References: 1
Claims: 31
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 30, 1999
Grant date: Sep 3, 2002
Priority date
Expiry date: Jun 30, 2019

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99933
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The invention relates to information mining across large collections of documents stored on computer systems, and more particularly to a computerized method of generating a content taxonomy for a collection of electronic documents. The proposed technique improves the scalability of taxonomy generation together with the coherence and selectivity of the result. The approach comprises three steps. In a subset selection step, a subset of the documents is selected. In a taxonomy generation step, a taxonomy is generated for the selected subset, in the form of a tree-structured hierarchy. In a routing selection step, each unprocessed document is assigned to the taxonomy hierarchy on the basis of largest similarity.
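
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify how the subset is chosen, how the tree is built, or which similarity measure is used. The sketch assumes uniform random sampling for subset selection, greedy agglomerative merging for taxonomy generation, and cosine similarity over bag-of-words counts for routing. All names (`Node`, `select_subset`, `build_taxonomy`, `route`, `generate_taxonomy`, `all_docs`) are hypothetical.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass, field


def vectorize(text):
    """Bag-of-words term counts (an assumed document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Node:
    centroid: Counter                              # summed term counts of the subtree
    docs: list = field(default_factory=list)       # documents held at this node
    children: list = field(default_factory=list)   # empty for leaf nodes


def select_subset(n_docs, size, seed=0):
    """Subset selection step: indices of a uniform random sample (assumption)."""
    return set(random.Random(seed).sample(range(n_docs), min(size, n_docs)))


def build_taxonomy(docs):
    """Taxonomy generation step: repeatedly merge the two most similar nodes
    until a single root remains, giving a binary tree. Needs at least one doc."""
    nodes = [Node(vectorize(d), [d]) for d in docs]
    while len(nodes) > 1:
        pairs = ((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes)))
        i, j = max(pairs, key=lambda p: cosine(nodes[p[0]].centroid, nodes[p[1]].centroid))
        a, b = nodes[i], nodes[j]
        merged = Node(a.centroid + b.centroid, [], [a, b])
        nodes = [n for pos, n in enumerate(nodes) if pos not in (i, j)] + [merged]
    return nodes[0]


def route(root, doc):
    """Routing step: descend from the root, always following the child with
    the largest similarity, and attach the document to the leaf reached."""
    vec = vectorize(doc)
    node = root
    while node.children:
        node = max(node.children, key=lambda c: cosine(vec, c.centroid))
    node.docs.append(doc)
    return node


def generate_taxonomy(docs, subset_size, seed=0):
    """Run all three steps: sample, build the tree, route the remaining docs."""
    chosen = select_subset(len(docs), subset_size, seed)
    root = build_taxonomy([docs[i] for i in sorted(chosen)])
    for i, doc in enumerate(docs):
        if i not in chosen:
            route(root, doc)
    return root


def all_docs(node):
    """Collect every document stored anywhere in the subtree."""
    return node.docs + [d for c in node.children for d in all_docs(c)]
```

In this sketch the expensive pairwise clustering runs only over the sampled subset, while each remaining document costs one root-to-leaf descent, which is one way to read the scalability claim in the abstract.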

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.