Taxonomy generation for document collections
US6446061B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 30, 1999 |
| Grant date | Sep 3, 2002 |
| Priority date | — |
| Expiry date | Jun 30, 2019 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99933
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
This mechanism relates to a method within the area of information mining within a multitude of documents stored on computer systems. More particularly, this mechanism relates to a computerized method of generating a content taxonomy of a multitude of electronic documents. The technique proposed by the current invention is able to improve at the same time the scalability and the coherence and selectivity of taxonomy generation. The fundamental approach of the current invention comprises a subset selection step, wherein a subset of a multitude of documents is being selected. In a taxonomy generation step a taxonomy is generated for that selected subset of documents, the taxonomy being a tree structured taxonomy hierarchy. Moreover this method comprises a routing selection step assigning each unprocessed document to the taxonomy hierarchy based on largest similarity.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.