Automatic evaluation of categorization system quality
US7296020B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 5, 2003 |
| Grant date | Nov 13, 2007 |
| Priority date | — |
| Expiry date | Sep 6, 2024 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99945
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computerized method and system of document analysis. The method and system categorize documents according to a taxonomy. Training documents are rated on a lower level by associating one of the following predicates with each training document: correct, inbound, outbound, or unassigned. Categories are rated on a lower level by determining precision/recall values for each category, and higher-level category rating attributes are generated from the lower-level rating steps by associating one or more of the following with the categories: aa) weak category, bb) existing source/sink relationship between categories, cc) close categories. An overall quality measure for the training base is derived from the lower-level and higher-level rating steps. The lower-level and higher-level evaluation results are stored. The quality measure is used to determine action proposals to improve the training base as one or more of: aa) modifying the number of categories by adding a new category or deleting an existing category, or bb) splitting a category into one or more new categories, or cc) merging a category with another one, or dd) modifying the number of trai…
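The lower-level rating step in the abstract can be illustrated with a minimal sketch. The abstract names the four predicates but does not define them, so the definitions here are assumptions: a document is "correct" when the categorizer returns its labeled category, "outbound" for its labeled category and "inbound" for the returned category when the two differ, and "unassigned" when no category is returned. The function names, the precision/recall formulas built on those counts, and the 0.5 threshold for a "weak category" are likewise hypothetical and not taken from the patent.

```python
from collections import Counter


def rate_documents(labeled, predicted):
    """Count predicates per category (assumed definitions, see lead-in)."""
    counts = {}
    for doc_id, true_cat in labeled.items():
        pred_cat = predicted.get(doc_id)  # None: categorizer returned no category
        per_cat = counts.setdefault(true_cat, Counter())
        if pred_cat is None:
            per_cat["unassigned"] += 1
        elif pred_cat == true_cat:
            per_cat["correct"] += 1
        else:
            # The document leaves its labeled category and enters another one.
            per_cat["outbound"] += 1
            counts.setdefault(pred_cat, Counter())["inbound"] += 1
    return counts


def precision_recall(c):
    """Precision and recall of one category from its predicate counts."""
    retrieved = c["correct"] + c["inbound"]
    relevant = c["correct"] + c["outbound"] + c["unassigned"]
    precision = c["correct"] / retrieved if retrieved else 0.0
    recall = c["correct"] / relevant if relevant else 0.0
    return precision, recall


def weak_categories(counts, threshold=0.5):
    """Categories whose precision or recall is below a hypothetical threshold."""
    return sorted(
        cat for cat, c in counts.items() if min(precision_recall(c)) < threshold
    )


# Toy training base: labeled category vs. categorizer output per document.
labeled = {"d1": "sports", "d2": "sports", "d3": "politics",
           "d4": "politics", "d5": "finance"}
predicted = {"d1": "sports", "d2": "politics", "d3": "politics",
             "d4": "politics", "d5": None}

counts = rate_documents(labeled, predicted)
for category in sorted(counts):
    print(category, precision_recall(counts[category]))
print("weak:", weak_categories(counts))
```

In this toy data, "finance" is flagged as weak because its only training document is left unassigned. An inbound/outbound pair between two categories (here "sports" to "politics") is the kind of signal a higher-level step could read as a source/sink relationship, though how the patent derives that attribute is not stated in the abstract.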
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.