Methods and systems for classifying data using a hierarchical taxonomy
US9367814B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 22, 2012 |
| Grant date | Jun 14, 2016 |
| Priority date | — |
| Expiry date | Jun 14, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/022
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and system for classifying documents is provided. A set of document classifiers is generated by applying a classification algorithm to a trusted corpus that includes a set of training documents representing a taxonomy. One or more of the generated document classifiers are executed against a plurality of input documents to create a plurality of classified documents. Each classified document is associated with a classification within the taxonomy and a classification confidence level. One or more classified documents that are associated with a classification confidence level below a predetermined threshold value are selected to create a set of low-confidence documents. The low-confidence documents are disassociated from each of the associated classifications. A user is prompted to enter a classification within the taxonomy for at least one low-confidence document. The low-confidence document is associated with the entered classification and with a predetermined confidence level to create a newly classified document.
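The workflow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the word-overlap scorer stands in for the unspecified "classification algorithm", and every name here (`train_classifiers`, `classify`, `review_low_confidence`, `ask_user`, `manual_confidence`), the taxonomy paths, and the threshold value are assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ClassifiedDocument:
    text: str
    classification: Optional[str]  # a node in the taxonomy, or None
    confidence: float


def train_classifiers(trusted_corpus: Dict[str, List[str]]) -> Dict[str, Counter]:
    """Build one toy classifier (a word-count vocabulary) per taxonomy node."""
    return {
        node: Counter(w for doc in docs for w in doc.lower().split())
        for node, docs in trusted_corpus.items()
    }


def classify(text: str, classifiers: Dict[str, Counter]) -> ClassifiedDocument:
    """Pick the node whose vocabulary covers the most words of the text.

    Confidence is the fraction of the text's words found in that vocabulary
    (an assumed stand-in for a real classifier's confidence score).
    """
    words = text.lower().split()
    scores = {
        node: sum(1 for w in words if w in vocab)
        for node, vocab in classifiers.items()
    }
    best = max(scores, key=scores.get)
    confidence = scores[best] / len(words) if words else 0.0
    return ClassifiedDocument(text, best, confidence)


def review_low_confidence(
    docs: List[ClassifiedDocument],
    taxonomy: List[str],
    threshold: float,
    ask_user: Callable[[str], str],
    manual_confidence: float = 1.0,
) -> List[ClassifiedDocument]:
    """Disassociate low-confidence documents and reclassify them via the user."""
    for doc in docs:
        if doc.confidence < threshold:
            doc.classification = None  # drop the machine-assigned label
            entered = ask_user(doc.text)
            if entered not in taxonomy:
                raise ValueError(f"{entered!r} is not in the taxonomy")
            doc.classification = entered
            doc.confidence = manual_confidence  # predetermined level
    return docs


# Hypothetical trusted corpus: taxonomy node -> training documents.
trusted_corpus = {
    "sports/soccer": ["goal keeper penalty kick"],
    "finance/banking": ["loan interest deposit account"],
    "science/meteorology": ["storm rainfall temperature"],
}
classifiers = train_classifiers(trusted_corpus)
documents = [
    classify(t, classifiers)
    for t in ["penalty kick goal", "weather forecast loan"]
]
# A fixed answer stands in for an interactive prompt.
reviewed = review_low_confidence(
    documents,
    taxonomy=list(trusted_corpus),
    threshold=0.5,
    ask_user=lambda text: "science/meteorology",
)
```

In this sketch the first document keeps its machine-assigned label, while the second falls below the threshold, loses its label, and receives the user-entered classification together with the predetermined confidence level.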
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.