Corpus management by automatic categorization into functional domains to support faceted querying
US10346442B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 17, 2016 |
| Grant date | Jul 9, 2019 |
| Priority date | — |
| Expiry date | Apr 11, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/248
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Embodiments can provide a computer implemented method, in a data processing system comprising a processor and a memory comprising instructions which are executed by the processor to cause the processor to implement an enhanced corpus management system, the method comprising: identifying one or more functional domain categories; ingesting one or more incoming documents to form an open-domain corpus; for each functional domain category, identifying one or more representative documents to establish a seed sub-corpus; calculating a degree of fit score between each of the one or more incoming documents and the one or more established functional domain category seed sub-corpora; and assigning one or more of the incoming documents to one or more of the functional domain categories based upon the degree of fit score to create an enhanced corpus.
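The steps named in the abstract (seed sub-corpora per functional domain category, a degree of fit score, and score-based assignment) can be illustrated with a minimal Python sketch. The abstract does not say how the degree of fit score is computed, so the bag-of-words cosine similarity, the `threshold` value, and every function name here (`tokenize`, `term_vector`, `degree_of_fit`, `categorize`, `facet_query`) are assumptions made for illustration, not the patented method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_vector(texts):
    """Build one term-frequency vector from a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def degree_of_fit(doc_vector, seed_vector):
    """Assumed scoring: cosine similarity between two term-frequency vectors."""
    shared = doc_vector.keys() & seed_vector.keys()
    dot = sum(doc_vector[t] * seed_vector[t] for t in shared)
    doc_norm = math.sqrt(sum(v * v for v in doc_vector.values()))
    seed_norm = math.sqrt(sum(v * v for v in seed_vector.values()))
    if doc_norm == 0 or seed_norm == 0:
        return 0.0
    return dot / (doc_norm * seed_norm)


def categorize(incoming_documents, seed_sub_corpora, threshold=0.2):
    """Score each incoming document against each seed sub-corpus and
    assign it to every category whose score reaches the threshold.

    `threshold` is a hypothetical cutoff chosen for this sketch.
    """
    seed_vectors = {
        category: term_vector(docs)
        for category, docs in seed_sub_corpora.items()
    }
    enhanced_corpus = []
    for doc in incoming_documents:
        doc_vector = term_vector([doc])
        scores = {
            category: degree_of_fit(doc_vector, seed_vector)
            for category, seed_vector in seed_vectors.items()
        }
        categories = sorted(c for c, s in scores.items() if s >= threshold)
        enhanced_corpus.append(
            {"text": doc, "scores": scores, "categories": categories}
        )
    return enhanced_corpus


def facet_query(enhanced_corpus, category):
    """Return the texts assigned to one functional domain category."""
    return [d["text"] for d in enhanced_corpus if category in d["categories"]]
```

Under these assumptions, a document can land in zero, one, or several categories, which matches the abstract's wording of assigning documents to "one or more" functional domain categories. Documents that fit no category stay in the open-domain corpus without a facet.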
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.