Domain-specific unstructured text retrieval
US10318564B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 28, 2015 |
| Grant date | Jun 11, 2019 |
| Priority date | — |
| Expiry date | Mar 25, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Retrieving unstructured text related to a specified domain from the Internet is described. Training data is accessed; the training data comprises unstructured text related to the specified domain. A first classifier is trained using features of the training data. The first classifier is used to classify unstructured text having a plurality of features, to obtain unstructured text examples related to the domain. The unstructured text examples are used to retrieve from the Internet similar examples which do not have at least some of the plurality of features. Optionally, a second classifier is trained using the similar examples. Additional unstructured text is retrieved from the Internet and the second classifier is used to label the additional unstructured text for domain relevance.
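The two-stage flow in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the token-set features, the marker-token "first classifier", the Jaccard similarity threshold, the vocabulary-overlap "second classifier", the use of background texts as negatives, the in-memory `pool` standing in for Internet retrieval, and all function names and sample texts.

```python
import re
from collections import Counter

def tokens(text):
    """Set of lowercase words with 3+ letters (a deliberately crude feature extractor)."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))

def train_first_classifier(domain_texts, background_texts, min_count=2):
    """Assumed 'first classifier': marker tokens found in at least min_count
    domain training texts and never in the background texts."""
    counts = Counter(t for text in domain_texts for t in tokens(text))
    background = set().union(*(tokens(t) for t in background_texts))
    return {t for t, n in counts.items() if n >= min_count and t not in background}

def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def retrieve_similar(seeds, pool, markers, threshold=0.4):
    """Pool texts that resemble some seed yet lack at least one marker feature."""
    seed_tokens = [tokens(s) for s in seeds]
    similar = []
    for text in pool:
        toks = tokens(text)
        lacks_marker = bool(markers - toks)
        if lacks_marker and any(jaccard(toks, s) >= threshold for s in seed_tokens):
            similar.append(text)
    return similar

def train_second_classifier(similar_texts, background_texts):
    """Assumed 'second classifier': vocabulary of the similar examples,
    minus tokens that also occur in the background texts."""
    vocab = set().union(*(tokens(t) for t in similar_texts))
    background = set().union(*(tokens(t) for t in background_texts))
    return vocab - background

def is_relevant(text, vocab, min_hits=2):
    """Label a text as domain-relevant if it shares min_hits tokens with vocab."""
    return len(tokens(text) & vocab) >= min_hits

# Hypothetical training data for an illustrative "illness" domain.
domain_train = ["flu fever and cough all week", "bad flu with fever and chills"]
background = ["traffic was bad all week", "great coffee and cake"]

# Hypothetical stand-in for text fetched from the Internet.
pool = [
    "flu and fever, aching joints and sore throat",
    "aching joints and a sore throat today",
    "sore throat and aching all over",
    "new phone has a great camera",
]

markers = train_first_classifier(domain_train, background)
seeds = [t for t in pool if tokens(t) & markers]       # first classifier's positives
similar = retrieve_similar(seeds, pool, markers)       # examples lacking the markers
vocab = train_second_classifier(similar, background)   # optional second stage

print(sorted(markers))
print(similar)
print(is_relevant("woke up with a sore throat", vocab))
print(is_relevant("the match was great", vocab))
```

In this toy run the first stage only recognizes texts that mention the marker tokens, while the second stage, trained on the retrieved similar examples, can flag a text that contains none of them. A real system would presumably use learned classifiers and an actual search or crawl step in place of these set operations.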
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.