Semi-supervised data integration model for named entity classification
US9292797B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 14, 2012 |
| Grant date | Mar 22, 2016 |
| Priority date | — |
| Expiry date | Apr 29, 2034 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/02
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
According to one embodiment, a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data is provided. Training data are compared to named entity candidates taken from the first repository to form a positive training seed set. A decision tree is populated and classification rules are created for classifying the named entity candidates. A number of entities are sampled from the named entity candidates. The sampled entities are labeled as positive examples and/or negative examples. The positive training seed set is updated to include identified commonality between the positive examples and the auxiliary repository. A negative training seed set is updated to include negative examples which lack commonality with the auxiliary repository. In view of both the updated positive and negative training seed sets, the decision tree and the classification rules are updated.
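The seed-set steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and it covers only one round of seed building and updating; fitting and refreshing the decision tree and classification rules is left out. Everything named here is an assumption made for illustration: the function names, the sample data, the `label_fn` callback that stands in for whatever labels the sampled entities, and the reading of "commonality" as sharing at least one token with an auxiliary entry.

```python
import random

def tokens(text):
    """Lowercased whitespace tokens of a string."""
    return set(text.lower().split())

def initial_positive_seed(training_data, candidates):
    """Candidates that also appear in the training data (case-insensitive)."""
    known = {t.lower() for t in training_data}
    return {c for c in candidates if c.lower() in known}

def has_commonality(entity, auxiliary):
    """Assumed test: the entity shares a token with some auxiliary entry."""
    ent = tokens(entity)
    return any(ent & tokens(entry) for entry in auxiliary)

def update_seeds(candidates, auxiliary, positive, negative, label_fn, k, rng):
    """One round: sample up to k unlabeled candidates, label them,
    and grow the positive and negative seed sets in place."""
    pool = sorted(set(candidates) - positive - negative)
    for entity in rng.sample(pool, min(k, len(pool))):
        is_positive = label_fn(entity)  # hypothetical labeling step
        common = has_commonality(entity, auxiliary)
        if is_positive and common:
            positive.add(entity)        # positive example backed by auxiliary data
        elif not is_positive and not common:
            negative.add(entity)        # negative example with no auxiliary overlap
    return positive, negative

# Hypothetical data: company-name candidates and an auxiliary list of known firms.
candidates = ["Acme Corp", "Globex Inc", "blue sky", "Initech LLC", "lunch menu"]
training_data = ["acme corp"]
auxiliary = ["Globex Holdings", "Initech"]
label_fn = lambda e: e.split()[-1] in {"Corp", "Inc", "LLC"}

positive = initial_positive_seed(training_data, candidates)
negative = set()
update_seeds(candidates, auxiliary, positive, negative, label_fn,
             k=4, rng=random.Random(0))
```

In a fuller pipeline, the updated `positive` and `negative` sets would then be used to refit the decision tree and regenerate the classification rules, and the round would repeat.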
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.