Patent · US Active

Semi-supervised data integration model for named entity classification

US9292797B2 · kind B2 · utility

Cited by: 13
References: 10
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 14, 2012
Grant date: Mar 22, 2016
Priority date
Expiry date: Apr 29, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N5/02
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

According to one embodiment, a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data is provided. Training data are compared to named entity candidates taken from the first repository to form a positive training seed set. A decision tree is populated and classification rules are created for classifying the named entity candidates. A number of entities are sampled from the named entity candidates. The sampled entities are labeled as positive examples and/or negative examples. The positive training seed set is updated to include identified commonality between the positive examples and the auxiliary repository. A negative training seed set is updated to include negative examples which lack commonality with the auxiliary repository. In view of both the updated positive and negative training seed sets, the decision tree and the classification rules are updated.
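
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the surface features, the one-level decision tree (a stump standing in for the full decision tree and its classification rules), the `oracle` labeling callback, and all function and parameter names are assumptions introduced for this example.

```python
import random

# Hypothetical binary surface features; the abstract does not specify any.
FEATURES = ("capitalized", "has_digit", "has_space")

def features(name):
    """Compute the assumed binary features for one candidate string."""
    return {
        "capitalized": name[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in name),
        "has_space": " " in name,
    }

def train_stump(positives, negatives):
    """Pick the (feature, value) test that classifies the most seeds correctly.

    A one-level tree used here in place of a full decision tree. The returned
    rule means: classify as positive when features(name)[feature] == value.
    """
    best_rule, best_correct = None, -1
    for feat in FEATURES:
        for value in (True, False):
            correct = sum(features(p)[feat] == value for p in positives)
            correct += sum(features(n)[feat] != value for n in negatives)
            if correct > best_correct:  # ties keep the earlier rule
                best_rule, best_correct = (feat, value), correct
    return best_rule

def classify(name, rule):
    """Apply a learned rule to a single candidate."""
    feat, value = rule
    return features(name)[feat] == value

def integrate(candidates, training_data, auxiliary, oracle,
              rounds=3, sample_size=3, seed=0):
    """Seed, sample, label, update both seed sets, and retrain each round.

    candidates    -- named entity candidates from the first repository
    training_data -- lowercase names known to be positive
    auxiliary     -- lowercase names in the auxiliary repository
    oracle        -- assumed labeling callback: name -> True (positive) / False
    """
    rng = random.Random(seed)
    # Positive seed set: candidates that match the training data.
    positives = {c for c in candidates if c.lower() in training_data}
    negatives = set()
    rule = train_stump(positives, negatives)
    for _ in range(rounds):
        pool = sorted(candidates)
        for name in rng.sample(pool, min(sample_size, len(pool))):
            is_positive = oracle(name)
            in_common = name.lower() in auxiliary
            if is_positive and in_common:
                positives.add(name)   # positive example shared with auxiliary
            elif not is_positive and not in_common:
                negatives.add(name)   # negative example absent from auxiliary
        rule = train_stump(positives, negatives)  # update tree and rules
    return rule, positives, negatives
```

Calling `integrate(candidates, training_data, auxiliary, oracle)` returns the final rule together with both seed sets, and `classify(name, rule)` then applies that rule to candidates that appear in neither repository. A real system would presumably use richer features and a multi-level tree; the stump is used only to keep the sketch short and dependency-free.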

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.