Patent · US Active

System, method, and computer program for obtaining a unified named entity recognition model with the collective predictive capabilities of teacher models with different tag sets using marginal distillation

US11487944B1 · kind B1 · utility

2Cited by
0References
24Claims
0Family size

Assignee

Inventors

Key dates

Filing dateSep 24, 2020
Grant dateNov 1, 2022
Priority date
Expiry dateJun 23, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N20/00
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

The present disclosure sets forth a marginal distillation approach to obtaining a unified name-entity recognition (NER) student model from a plurality of pre-trained teacher NER models with different tag sets. Knowledge from the teacher models is distilled into a student model without requiring access to the annotated training data used to train the teacher models. In particular, the system receives a tag hierarchy that combines the different teacher tag sets. The teacher models and the student model are applied to a set of input data sequence to obtain tag predictions for each of the models. A distillation loss is computed between the student and each of the teacher models. If teacher's predictions are less fine-grained than the student's with respect to a node in the tag hierarchy, the student's more fine-grained predictions for the node are marginalized in computing the distillation loss. The overall loss is minimized, resulting in the student model acquiring the collective predictive capabilities of the teacher models.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.