System, method, and computer program for obtaining a unified named entity recognition model with the collective predictive capabilities of teacher models with different tag sets using marginal distillation
US11487944B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 24, 2020 |
| Grant date | Nov 1, 2022 |
| Priority date | — |
| Expiry date | Jun 23, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present disclosure sets forth a marginal distillation approach to obtaining a unified named entity recognition (NER) student model from a plurality of pre-trained teacher NER models with different tag sets. Knowledge from the teacher models is distilled into a student model without requiring access to the annotated training data used to train the teacher models. In particular, the system receives a tag hierarchy that combines the different teacher tag sets. The teacher models and the student model are applied to a set of input data sequences to obtain tag predictions for each of the models. A distillation loss is computed between the student and each of the teacher models. If a teacher's predictions are less fine-grained than the student's with respect to a node in the tag hierarchy, the student's more fine-grained predictions for that node are marginalized in computing the distillation loss. The overall loss is minimized, resulting in the student model acquiring the collective predictive capabilities of the teacher models.
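The marginalization step described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: it handles a single token rather than a full sequence, it assumes cross-entropy as the distillation loss, and the tag names (`PER`, `LOC`, `CITY`, `COUNTRY`), the `descendants` mapping, and all function names are hypothetical choices made for illustration.

```python
import math

def marginalize(student_probs, teacher_tags, descendants):
    """Sum the student's fine-grained probabilities under each teacher tag.

    `descendants` (hypothetical structure) maps a teacher tag to the student
    tags that sit beneath it in the tag hierarchy.
    """
    return {t: sum(student_probs[s] for s in descendants[t]) for t in teacher_tags}

def distillation_loss(teacher_probs, student_probs, descendants, eps=1e-12):
    """Cross-entropy between one teacher and the marginalized student (assumed loss form)."""
    marginal = marginalize(student_probs, teacher_probs.keys(), descendants)
    return -sum(p * math.log(marginal[t] + eps) for t, p in teacher_probs.items())

def overall_loss(teachers, student_probs):
    """Sum the per-teacher losses; `teachers` is a list of (teacher_probs, descendants)."""
    return sum(distillation_loss(tp, student_probs, d) for tp, d in teachers)

# Student predicts over the fine-grained tag set for one token.
student = {"O": 0.1, "PER": 0.2, "CITY": 0.3, "COUNTRY": 0.4}

# Teacher A only knows the coarse tag LOC, which covers CITY and COUNTRY.
teacher_a = {"O": 0.0, "PER": 0.0, "LOC": 1.0}
hierarchy_a = {"O": ["O"], "PER": ["PER"], "LOC": ["CITY", "COUNTRY"]}

# Teacher B shares the student's granularity, so its mapping is the identity.
teacher_b = {"O": 0.0, "PER": 0.0, "CITY": 0.0, "COUNTRY": 1.0}
hierarchy_b = {t: [t] for t in teacher_b}

loss = overall_loss([(teacher_a, hierarchy_a), (teacher_b, hierarchy_b)], student)
print(f"overall loss: {loss:.4f}")
```

Under these assumptions, Teacher A's loss depends only on the combined probability the student assigns to `CITY` and `COUNTRY`, so the student is free to learn the finer distinction from Teacher B. In training, `overall_loss` would be minimized with respect to the student's parameters.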
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.