Systems and methods for distilled BERT-based training model for text classification
US11922303B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 18, 2020 |
| Grant date | Mar 5, 2024 |
| Priority date | — |
| Expiry date | Nov 21, 2041 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N3/048
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Embodiments described herein provides a training mechanism that transfers the knowledge from a trained BERT model into a much smaller model to approximate the behavior of BERT. Specifically, the BERT model may be treated as a teacher model, and a much smaller student model may be trained using the same inputs to the teacher model and the output from the teacher model. In this way, the student model can be trained within a much shorter time than the BERT teacher model, but with comparable performance with BERT.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.