Patent · US Active

Systems and methods for distilled BERT-based training model for text classification

US11922303B2 · kind B2 · utility

0Cited by
17References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMay 18, 2020
Grant dateMar 5, 2024
Priority date
Expiry dateNov 21, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/048
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Embodiments described herein provides a training mechanism that transfers the knowledge from a trained BERT model into a much smaller model to approximate the behavior of BERT. Specifically, the BERT model may be treated as a teacher model, and a much smaller student model may be trained using the same inputs to the teacher model and the output from the teacher model. In this way, the student model can be trained within a much shorter time than the BERT teacher model, but with comparable performance with BERT.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.