Patent · US Active

Systems and methods for distilled BERT-based training model for text classification

US11922303B2 · kind B2 · utility

Cited by	0
References	17
Claims	20
Family size	0

Assignee

Salesforce, Inc. · US

Inventors

Wenhao Liu · Dongguan, CN
Ka Chun Au · San Francisco, US
Shashank Harinath · San Francisco, US
Bryan McCann · Stanford, US
Govardana Sachithanandam Ramachandran · Palo Alto, US
Alexis Roos · Palo Alto, US
Caiming Xiong · Menlo Park, US

Key dates

Filing date	May 18, 2020
Grant date	Mar 5, 2024
Priority date	—
Expiry date	Nov 21, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/048
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Embodiments described herein provide a training mechanism that transfers the knowledge from a trained BERT model into a much smaller model that approximates the behavior of BERT. Specifically, the BERT model may be treated as a teacher model, and a much smaller student model may be trained on the same inputs given to the teacher model, using the teacher model's outputs as training targets. In this way, the student model can be trained in much less time than the BERT teacher model while achieving performance comparable to BERT.
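
The teacher-student setup described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the "teacher" below is a hypothetical fixed two-class scorer standing in for a trained BERT classifier, the student is a small softmax classifier, and the loss (cross-entropy against the teacher's output probabilities) is one common distillation objective assumed here for illustration. The names `teacher_probs`, `student_probs`, `distill_loss`, and `train_student` are invented for this example. For simplicity the student here is not actually smaller than the stand-in teacher; the point is only that it learns from the teacher's outputs on the same inputs, with no ground-truth labels.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_probs(x):
    """Hypothetical stand-in for a trained BERT classifier (assumption)."""
    logits = [2.0 * x[0] - 1.0 * x[1], -2.0 * x[0] + 1.0 * x[1]]
    return softmax(logits)


def student_probs(w, x):
    """Small student: one weight vector per class, no bias."""
    logits = [w[c][0] * x[0] + w[c][1] * x[1] for c in range(2)]
    return softmax(logits)


def distill_loss(w, data):
    """Mean cross-entropy between teacher outputs and student outputs."""
    total = 0.0
    for x in data:
        t = teacher_probs(x)
        s = student_probs(w, x)
        total -= sum(tc * math.log(sc) for tc, sc in zip(t, s))
    return total / len(data)


def train_student(data, steps=200, lr=0.5):
    """Gradient descent on the distillation loss; no true labels are used."""
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        grad = [[0.0, 0.0], [0.0, 0.0]]
        for x in data:
            t = teacher_probs(x)  # teacher output on the same input
            s = student_probs(w, x)
            for c in range(2):
                for j in range(2):
                    # d(loss)/d(logit_c) = s_c - t_c for soft-target cross-entropy
                    grad[c][j] += (s[c] - t[c]) * x[j]
        for c in range(2):
            for j in range(2):
                w[c][j] -= lr * grad[c][j] / len(data)
    return w


random.seed(0)
inputs = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(50)]

initial_loss = distill_loss([[0.0, 0.0], [0.0, 0.0]], inputs)
weights = train_student(inputs)
final_loss = distill_loss(weights, inputs)

# Fraction of inputs where student and teacher pick the same class.
agreement = sum(
    teacher_probs(x).index(max(teacher_probs(x)))
    == student_probs(weights, x).index(max(student_probs(weights, x)))
    for x in inputs
) / len(inputs)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
print(f"student/teacher agreement: {agreement:.2f}")
```

In a realistic setting the teacher call would be a forward pass through a fine-tuned BERT model and the student would be a much smaller network, but the training signal has the same shape: the student is fit to the teacher's outputs rather than to labeled data.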

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.