Patent · US Active

Semi-supervised training scheme for speech recognition

US12315499B2 · kind B2 · utility

Cited by	0
References	0
Claims	24
Family size	0

Assignee

Google LLC · US

Inventors

Soheil Khorram · Richardson, US
Anshuman Tripathi · Singapore, SG
Kim Jaeyoung · Cupertino, US
Han Lu · Santa Clara, US
Qian Zhang · Cypress, US
Hasim Sak · New York, US

Key dates

Filing date	Dec 14, 2022
Grant date	May 27, 2025
Priority date	—
Expiry date	Dec 14, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/02
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes receiving a sequence of acoustic frames extracted from unlabeled audio samples that correspond to spoken utterances not paired with any corresponding transcriptions. The method also includes generating, using a supervised audio encoder, a target higher order feature representation for a corresponding acoustic frame. The method also includes augmenting the sequence of acoustic frames and generating, as output from an unsupervised audio encoder, a predicted higher order feature representation for a corresponding augmented acoustic frame in the sequence of augmented acoustic frames. The method also includes determining an unsupervised loss term based on the target higher order feature representation and the predicted higher order feature representation and updating parameters of the speech recognition model based on the unsupervised loss term.
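
The steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The abstract does not specify the encoder architectures, the augmentation, or the form of the loss, so everything concrete here is an assumption: both encoders are toy linear maps, the augmentation is a single zeroed span of frames, the unsupervised loss is mean squared error, and only the unsupervised encoder weights are updated. The names `augment`, `encode`, `unsupervised_loss`, and `train_step` are hypothetical.

```python
import numpy as np

def augment(frames, rng, mask_width=2):
    """Zero out one random span of consecutive frames (assumed augmentation)."""
    out = frames.copy()
    start = rng.integers(0, frames.shape[0] - mask_width + 1)
    out[start:start + mask_width] = 0.0
    return out

def encode(frames, weights):
    """Toy linear encoder: one feature vector per acoustic frame (assumption)."""
    return frames @ weights

def unsupervised_loss(target, predicted):
    """Mean squared error between target and predicted features (assumed loss)."""
    return float(np.mean((predicted - target) ** 2))

def train_step(frames, w_sup, w_unsup, rng, lr=0.2):
    # Target features come from the supervised encoder on the clean frames
    # and are treated as constants (no gradient flows into w_sup).
    target = encode(frames, w_sup)
    # Predicted features come from the unsupervised encoder on augmented frames.
    augmented = augment(frames, rng)
    predicted = encode(augmented, w_unsup)
    loss = unsupervised_loss(target, predicted)
    # Analytic gradient of the mean squared error with respect to w_unsup.
    grad = 2.0 * augmented.T @ (predicted - target) / predicted.size
    return w_unsup - lr * grad, loss

rng = np.random.default_rng(0)
frames = rng.normal(size=(20, 8))   # 20 unlabeled acoustic frames, 8 dims each
w_sup = rng.normal(size=(8, 4))     # stand-in for a trained supervised encoder
w_unsup = rng.normal(size=(8, 4))   # unsupervised encoder being trained

losses = []
for _ in range(200):
    w_unsup, loss = train_step(frames, w_sup, w_unsup, rng)
    losses.append(loss)
```

In this sketch the loss does not reach zero, because the zeroed frames always produce a zero prediction while their targets are nonzero. A real system would use nonlinear encoders with temporal context, so masked frames could be predicted from their neighbors.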

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.