Patent · US Active

Supervised and unsupervised training with contrastive loss over sequences

US12230249B2 · kind B2 · utility

Cited by	0
References	0
Claims	18
Family size	0

Assignee

Google LLC · US

Inventors

Andrew Rosenberg · Brooklyn, US
Bhuvana Ramabhadran · Campion Road, US
Zhehuai Chen · Edgewater, US
Yuan-Fang Wang · Brooklyn, US
Yu Zhang · Mountain View, US
Jesse Emond · Mountain View, US

Key dates

Filing date	Mar 22, 2022
Grant date	Feb 18, 2025
Priority date	—
Expiry date	Jul 11, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/0635
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes receiving audio data corresponding to an utterance and generating a pair of positive audio data examples. Here, each positive audio data example includes a respective augmented copy of the received audio data. For each respective positive audio data example, the method includes generating a respective sequence of encoder outputs and projecting the respective sequence of encoder outputs for the positive audio data example into a contrastive loss space. The method also includes determining an L2 distance between each corresponding encoder output in the projected sequences of encoder outputs for the positive audio data examples and determining a per-utterance consistency loss by averaging the L2 distances. The method also includes generating corresponding speech recognition results for each respective positive audio data example. The method also includes updating parameters of the speech recognition model based on a respective supervised loss term and the per-utterance consistency loss.
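
The consistency-loss step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the noise-based `augment`, the frame-averaging `encode`, the fixed linear `project`, and the weighted sum in `total_loss` are all hypothetical stand-ins chosen only to keep the example self-contained. The one part taken directly from the abstract is `per_utterance_consistency_loss`, which averages the L2 distances between corresponding projected encoder outputs of the two positive examples.

```python
import math
import random


def augment(audio, noise_scale, rng):
    """Hypothetical augmentation: add small Gaussian noise to every sample."""
    return [x + rng.gauss(0.0, noise_scale) for x in audio]


def encode(audio, frame_size=4):
    """Stand-in encoder (assumption): one 2-dim output per non-overlapping frame."""
    starts = range(0, len(audio) - frame_size + 1, frame_size)
    frames = [audio[i:i + frame_size] for i in starts]
    return [[sum(f) / len(f), sum(abs(x) for x in f) / len(f)] for f in frames]


def project(encoder_outputs, weights):
    """Stand-in projection (assumption): a fixed linear map into the loss space."""
    return [[sum(w * h for w, h in zip(row, out)) for row in weights]
            for out in encoder_outputs]


def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def per_utterance_consistency_loss(proj_a, proj_b):
    """Average the L2 distances between corresponding projected encoder outputs."""
    if not proj_a or len(proj_a) != len(proj_b):
        raise ValueError("sequences must be non-empty and of equal length")
    distances = [l2_distance(u, v) for u, v in zip(proj_a, proj_b)]
    return sum(distances) / len(distances)


def total_loss(supervised_a, supervised_b, consistency, consistency_weight=1.0):
    """Assumed combination: sum of both supervised terms plus a weighted consistency term."""
    return supervised_a + supervised_b + consistency_weight * consistency


def demo(seed=0):
    """Build a pair of positive examples from one utterance and score their consistency."""
    rng = random.Random(seed)
    audio = [math.sin(0.1 * n) for n in range(32)]  # toy "utterance"
    weights = [[1.0, 0.0], [0.5, 0.5]]              # toy projection matrix
    proj_a = project(encode(augment(audio, 0.01, rng)), weights)
    proj_b = project(encode(augment(audio, 0.01, rng)), weights)
    return per_utterance_consistency_loss(proj_a, proj_b)
```

In this sketch both augmented copies keep the same length, so their encoder output sequences align one to one; the supervised loss values passed to `total_loss` would come from comparing each example's speech recognition result with the reference transcription, which is outside the scope of the example.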

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.