Patent · US Active

Supervised and unsupervised training with contrastive loss over sequences

US12230249B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 22, 2022
Grant date: Feb 18, 2025
Priority date
Expiry date: Jul 11, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2015/0635
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method includes receiving audio data corresponding to an utterance and generating a pair of positive audio data examples. Here, each positive audio data example includes a respective augmented copy of the received audio data. For each respective positive audio data example, the method includes generating a respective sequence of encoder outputs and projecting the respective sequence of encoder outputs for the positive audio data example into a contrastive loss space. The method also includes determining an L2 distance between each corresponding encoder output in the projected sequences of encoder outputs for the positive audio data examples and determining a per-utterance consistency loss by averaging the L2 distances. The method also includes generating corresponding speech recognition results for each respective positive audio data example. The method also includes updating parameters of the speech recognition model based on a respective supervised loss term and the per-utterance consistency loss.
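
The consistency-loss step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the two projected sequences are already available as equal-length lists of vectors, and the names `per_utterance_consistency_loss`, `combined_loss`, and the `weight` parameter are hypothetical. The abstract states only that the update uses the supervised loss terms and the consistency loss, so the weighted sum shown here is an assumption.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def per_utterance_consistency_loss(
    proj_a: Sequence[Vector], proj_b: Sequence[Vector]
) -> float:
    """Average the L2 distances between corresponding projected encoder outputs.

    proj_a and proj_b stand for the projected encoder-output sequences of the
    two positive (augmented) examples of one utterance.
    """
    if len(proj_a) != len(proj_b) or len(proj_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    # One L2 distance per position in the sequence.
    distances = [math.dist(a, b) for a, b in zip(proj_a, proj_b)]
    # Per-utterance loss: the mean of those distances.
    return sum(distances) / len(distances)


def combined_loss(
    supervised_a: float,
    supervised_b: float,
    consistency: float,
    weight: float = 1.0,  # hypothetical weighting, not stated in the abstract
) -> float:
    """Assumed combination: both supervised terms plus a weighted consistency term."""
    return supervised_a + supervised_b + weight * consistency


if __name__ == "__main__":
    # Toy projected sequences: three time steps, two dimensions each.
    a = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
    b = [[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]
    loss = per_utterance_consistency_loss(a, b)
    print(loss)
    print(combined_loss(1.0, 2.0, loss, weight=0.5))
```

In a real training setup the projections would come from the speech recognition model's encoder followed by a projection layer, and the combined loss would be differentiated with respect to the model parameters; both are outside the scope of this sketch.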

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.