Patent · US Active

Two-pass end to end speech recognition

US12073824B2 · kind B2 · utility

0Cited by
2References
14Claims
0Family size

Assignee

Inventors

Key dates

Filing dateDec 3, 2020
Grant dateAug 27, 2024
Priority date
Expiry dateOct 20, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L2015/0635
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.