Patent · US Active

Two-pass end to end speech recognition

US12073824B2 · kind B2 · utility

Cited by	0
References	2
Claims	14
Family size	0

Assignee

Google LLC · US

Inventors

Tara N. Sainath · Jersey City, US
Yanzhang He · Mountain View, US
Bo Li · Dongfeng Town, CN
Arun Narayanan · Rochester Hills, US
Ruoming Pang · New York, US
Antoine Jean Bruguier · Milpitas, US
Shuo-yiin Chang · Sunnyvale, US
Wei Li · Milpitas, US

Key dates

Filing date	Dec 3, 2020
Grant date	Aug 27, 2024
Priority date	—
Expiry date	Oct 20, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/0635
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of the utterance. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) and generate the text representation of the utterance. For example, the second-pass portion can include a Listen, Attend and Spell (LAS) decoder. Various implementations include an encoder shared between the RNN-T decoder and the LAS decoder.
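
The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function two_pass_recognize, the Hypothesis type, and all toy_* stand-ins are hypothetical names introduced here, and the second pass is modeled as simple rescoring of the first-pass candidates, which is only one possible way to "revise" them. The sketch shows the shared encoder running once per audio frame, the first pass emitting streaming partial results, and the second pass choosing the final text from the same encoder output.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Hypothesis:
    """One candidate recognition with its first-pass score (hypothetical type)."""
    text: str
    first_pass_score: float


def two_pass_recognize(
    frames: Sequence[Sequence[float]],
    encoder: Callable[[Sequence[float]], float],
    first_pass: Callable[[List[float]], List[Hypothesis]],
    second_pass_score: Callable[[List[float], Hypothesis], float],
    on_partial: Optional[Callable[[str], None]] = None,
) -> str:
    """Stream candidates with the first pass, then revise with the second pass."""
    encoded: List[float] = []
    candidates: List[Hypothesis] = []
    for frame in frames:
        # Shared encoder: each frame is encoded once and reused by both passes.
        encoded.append(encoder(frame))
        # First pass (stand-in for an RNN-T decoder): streaming candidates.
        candidates = first_pass(encoded)
        if on_partial is not None and candidates:
            best = max(candidates, key=lambda h: h.first_pass_score)
            on_partial(best.text)
    if not candidates:
        return ""
    # Second pass (stand-in for an LAS decoder): rescore the candidates
    # against the full encoder output and keep the highest-scoring one.
    return max(candidates, key=lambda h: second_pass_score(encoded, h)).text


# Toy stand-ins so the sketch runs; real components would be neural networks.
def toy_encoder(frame: Sequence[float]) -> float:
    return sum(frame) / len(frame)


def toy_first_pass(encoded: List[float]) -> List[Hypothesis]:
    n = len(encoded)  # one more word becomes available per encoded frame
    a = ["wreck", "a", "nice", "beach"]
    b = ["recognize", "speech"]
    return [Hypothesis(" ".join(a[:n]), -1.0), Hypothesis(" ".join(b[:n]), -1.5)]


def toy_second_pass_score(encoded: List[float], hyp: Hypothesis) -> float:
    # Pretend the full-context second pass prefers the correct transcript.
    bonus = 2.0 if hyp.text == "recognize speech" else 0.0
    return hyp.first_pass_score + bonus


if __name__ == "__main__":
    partials: List[str] = []
    frames = [[0.1, 0.2]] * 4
    final = two_pass_recognize(
        frames, toy_encoder, toy_first_pass, toy_second_pass_score, partials.append
    )
    print(partials)
    print(final)
```

In this toy run the streaming partials follow the first pass's preferred candidate, while the final text comes from the second pass, which can overturn that preference once the whole utterance has been encoded.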

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.