Patent · US Active

Cascaded encoders for simplified streaming and non-streaming ASR

US12154581B2 · kind B2 · utility

Cited by	0

References	1

Claims	17

Family size	0

Assignee

Google LLC · US

Inventors

Arun Narayanan · Rochester Hills, US
Tara N. Sainath · Jersey City, US
Chung-Cheng Chiu · Mountain View, US
Ruoming Pang · New York, US
Rohit Prakash Prabhavalkar · Santa Clara, US
Jiahui Yu · Champaign, US
Ehsan Variani · Mountain View, US
Trevor Strohman · Sunnyvale, US

Key dates

Filing date	Apr 21, 2021
Grant date	Nov 26, 2024
Priority date	—
Expiry date	Feb 24, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/30
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

An automated speech recognition (ASR) model includes a first encoder, a second encoder, and a decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The second encoder receives, as input, the first higher order feature representation generated by the first encoder at each of the plurality of output steps, and generates, at each of the plurality of output steps, a second higher order feature representation for a corresponding first higher order feature frame. The decoder receives, as input, the second higher order feature representation generated by the second encoder at each of the plurality of output steps, and generates, at each of the plurality of time steps, a first probability distribution over possible speech recognition hypotheses.
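
The data flow described in the abstract (acoustic frames into a first encoder, its output into a second encoder, and that output into a decoder that emits a probability distribution per step) can be sketched roughly as follows. This is only an illustrative toy, not the patented implementation: the abstract does not specify layer types, sizes, or training, so the frame-wise tanh layers, the softmax decoder, the random weights, and all names (`make_layer`, `apply_layer`, `cascaded_asr`, `FEAT`, `HIDDEN`, `VOCAB`) are assumptions made for the example.

```python
import math
import random


def make_layer(rng, n_in, n_out):
    """Random weight matrix (n_out rows, n_in columns); a stand-in for a trained layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, vec):
    """Linear map followed by tanh, applied to a single frame (assumed layer type)."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


def softmax(logits):
    """Numerically stable softmax, turning scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascaded_asr(frames, w_enc1, w_enc2, w_dec):
    """Run first encoder -> second encoder -> decoder, one output step per frame."""
    distributions = []
    for frame in frames:
        # First higher order feature representation for this acoustic frame.
        first_repr = apply_layer(w_enc1, frame)
        # Second higher order feature representation, computed from the first.
        second_repr = apply_layer(w_enc2, first_repr)
        # Decoder: scores over a toy vocabulary, normalized to probabilities.
        logits = [sum(w * x for w, x in zip(row, second_repr)) for row in w_dec]
        distributions.append(softmax(logits))
    return distributions


# Hypothetical sizes chosen only to keep the example small.
FEAT, HIDDEN, VOCAB = 4, 6, 5
rng = random.Random(0)
w1 = make_layer(rng, FEAT, HIDDEN)
w2 = make_layer(rng, HIDDEN, HIDDEN)
wd = make_layer(rng, HIDDEN, VOCAB)

# Three random "acoustic frames" standing in for real audio features.
frames = [[rng.uniform(-1.0, 1.0) for _ in range(FEAT)] for _ in range(3)]
dists = cascaded_asr(frames, w1, w2, wd)
```

Here `dists` holds one probability distribution per input frame. In this sketch each frame is processed independently, which is a simplification; the abstract does not say how much surrounding context either encoder uses.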

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.