Patent · US Active

Unified cascaded encoder ASR model for dynamic model sizes

US12417770B2 · kind B2 · utility

0 Cited by

1 References

19 Claims

0 Family size

Assignee

Google LLC · US

Inventors

Shaojin Ding · Mountain View, US
Yangzhang He · Mountain View, US
Xin Wang · Beijing, CN
Weiran Wang · Arlington, US
Trevor Strohman · Sunnyvale, US
Tara N. Sainath · Jersey City, US
Rohit Prakash Prabhavalkar · Santa Clara, US
Robert David · Mountain View, US
Rina Panigrahy · Sunnyvale, US
Rami Botros · Mountain View, US
Qiao Liang · Mountain View, US
Ian C. McGraw · Menlo Park, US
Ding Zhao · Anjo, JP
Dongseong Hwang · Mountain View, US

Key dates

Filing date	Mar 13, 2023
Grant date	Sep 16, 2025
Priority date	—
Expiry date	Jan 8, 2044

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/223
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

An automated speech recognition (ASR) model includes a first encoder, a first decoder, a second encoder, and a second decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The first decoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a first probability distribution over possible speech recognition hypotheses. The second encoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a second higher order feature representation for a corresponding first higher order feature frame. The second decoder receives, as input, the second higher order feature representation generated by the second encoder, and generates a second probability distribution over possible speech recognition hypotheses.
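
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `CascadedASR`, the layer sizes, the random untrained weights, the single linear layer with `tanh` standing in for each encoder, and the `use_second_pass` flag are all assumptions made for illustration. The flag reflects one plausible reading of "dynamic model sizes" in the title, namely that the first encoder and first decoder can be run alone as a smaller model.

```python
import math
import random


def make_linear(rng, in_dim, out_dim):
    """Random (out_dim x in_dim) weight matrix standing in for a trained layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Matrix-vector product over plain Python lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(logits):
    """Numerically stable softmax returning a probability distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class CascadedASR:
    """Toy cascade: encoder 1 -> decoder 1, and encoder 1 -> encoder 2 -> decoder 2."""

    def __init__(self, feat_dim=8, hidden_dim=6, vocab_size=5, seed=0):
        rng = random.Random(seed)
        self.enc1 = make_linear(rng, feat_dim, hidden_dim)    # first encoder
        self.dec1 = make_linear(rng, hidden_dim, vocab_size)  # first decoder
        self.enc2 = make_linear(rng, hidden_dim, hidden_dim)  # second encoder
        self.dec2 = make_linear(rng, hidden_dim, vocab_size)  # second decoder

    def first_pass(self, frames):
        # One first higher order feature vector per acoustic frame.
        h1 = [[math.tanh(v) for v in apply_linear(self.enc1, f)] for f in frames]
        # First probability distribution over a toy hypothesis vocabulary.
        p1 = [softmax(apply_linear(self.dec1, h)) for h in h1]
        return h1, p1

    def second_pass(self, h1):
        # The second encoder consumes the first encoder's output, not raw audio.
        h2 = [[math.tanh(v) for v in apply_linear(self.enc2, h)] for h in h1]
        p2 = [softmax(apply_linear(self.dec2, h)) for h in h2]
        return h2, p2

    def recognize(self, frames, use_second_pass=True):
        h1, p1 = self.first_pass(frames)
        if not use_second_pass:
            return p1  # smaller configuration: first encoder + first decoder only
        _, p2 = self.second_pass(h1)
        return p2      # larger configuration: full cascade


rng = random.Random(1)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]
model = CascadedASR()
small_output = model.recognize(frames, use_second_pass=False)
large_output = model.recognize(frames)
```

Both calls return one distribution per input frame. Because the second encoder reuses the first encoder's features, the larger configuration adds only the second encoder and second decoder on top of computation the smaller one already performs.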

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.