Patent · US Active

Unified cascaded encoder ASR model for dynamic model sizes

US12417770B2 · kind B2 · utility

0Cited by
1References
19Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMar 13, 2023
Grant dateSep 16, 2025
Priority date
Expiry dateJan 8, 2044

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L2015/223
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

An automated speech recognition (ASR) model includes a first encoder, a first encoder, a second encoder, and a second decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The first decoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a first probability distribution over possible speech recognition hypotheses. The second encoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a second higher order feature representation for a corresponding first higher order feature frame. The second decoder receives, as input, the second higher order feature representation generated by the second encoder, and generates a second probability distribution over possible speech recognition hypotheses.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.