Patent · US Active

Cascaded encoders for simplified streaming and non-streaming ASR

US12154581B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 21, 2021
Grant date: Nov 26, 2024
Priority date
Expiry date: Feb 24, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L25/30
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An automated speech recognition (ASR) model includes a first encoder, a second encoder, and a decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The second encoder receives, as input, the first higher order feature representation generated by the first encoder at each of the plurality of output steps, and generates, at each of the plurality of output steps, a second higher order feature representation for a corresponding first higher order feature frame. The decoder receives, as input, the second higher order feature representation generated by the second encoder at each of the plurality of output steps, and generates, at each of the plurality of time steps, a first probability distribution over possible speech recognition hypotheses.
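
The data flow in the abstract (acoustic frames into a first encoder, its output into a second encoder, and that output into a decoder producing one probability distribution per step) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the class name CascadedASR, the layer sizes, the tanh-activated linear layers, and the five-symbol toy output space are hypothetical and are not taken from the patent, which does not specify layer types in its abstract.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Build a toy weight matrix: out_dim rows, each with in_dim weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Multiply a weight matrix by a vector (no bias term)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(logits):
    """Turn logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class CascadedASR:
    """Hypothetical stand-in for the first encoder -> second encoder -> decoder chain."""

    def __init__(self, frame_dim=4, hidden_dim=3, vocab_size=5, seed=0):
        rng = random.Random(seed)  # fixed seed so the toy weights are reproducible
        self.w_first = make_linear(frame_dim, hidden_dim, rng)
        self.w_second = make_linear(hidden_dim, hidden_dim, rng)
        self.w_decoder = make_linear(hidden_dim, vocab_size, rng)

    def first_encoder(self, frames):
        # One "first higher order feature representation" per acoustic frame.
        return [[math.tanh(v) for v in apply_linear(self.w_first, f)] for f in frames]

    def second_encoder(self, first_feats):
        # Consumes the first encoder's output step by step.
        return [[math.tanh(v) for v in apply_linear(self.w_second, h)] for h in first_feats]

    def decoder(self, second_feats):
        # One probability distribution per step over a toy output space.
        return [softmax(apply_linear(self.w_decoder, h)) for h in second_feats]

    def recognize(self, frames):
        first = self.first_encoder(frames)
        second = self.second_encoder(first)
        return self.decoder(second)


# Three made-up acoustic frames of dimension 4.
frames = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.5, 0.2], [0.3, 0.3, -0.2, 0.1]]
model = CascadedASR()
distributions = model.recognize(frames)
```

In this sketch each stage emits exactly one output per input step, so three frames yield three distributions. A real system would likely use sequence layers with context rather than independent per-frame projections; that choice is outside what the abstract states.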

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.