Patent · US Active

Language agnostic multilingual end-to-end streaming on-device ASR system

US12183322B2 · kind B2 · utility

Cited by	0
References	1
Claims	20
Family size	0

Assignee

Google LLC · US

Inventors

Bo Li · Dongfeng Town, CN
Tara N. Sainath · Jersey City, US
Ruoming Pang · New York, US
Shuo-yiin Chang · Sunnyvale, US
Qiumin Xu · Mountain View, US
Trevor Strohman · Sunnyvale, US
Vince Chen · Mountain View, US
Qiao Liang · Mountain View, US
Heguang Liu · Sunnyvale, US
Yanzhang He · Mountain View, US
Parisa Haghani · Jersey City, US
Sameer Bidichandani · Los Gatos, US

Key dates

Filing date	Sep 22, 2022
Grant date	Dec 31, 2024
Priority date	—
Expiry date	Jun 27, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/226
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes receiving a sequence of acoustic frames characterizing one or more utterances as input to a multilingual automated speech recognition (ASR) model. The method also includes generating, by an encoder, a higher order feature representation for a corresponding acoustic frame at each of a plurality of output steps. The method also includes generating, by a prediction network, a hidden representation based on a sequence of non-blank symbols output by a final softmax layer. The method also includes generating a probability distribution over possible speech recognition hypotheses based on the hidden representation generated by the prediction network and the higher order feature representation generated by the encoder at each of the plurality of output steps. The method also includes predicting an end of utterance (EOU) token at an end of each utterance and classifying each acoustic frame as one of speech, initial silence, intermediate silence, or final silence.
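
The two steps at the end of the abstract can be illustrated with a small sketch. It is not the patented implementation: the toy vocabulary, the additive tanh joint, the weight matrix, and the per-frame speech flags are all assumptions made for illustration. `joint_step` shows one way to combine an encoder feature with a prediction-network hidden state into a distribution that includes an EOU token, and `label_frames` shows how the four frame classes relate to each other within a single utterance.

```python
import math

# Hypothetical toy vocabulary; "<blank>" and "<eou>" are illustrative names.
VOCAB = ["<blank>", "a", "b", "<eou>"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_step(enc_feat, pred_hidden, weights):
    """Return a probability distribution over VOCAB for one output step.

    Assumption: the encoder feature and the prediction-network hidden state
    are added, passed through tanh, and projected linearly. This is a common
    transducer-style joint; the abstract does not specify the combination.
    """
    fused = [math.tanh(e + p) for e, p in zip(enc_feat, pred_hidden)]
    logits = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    return softmax(logits)


def label_frames(is_speech):
    """Label each frame as speech or one of three silence types.

    Assumption: per-frame speech flags for a single utterance are already
    available. An utterance with no speech is labeled initial silence
    throughout, which is a choice made here, not one taken from the patent.
    """
    speech_idx = [i for i, flag in enumerate(is_speech) if flag]
    if not speech_idx:
        return ["initial_silence"] * len(is_speech)
    first, last = speech_idx[0], speech_idx[-1]
    labels = []
    for i, flag in enumerate(is_speech):
        if flag:
            labels.append("speech")
        elif i < first:
            labels.append("initial_silence")
        elif i > last:
            labels.append("final_silence")
        else:
            labels.append("intermediate_silence")
    return labels


if __name__ == "__main__":
    # One row of weights per vocabulary entry (made-up values).
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
    probs = joint_step([0.1, -0.2], [0.3, 0.4], weights)
    print(dict(zip(VOCAB, probs)))
    print(label_frames([False, True, False, True, False]))
```

In a streaming setting, a decoder would typically call something like `joint_step` once per output step and treat a high `<eou>` probability as a signal to close the utterance, while the frame labels distinguish a pause inside an utterance from the silence that follows it.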

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.