Patent · US Active

Large-scale multilingual speech recognition with a streaming end-to-end model

US11468244B2 · kind B2 · utility

Cited by	1

References	2

Claims	22

Family size	0

Assignee

Google LLC · US

Inventors

Anjuli Patricia Kannan · Berkeley, US
Tara N. Sainath · Jersey City, US
Yonghui Wu · Fremont, US
Ankur Bapna · Sunnyvale, US
Arindrima Datta · New York, US

Key dates

Filing date	Mar 30, 2020
Grant date	Oct 11, 2022
Priority date	—
Expiry date	Aug 6, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/32
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method of transcribing speech using a multilingual end-to-end (E2E) speech recognition model includes receiving audio data for an utterance spoken in a particular native language, obtaining a language vector identifying the particular language, and processing, using the multilingual E2E speech recognition model, the language vector and acoustic features derived from the audio data to generate a transcription for the utterance. The multilingual E2E speech recognition model includes a plurality of language-specific adaptor modules that include one or more adaptor modules specific to the particular native language and one or more other adaptor modules specific to at least one other native language different than the particular native language. The method also includes providing the transcription for output.
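The data flow described in the abstract can be illustrated with a toy sketch. It is not the patented implementation: the abstract does not say how the language vector is encoded or how the adaptor modules are built, so the one-hot encoding, the residual bottleneck adapter, the random weights, the three language codes, and every name below (LANGUAGES, language_vector, Adapter, ToyMultilingualEncoder) are assumptions made for illustration. The sketch covers only the encoder side, where acoustic features are combined with the language vector, passed through a shared layer, and then through the adapter for the spoken language, while adapters for the other languages stay unused. Decoding to a transcription is omitted.

```python
import random

# Hypothetical language inventory; the abstract names no specific languages.
LANGUAGES = ["hi", "bn", "ta"]


def language_vector(lang):
    """Return a one-hot vector identifying the language (assumed encoding)."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language: {lang}")
    return [1.0 if code == lang else 0.0 for code in LANGUAGES]


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class Adapter:
    """Residual bottleneck adapter: x + up(relu(down(x))) (assumed design)."""

    def __init__(self, dim, bottleneck, rng):
        self.down = random_matrix(bottleneck, dim, rng)
        self.up = random_matrix(dim, bottleneck, rng)

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.down, x)]
        return [xi + ui for xi, ui in zip(x, matvec(self.up, hidden))]


class ToyMultilingualEncoder:
    """One shared layer followed by one adapter per language."""

    def __init__(self, feat_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        # The shared layer sees the acoustic features and the language vector.
        self.shared = random_matrix(hidden_dim, feat_dim + len(LANGUAGES), rng)
        self.adapters = {lang: Adapter(hidden_dim, 2, rng) for lang in LANGUAGES}

    def encode_frame(self, features, lang):
        shared_out = matvec(self.shared, features + language_vector(lang))
        # Only the adapter for the spoken language is applied.
        return self.adapters[lang](shared_out)


encoder = ToyMultilingualEncoder(feat_dim=4, hidden_dim=3)
frame = [0.2, -0.1, 0.4, 0.0]  # made-up acoustic features for one frame
out_hi = encoder.encode_frame(frame, "hi")
out_ta = encoder.encode_frame(frame, "ta")
```

The same frame yields different encodings for "hi" and "ta" because both the language vector and the selected adapter change, while the shared weights are reused across languages.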

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.