Patent · US Active

Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models

US11270687B2 · kind B2 · utility

Cited by	3

References	4

Claims	18

Family size	0

Assignee

Google LLC · US

Inventors

Ke Hu · Stony Brook, US
Antoine Jean Bruguier · Milpitas, US
Tara N. Sainath · Jersey City, US
Rohit Prakash Prabhavalkar · Santa Clara, US
Golan Pundak · New York, US

Key dates

Filing date	Apr 28, 2020
Grant date	Mar 8, 2022
Priority date	—
Expiry date	Apr 28, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/025
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes receiving audio data encoding an utterance spoken by a native speaker of a first language, and receiving a biasing term list including one or more terms in a second language different than the first language. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data to generate speech recognition scores for both wordpieces and corresponding phoneme sequences in the first language. The method also includes rescoring the speech recognition scores for the phoneme sequences based on the one or more terms in the biasing term list, and executing, using the speech recognition scores for the wordpieces and the rescored speech recognition scores for the phoneme sequences, a decoding graph to generate a transcription for the utterance.
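
The rescoring step described in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the phoneme mapping table, the additive bonus, the example hypotheses, and all function names below are assumptions made purely for illustration. The sketch maps the phonemes of a second-language biasing term onto first-language phonemes, then boosts any hypothesis whose phoneme sequence contains the mapped term. The wordpiece scores and the decoding graph mentioned in the abstract are omitted.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical mapping from second-language phonemes to first-language
# phonemes. The entries are illustrative, not taken from the patent.
PHONEME_MAP: Dict[str, str] = {"R": "r", "e": "eI"}

Hypothesis = Tuple[Tuple[str, ...], float]  # (phoneme sequence, log score)


def map_to_first_language(phonemes: Sequence[str]) -> Tuple[str, ...]:
    """Replace each phoneme with its first-language counterpart, if any."""
    return tuple(PHONEME_MAP.get(p, p) for p in phonemes)


def contains(seq: Sequence[str], sub: Sequence[str]) -> bool:
    """Return True if `sub` occurs as a contiguous run inside `seq`."""
    n, m = len(seq), len(sub)
    if m == 0:
        return False
    return any(tuple(seq[i:i + m]) == tuple(sub) for i in range(n - m + 1))


def rescore(
    hypotheses: List[Hypothesis],
    biasing_terms: Dict[str, Sequence[str]],
    bonus: float = 2.0,  # assumed additive log-score boost per matched term
) -> List[Hypothesis]:
    """Boost hypotheses that contain a mapped biasing term, best first."""
    mapped = [map_to_first_language(p) for p in biasing_terms.values()]
    rescored = []
    for phonemes, score in hypotheses:
        boost = sum(bonus for term in mapped if contains(phonemes, term))
        rescored.append((phonemes, score + boost))
    return sorted(rescored, key=lambda h: h[1], reverse=True)


# Toy data: two competing phoneme hypotheses and one biasing term whose
# pronunciation is given in second-language phonemes.
hypotheses: List[Hypothesis] = [
    (("k", "r", "i", "t"), -4.0),
    (("k", "r", "eI", "t", "E", "j"), -5.0),
]
biasing_terms = {"Créteil": ("k", "R", "e", "t", "E", "j")}

best = rescore(hypotheses, biasing_terms)[0]
print(best)
```

In this toy setup the second hypothesis starts with the lower score but matches the mapped biasing term, so the bonus moves it to the top of the list.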

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.