Patent · US · Active

Speech recognition with sequence-to-sequence models

US11145293B2 · kind B2 · utility

Cited by	5
References	0
Claims	20
Family size	0

Assignee

Google LLC · US

Inventors

Rohit Prakash Prabhavalkar · Santa Clara, US
Zhifeng Chen · Sunnyvale, US
Bo Li · 东风镇, CN
Chung-Cheng Chiu · Mountain View, US
Kanury Kanishka Rao · Santa Clara, US
Yonghui Wu · Fremont, US
Ron J. Weiss · New York, US
Navdeep Jaitly · Mountain View, US
Michiel A. U. Bacchiani · Summit, US
Tara N. Sainath · Jersey City, US
Jan Kazimierz Chorowski · Łęczyca, PL
Anjuli Patricia Kannan · Berkeley, US
Ekaterina Gonina · Sunnyvale, US
Patrick Nguyen · Kirkland, US

Key dates

Filing date	Jul 19, 2019
Grant date	Oct 12, 2021
Priority date	—
Expiry date	Feb 14, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/025
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods, systems, and apparatus, including computer-readable media, for performing speech recognition using sequence-to-sequence models. An automated speech recognition (ASR) system receives audio data for an utterance and provides features indicative of acoustic characteristics of the utterance as input to an encoder. The system processes an output of the encoder using an attender to generate a context vector and generates speech recognition scores using the context vector and a decoder trained using a training process that selects at least one input to the decoder with a predetermined probability. An input to the decoder during training is selected between input data based on a known value for an element in a training example, and input data based on an output of the decoder for the element in the training example. A transcription is generated for the utterance using word elements selected based on the speech recognition scores. The transcription is provided as an output of the ASR system.
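The training step described in the abstract, where each decoder input is chosen with a predetermined probability between the known value and the decoder's own output, can be illustrated with a minimal sketch. This is not the patented implementation. The names `choose_decoder_inputs`, `predict_next`, `sample_prob`, and the `<sos>` start token are hypothetical and introduced only for illustration, and the real decoder is replaced by a caller-supplied function.

```python
import random
from typing import Callable, List, Sequence


def choose_decoder_inputs(
    targets: Sequence[str],
    predict_next: Callable[[List[str]], str],
    sample_prob: float,
    rng: random.Random,
    start_token: str = "<sos>",
) -> List[str]:
    """Build the decoder inputs for one training example.

    For each element, the next input is either the known (ground-truth)
    value or the decoder's own output for that element. The decoder's
    output is chosen with probability `sample_prob` (an assumed name for
    the "predetermined probability" in the abstract).
    """
    inputs = [start_token]
    for t in range(len(targets) - 1):
        # Stand-in for running the real decoder on the inputs so far.
        predicted = predict_next(inputs)
        # rng.random() is in [0.0, 1.0), so 0.0 never picks the model
        # output and 1.0 always picks it.
        use_model_output = rng.random() < sample_prob
        inputs.append(predicted if use_model_output else targets[t])
    return inputs


if __name__ == "__main__":
    targets = ["h", "e", "l", "l", "o"]
    # A dummy "decoder" that always emits "?" (assumption for the demo).
    dummy_decoder = lambda previous_inputs: "?"
    print(choose_decoder_inputs(targets, dummy_decoder, 0.25, random.Random(0)))
```

With `sample_prob` set to 0.0 the decoder always sees the known values, and with 1.0 it always sees its own previous outputs. Intermediate values mix the two, which is the selection the abstract describes.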

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.