Patent · US Active

System and method for multichannel end-to-end speech recognition

US11133011B2 · kind B2 · utility

Cited by	2
References	8
Claims	21
Family size	0

Assignee

Mitsubishi Electric Research Laboratories, Inc. · US

Inventors

Shinji Watanabe · Minato, JP
Tsubasa Ochiai · Chiba, JP
Takaaki Hori · Lexington, US
John R. Hershey · Winchester, US

Key dates

Filing date	Oct 3, 2017
Grant date	Sep 28, 2021
Priority date	—
Expiry date	Oct 3, 2037

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/02166
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A speech recognition system includes a plurality of microphones to receive acoustic signals including speech signals; an input interface to generate multichannel inputs from the acoustic signals; and one or more storages to store a multichannel speech recognition network. The multichannel speech recognition network comprises mask estimation networks to generate time-frequency masks from the multichannel inputs; a beamformer network trained to select a reference channel input from the multichannel inputs using the time-frequency masks and to generate an enhanced speech dataset based on the reference channel input; and an encoder-decoder network trained to transform the enhanced speech dataset into a text. The system further includes one or more processors that use the multichannel speech recognition network, in association with the one or more storages, to generate the text from the multichannel inputs, and an output interface to render the text.
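The beamforming stage in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions that the abstract does not state: the beamformer is taken to be a mask-based MVDR filter, the masks arrive as precomputed arrays rather than from trained mask estimation networks, and the reference channel is passed in as a weight vector `u` (one-hot here, though a learned soft weighting would fit the same interface). The function names `psd_matrix`, `mvdr_filter`, and `beamform` are hypothetical, and the encoder-decoder network is omitted entirely.

```python
import numpy as np

def psd_matrix(X, mask):
    """Mask-weighted spatial covariance per frequency bin.

    X:    complex STFT, shape (channels, frames, freqs)
    mask: time-frequency mask in [0, 1], shape (frames, freqs)
    Returns an array of shape (freqs, channels, channels).
    """
    num = np.einsum("tf,ctf,dtf->fcd", mask, X, X.conj())
    return num / np.maximum(mask.sum(axis=0), 1e-10)[:, None, None]

def mvdr_filter(psd_s, psd_n, u):
    """MVDR weights w = (psd_n^-1 psd_s) u / trace(psd_n^-1 psd_s).

    u is the reference-channel weight vector, shape (channels,).
    Returns filter coefficients of shape (freqs, channels).
    """
    C = psd_n.shape[-1]
    psd_n = psd_n + 1e-6 * np.eye(C)        # diagonal loading (assumed value)
    numer = np.linalg.solve(psd_n, psd_s)   # psd_n^-1 psd_s for every bin
    trace = np.trace(numer, axis1=-2, axis2=-1)
    return (numer @ u) / (trace[:, None] + 1e-10)

def beamform(X, speech_mask, noise_mask, u):
    """Apply the filter: y[t, f] = w[f]^H x[t, f]."""
    w = mvdr_filter(psd_matrix(X, speech_mask), psd_matrix(X, noise_mask), u)
    return np.einsum("fc,ctf->tf", w.conj(), X)
```

Under these assumptions, `beamform` returns a single-channel enhanced spectrogram of shape `(frames, freqs)`, which is the kind of input an encoder-decoder recognizer would then transcribe. Passing `u` as a vector rather than an integer index is a design choice that keeps the selection step differentiable if `u` is produced by a trained component.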

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.