Patent · US Active

Deep multi-channel acoustic modeling

US10726830B1 · kind B1 · utility

Cited by	18

References	0

Claims	25

Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Arindam Mandal · Redwood City, US
Kenichi Kumatani · Sammamish, US
Nikko Strom · Kirkland, US
Minhua Wu · Dongguan, CN
Shiva Kumar Sundaram · Fremont, US
Bjorn Hoffmeister · Seattle, US
Jérémie Lecomte · Santa Clara, US

Key dates

Filing date	Sep 27, 2018
Grant date	Jul 28, 2020
Priority date	—
Expiry date	Jan 24, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/02166
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-channel DNN) that takes in raw signals and produces a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. These three models may be jointly optimized for speech processing (as opposed to individually optimized for signal enhancement), enabling improved performance despite a reduction in microphones and a reduction in bandwidth consumption during real-time processing.
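
The three-stage data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: each "DNN" is reduced to a single randomly initialised dense layer, and all layer sizes, the two-channel input, and names such as `forward`, `dense`, and `NUM_UNITS` are assumptions made for illustration. Only the forward pass is shown. Joint optimization would mean training all three stages against one speech-recognition loss on the final output, and the step from acoustic-unit posteriors to text data is omitted.

```python
import math
import random


def dense(in_dim, out_dim, rng):
    """Return a randomly initialised affine layer as (weights, biases).

    Hypothetical stand-in for a trained network; the values are arbitrary.
    """
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def apply_dense(layer, x):
    """Compute W @ x + b for a layer produced by dense()."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed sizes, chosen only to keep the example small.
NUM_CHANNELS = 2   # microphones in the array
FRAME_LEN = 16     # raw samples per channel per frame
FIRST_DIM = 32     # size of the first feature vector
SECOND_DIM = 8     # size of the lower-dimensional second feature vector
NUM_UNITS = 5      # number of acoustic units to classify

rng = random.Random(0)
multi_channel_dnn = dense(NUM_CHANNELS * FRAME_LEN, FIRST_DIM, rng)
feature_extraction_dnn = dense(FIRST_DIM, SECOND_DIM, rng)
classification_dnn = dense(SECOND_DIM, NUM_UNITS, rng)


def forward(frame):
    """Run one multi-channel frame through the three stages.

    frame: list of NUM_CHANNELS lists, each holding FRAME_LEN raw samples.
    """
    stacked = [sample for channel in frame for sample in channel]
    first = relu(apply_dense(multi_channel_dnn, stacked))
    second = relu(apply_dense(feature_extraction_dnn, first))
    posteriors = softmax(apply_dense(classification_dnn, second))
    return first, second, posteriors


frame = [[rng.uniform(-1.0, 1.0) for _ in range(FRAME_LEN)]
         for _ in range(NUM_CHANNELS)]
first, second, posteriors = forward(frame)
print(len(first), len(second), len(posteriors))  # prints: 32 8 5
```

The sketch feeds raw samples from every channel into the first stage with no separate beamforming step, which mirrors the abstract's point that the first feature vector plays the role usually taken by beamformed features.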

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.