Patent · US Active

Deep multi-channel acoustic modeling using multiple microphone array geometries

US11574628B1 · kind B1 · utility

Cited by	10
References	1
Claims	21
Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Kenichi Kumatani · Sammamish, US
Minhua Wu · Dongguan, CN
Shiva Kumar Sundaram · Fremont, US
Nikko Strom · Kirkland, US
Bjorn Hoffmeister · Seattle, US

Key dates

Filing date	Mar 28, 2019
Grant date	Feb 7, 2023
Priority date	—
Expiry date	Mar 28, 2039

Classification

Technology area (CPC H)	Electricity
CPC primary	H04R2201/401
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-geometry/multi-channel DNN) that is trained using a plurality of microphone array geometries. Thus, the first model may receive a variable number of microphone channels, generate multiple outputs using multiple microphone array geometries, and select the best output as a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. The DNN front-end enables improved performance despite a reduction in microphones.
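
The three-stage data flow in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. All layer sizes, the random weights standing in for trained parameters, the zero-padding used to accept a variable number of channels, and the "highest output energy" rule for choosing the best geometry branch are assumptions made for illustration only; the abstract does not specify them. The sketch also stops at acoustic unit posteriors and does not cover the step that generates text data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
MAX_CHANNELS = 4    # largest microphone count the sketch accepts
FRAME_DIM = 16      # per-channel features per frame
FIRST_DIM = 32      # size of the first feature vector
SECOND_DIM = 8      # size of the lower-dimensional second feature vector
NUM_UNITS = 5       # number of acoustic unit classes
NUM_GEOMETRIES = 3  # number of array geometries the first model covers


def dense(in_dim, out_dim):
    """Return random (weights, bias) standing in for trained parameters."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)


# One branch per microphone array geometry (assumed structure).
GEOMETRY_BRANCHES = [
    dense(MAX_CHANNELS * FRAME_DIM, FIRST_DIM) for _ in range(NUM_GEOMETRIES)
]
FEATURE_LAYER = dense(FIRST_DIM, SECOND_DIM)
CLASSIFIER_LAYER = dense(SECOND_DIM, NUM_UNITS)


def relu(x):
    return np.maximum(x, 0.0)


def first_model(frames):
    """Map (channels, FRAME_DIM) audio features to a first feature vector.

    Channels are zero-padded up to MAX_CHANNELS, every geometry branch is
    evaluated, and the branch with the highest output energy is selected
    (the selection rule is an assumption, not taken from the patent).
    """
    channels = frames.shape[0]
    if channels > MAX_CHANNELS:
        raise ValueError("too many microphone channels for this sketch")
    padded = np.zeros((MAX_CHANNELS, FRAME_DIM))
    padded[:channels] = frames
    flat = padded.reshape(-1)
    outputs = np.stack([relu(flat @ w + b) for w, b in GEOMETRY_BRANCHES])
    best = int(np.argmax((outputs ** 2).sum(axis=1)))
    return outputs[best], best


def second_model(first_vec):
    """Transform the first feature vector to a lower-dimensional one."""
    w, b = FEATURE_LAYER
    return relu(first_vec @ w + b)


def third_model(second_vec):
    """Return softmax posteriors over acoustic units."""
    w, b = CLASSIFIER_LAYER
    logits = second_vec @ w + b
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()


def run_pipeline(frames):
    """Chain the three models: multi-channel input to unit posteriors."""
    first_vec, _ = first_model(frames)
    return third_model(second_model(first_vec))


if __name__ == "__main__":
    # The same pipeline accepts two-microphone and four-microphone input.
    for n_mics in (2, 4):
        frames = rng.standard_normal((n_mics, FRAME_DIM))
        print(n_mics, "mics ->", run_pipeline(frames).round(3))
```

Splitting the front-end into three functions mirrors the abstract's separation of concerns: the first stage plays the role a beamformer would otherwise play, so the later stages do not depend on how many microphones supplied the input.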

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.