Patent · US Active

Deep multi-channel acoustic modeling

US10726830B1 · kind B1 · utility

18Cited by
0References
25Claims
0Family size

Assignee

Inventors

Key dates

Filing dateSep 27, 2018
Grant dateJul 28, 2020
Priority date
Expiry dateJan 24, 2039

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L2021/02166
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-channel DNN) that takes in raw signals and produces a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. These three models may be jointly optimized for speech processing (as opposed to individually optimized for signal enhancement), enabling improved performance despite a reduction in microphones and a reduction in bandwidth consumption during real-time processing.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.