Patent · US Active

Processing multi-channel audio waveforms

US9697826B2 · kind B2 · utility

Cited by: 204
References: 2
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 8, 2016
Grant date: Jul 4, 2017
Priority date
Expiry date: Jul 8, 2036

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2021/02166
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, including computer programs encoded on a computer storage medium, for enhancing the processing of audio waveforms for speech recognition using various neural network processing techniques. In one aspect, a method includes: receiving multiple channels of audio data corresponding to an utterance; convolving each of multiple filters, in a time domain, with each of the multiple channels of audio waveform data to generate convolution outputs, wherein the multiple filters have parameters that have been learned during a training process that jointly trains the multiple filters and trains a deep neural network as an acoustic model; combining, for each of the multiple filters, the convolution outputs for the filter for the multiple channels of audio waveform data; inputting the combined convolution outputs to the deep neural network trained jointly with the multiple filters; and providing a transcription for the utterance that is determined.
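
The convolve-and-combine step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function name `multichannel_filter_outputs`, the array layout, the use of a separate tap vector per channel within each filter, and the choice of summation as the combining operation are all assumptions made for illustration; in the method described, the filter parameters would be learned jointly with the deep neural network rather than fixed by hand.

```python
import numpy as np


def multichannel_filter_outputs(waveforms, filters):
    """Convolve each filter with each channel in the time domain, then
    combine the per-channel outputs for each filter.

    waveforms: array of shape (num_channels, num_samples)
    filters:   array of shape (num_filters, num_channels, filter_len)
               (assumed layout: one tap vector per channel per filter)
    returns:   array of shape (num_filters, num_samples - filter_len + 1)
    """
    waveforms = np.asarray(waveforms, dtype=float)
    filters = np.asarray(filters, dtype=float)

    num_channels, num_samples = waveforms.shape
    num_filters, filter_channels, filter_len = filters.shape
    if filter_channels != num_channels:
        raise ValueError("filters and waveforms disagree on channel count")
    if filter_len > num_samples:
        raise ValueError("filter is longer than the waveform")

    combined = np.zeros((num_filters, num_samples - filter_len + 1))
    for p in range(num_filters):
        for c in range(num_channels):
            # Time-domain convolution of filter p's taps for channel c
            # with the raw samples of channel c.
            conv = np.convolve(waveforms[c], filters[p, c], mode="valid")
            # Assumed combining rule: sum across channels per filter.
            combined[p] += conv
    return combined
```

In a full system along the lines of the abstract, the combined outputs would then be passed to the jointly trained acoustic model; that stage is omitted here because the abstract does not specify its architecture.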

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.