Patent · US Active

Processing multi-channel audio waveforms

US9697826B2 · kind B2 · utility

Cited by	204
References	2
Claims	20
Family size	0

Assignee

Google LLC · US

Inventors

Tara N. Sainath · Jersey City, US
Ron J. Weiss · New York, US
Kevin William Wilson · Cambridge, US
Andrew W. Senior · New York, US
Arun Narayanan · Rochester Hills, US
Yedid Hoshen · Jerusalem, IL
Michiel A. U. Bacchiani · Summit, US

Key dates

Filing date	Jul 8, 2016
Grant date	Jul 4, 2017
Priority date	—
Expiry date	Jul 8, 2036

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/02166
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods, including computer programs encoded on a computer storage medium, for enhancing the processing of audio waveforms for speech recognition using various neural network processing techniques. In one aspect, a method includes: receiving multiple channels of audio waveform data corresponding to an utterance; convolving each of multiple filters, in a time domain, with each of the multiple channels of audio waveform data to generate convolution outputs, wherein the multiple filters have parameters that have been learned during a training process that jointly trains the multiple filters and trains a deep neural network as an acoustic model; combining, for each of the multiple filters, the convolution outputs for the filter for the multiple channels of audio waveform data; inputting the combined convolution outputs to the deep neural network trained jointly with the multiple filters; and providing a transcription for the utterance that is determined based at least on output that the deep neural network provides in response to receiving the combined convolution outputs.
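
The convolve-and-combine steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each filter holds one set of taps per channel and that "combining" means summing the per-channel convolution outputs sample by sample; the abstract does not fix either detail. The function names, the [filter][channel][tap] layout, and the fixed tap values are hypothetical (in the described method the taps would be learned jointly with the deep neural network, which is omitted here).

```python
def convolve_valid(signal, taps):
    """Time-domain convolution of one channel with one set of filter taps.

    Only positions where the taps fully overlap the signal are returned,
    so the output has len(signal) - len(taps) + 1 samples.
    """
    n, k = len(signal), len(taps)
    return [
        sum(taps[j] * signal[t + k - 1 - j] for j in range(k))
        for t in range(n - k + 1)
    ]


def filter_and_combine(channels, filters):
    """Convolve every filter with every channel, then combine per filter.

    channels: list of C sample lists, all of length N.
    filters:  list of P filters, each a list of C tap lists (assumed layout).
    Returns P lists; each is the sum over channels of that filter's
    convolution outputs (summation is an assumed combination rule).
    """
    combined = []
    for per_channel_taps in filters:
        outputs = [
            convolve_valid(x, h) for x, h in zip(channels, per_channel_taps)
        ]
        # Sum the C convolution outputs sample by sample.
        combined.append([sum(samples) for samples in zip(*outputs)])
    return combined


# Two channels of four samples, two filters with two taps per channel.
channels = [[1, 2, 3, 4], [0, 1, 0, 1]]
filters = [
    [[1, 0], [0, 1]],   # filter 0: pass channel 0, delay channel 1 by one sample
    [[1, 1], [1, -1]],  # filter 1: smooth channel 0, difference channel 1
]
features = filter_and_combine(channels, filters)
```

In this sketch `features` holds one combined sequence per filter; these sequences stand in for the combined convolution outputs that the abstract describes as the input to the jointly trained deep neural network.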

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.