Multichannel speech recognition using neural networks
US11062725B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 19, 2019 |
| Grant date | Jul 13, 2021 |
| Priority date | — |
| Expiry date | Jun 8, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2021/02166
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring at a same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.
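The data flow in the abstract (two raw audio signals, then a spatial filtering layer, then a spectral filtering layer operating on frequency-domain data, then additional layers that predict sub-word units) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. The function names, array sizes, random (untrained) weights, filter-and-sum form of the spatial layer, magnitude-plus-log spectral features, and single softmax output layer are all assumptions chosen to keep the example small.

```python
import numpy as np

# Hypothetical sizes (assumptions, not from the patent): N samples per window,
# K filter taps, P spatial filters, F spectral filters, U sub-word units.
N, K, P, F, U = 400, 25, 4, 16, 10


def spatial_filter(x1, x2, h1, h2):
    """Filter-and-sum over two channels: one time-domain output per filter pair."""
    return np.stack([
        np.convolve(x1, h1[p], mode="same") + np.convolve(x2, h2[p], mode="same")
        for p in range(h1.shape[0])
    ])  # shape (P, N)


def spectral_filter(y, g):
    """Process frequency-domain data for each spatially filtered output."""
    mag = np.abs(np.fft.rfft(y, axis=-1))  # shape (P, N // 2 + 1)
    return np.log1p(mag @ g.T)             # shape (P, F), log-compressed


def predict_subword_units(features, w, b):
    """Stand-in for the 'additional layers': one linear layer plus softmax."""
    logits = features.reshape(-1) @ w + b
    e = np.exp(logits - logits.max())      # subtract max for numerical stability
    return e / e.sum()                     # distribution over U sub-word units


rng = np.random.default_rng(0)

# Two raw audio signals covering the same period of time; the second is a
# circularly shifted copy to mimic a second microphone (an assumption).
x1 = rng.standard_normal(N)
x2 = np.roll(x1, 2)

# Random weights stand in for parameters that would normally be learned.
h1 = 0.1 * rng.standard_normal((P, K))
h2 = 0.1 * rng.standard_normal((P, K))
g = np.abs(rng.standard_normal((F, N // 2 + 1)))
w = 0.01 * rng.standard_normal((P * F, U))
b = np.zeros(U)

spatial = spatial_filter(x1, x2, h1, h2)        # spatial filtered output
spectral = spectral_filter(spatial, g)          # spectral filtered output
posteriors = predict_subword_units(spectral, w, b)
print(spatial.shape, spectral.shape, posteriors.shape)
```

In a trained system the spatial and spectral weights would presumably be learned jointly with the later layers; here they are random, so the resulting probabilities carry no meaning beyond showing the shapes at each stage.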
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.