Automatic speech recognition
US11915690B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Sep 29, 2021 |
| Grant date | Feb 27, 2024 |
| Priority date | — |
| Expiry date | Nov 19, 2041 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04R1/406
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A multi-channel transformer acoustic model that processes a plurality of audio signals output by microphones of a microphone array and outputs probabilities for acoustic units of an utterance represented in the audio signals. The audio signals represent the individual microphones' respective capturing of the utterance. The multi-channel model may perform self-attention on embeddings of the audio signals and then cross-channel attention across the attended audio signals. The cross-channel attention may involve processing of signals relative to each other to model the relationships across channels within and across time frames. The multi-channel model may include a transducer to perform processing frame-by-frame.
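The two attention stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. It assumes plain scaled dot-product attention with no learned projections, a mean over channels to fuse the result, and a single linear layer plus softmax for the acoustic-unit probabilities. All function names (`channel_self_attention`, `cross_channel_attention`, `multi_channel_encoder`, `acoustic_unit_probs`) and the weight matrix `w` are hypothetical, and the frame-by-frame transducer is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Scaled dot-product attention.
    # q: (..., Tq, d); k and v: (..., Tk, d); returns (..., Tq, d).
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v


def channel_self_attention(x):
    # x: (C, T, d) embeddings for C microphone channels and T frames.
    # Each channel attends only over its own frames (batched over C).
    return attention(x, x, x)


def cross_channel_attention(x):
    # For each channel, queries come from that channel while keys and
    # values come from every frame of all the other channels, so the
    # weights span channels within and across time frames.
    # Assumption: at least two channels.
    C, T, d = x.shape
    out = np.empty_like(x)
    for c in range(C):
        others = np.concatenate([x[o] for o in range(C) if o != c], axis=0)
        out[c] = attention(x[c], others, others)
    return out


def multi_channel_encoder(x):
    # Self-attention per channel, then cross-channel attention.
    attended = channel_self_attention(x)
    fused = cross_channel_attention(attended)
    # Assumption: average over channels to get one vector per frame.
    return fused.mean(axis=0)  # (T, d)


def acoustic_unit_probs(h, w):
    # Hypothetical output layer: h is (T, d), w is (d, U) for U units.
    return softmax(h @ w, axis=-1)  # (T, U), each row sums to 1
```

For an array with 4 microphones, 50 frames, and 16-dimensional embeddings, `multi_channel_encoder(x)` with `x` of shape `(4, 50, 16)` returns a `(50, 16)` array, and `acoustic_unit_probs` maps it to one probability distribution over acoustic units per frame.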
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.