Generalized automatic speech recognition for joint acoustic echo cancellation, speech enhancement, and voice separation
US12400672B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 19, 2023 |
| Grant date | Aug 26, 2025 |
| Priority date | — |
| Expiry date | Jan 5, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2021/02082
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for training a generalized automatic speech recognition model for joint acoustic echo cancellation, speech enhancement, and voice separation includes receiving a plurality of training utterances paired with corresponding training contextual signals. The training contextual signals include a training contextual noise signal including noise prior to the corresponding training utterance, a training reference audio signal, and a training speaker vector including voice characteristics of a target speaker that spoke the corresponding training utterance. The method also includes training, using a contextual signal dropout strategy, a contextual frontend processing model on the training utterances to learn how to predict enhanced speech features. Here, the contextual signal dropout strategy uses a predetermined probability to drop out each of the training contextual signals during training of the contextual frontend processing model.
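The contextual signal dropout strategy described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each contextual signal is dropped independently, that a dropped signal is replaced by zeros of the same length, and that signals are plain lists of floats. The function name `apply_contextual_dropout`, the dictionary keys, and the example probability are hypothetical choices made for illustration; the abstract specifies only that a predetermined probability is used to drop out each training contextual signal.

```python
import random
from typing import Dict, List


def apply_contextual_dropout(
    signals: Dict[str, List[float]],
    drop_prob: float,
    rng: random.Random,
) -> Dict[str, List[float]]:
    """Return a copy of `signals` in which each entry is dropped with
    probability `drop_prob`.

    Assumptions (not stated in the abstract): each signal is dropped
    independently, and "dropping" means replacing it with zeros.
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    out: Dict[str, List[float]] = {}
    for name, values in signals.items():
        if rng.random() < drop_prob:
            # Dropped: keep the length so downstream shapes are unchanged.
            out[name] = [0.0] * len(values)
        else:
            # Kept: copy so the caller's data is never mutated.
            out[name] = list(values)
    return out


# Hypothetical contextual signals for one training utterance.
example_signals = {
    "contextual_noise": [0.1, -0.2, 0.05],   # noise preceding the utterance
    "reference_audio": [0.3, 0.3, -0.1],     # reference signal for echo cancellation
    "speaker_vector": [0.7, 0.1],            # voice characteristics of the target speaker
}

# The probability value here is illustrative only.
augmented = apply_contextual_dropout(example_signals, 0.5, random.Random(0))
```

Under these assumptions, a frontend model trained on `augmented` inputs sees every combination of present and absent contextual signals over the course of training, which is one plausible reading of how a single model could handle inference when some signals are unavailable.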
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.