ADL-UFE: all deep learning unified front-end system
US12094481B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 18, 2021 |
| Grant date | Sep 17, 2024 |
| Priority date | — |
| Expiry date | Nov 26, 2041 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG10L2021/02166
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
There is included a method and apparatus comprising computer code for generating enhanced target speech from audio data, performed by a computing device, the method comprising: receiving audio data corresponding to one or more speakers; generating estimated an target speech, an estimated noise, and an estimated echo simultaneously based on the audio data using a jointly trained complex ratio mask; predicting frame-level multi-tap time-frequency (T-F) spatio-temporal-echo filter weights based on the estimated target speech, the estimated noise, and the estimated echo using a trained neural network model; and predicting enhanced target speech based on the frame-level multi-tap T-F spatio-temporal-echo filter weights.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.