Multi-person speech separation method and apparatus using a generative adversarial network model
US11450337B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 17, 2020 |
| Grant date | Sep 20, 2022 |
| Priority date | — |
| Expiry date | Jan 15, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L25/51
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A multi-person speech separation method is provided for a terminal. The method includes extracting a hybrid speech feature from a hybrid speech signal requiring separation, N human voices being mixed in the hybrid speech signal, N being a positive integer greater than or equal to 2; extracting a masking coefficient of the hybrid speech feature by using a generative adversarial network (GAN) model, to obtain a masking matrix corresponding to the N human voices, wherein the GAN model comprises a generative network model and an adversarial network model; and performing a speech separation on the masking matrix corresponding to the N human voices and the hybrid speech signal by using the GAN model, and outputting N separated speech signals corresponding to the N human voices.
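The three steps in the abstract (feature extraction, mask estimation, mask application) can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the short-time spectrum used as the "hybrid speech feature", the frame sizes, and the softmax constraint that makes the masks sum to one are all assumptions made for illustration. In particular, `placeholder_generator` is a hypothetical stand-in that returns random masks where a trained generative network would go, and the adversarial (discriminator) network used during training is omitted entirely.

```python
import numpy as np

def extract_feature(mixture, frame_len=256, hop=128):
    """Assumed feature: complex short-time spectrum, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(mixture) - frame_len) // hop
    frames = np.stack([
        mixture[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1)

def placeholder_generator(magnitude, n_speakers, rng):
    """Hypothetical stand-in for the trained generative network.

    Returns one mask per speaker, shape (n_speakers, frames, bins).
    A softmax over the speaker axis makes the masks sum to 1 in every
    time-frequency cell (an assumption, not stated in the abstract).
    """
    logits = rng.standard_normal((n_speakers,) + magnitude.shape)
    exp = np.exp(logits - logits.max(axis=0, keepdims=True))
    return exp / exp.sum(axis=0, keepdims=True)

def separate(mixture, n_speakers=2, seed=0):
    """Apply the N masks to the mixture spectrum; returns N masked spectra."""
    rng = np.random.default_rng(seed)
    spectrum = extract_feature(mixture)
    masks = placeholder_generator(np.abs(spectrum), n_speakers, rng)
    # Broadcasting multiplies each speaker's mask with the shared spectrum.
    return masks * spectrum

# Usage: a synthetic 1024-sample "mixture" split into N = 3 masked spectra.
t = np.arange(1024) / 8000.0
mixture = np.sin(2 * np.pi * 220 * t) + np.sin(2 * np.pi * 440 * t)
separated = separate(mixture, n_speakers=3)
print(separated.shape)
```

The sketch stops at the masked spectra; turning each one back into a waveform would need an inverse transform with overlap-add, which is left out to keep the example short.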
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.