Using a predictive model to automatically enhance audio having various audio quality issues
US11514925B2 · kind B2 · utility
Assignees
Inventors
Key dates
| Filing date | Apr 30, 2020 |
| Grant date | Nov 29, 2022 |
| Priority date | — |
| Expiry date | Jan 1, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2021/02082
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Operations of a method include receiving a request to enhance a new source audio. Responsive to the request, the new source audio is input into a prediction model that was previously trained. Training the prediction model includes providing a generative adversarial network including the prediction model and a discriminator. Training data is obtained including tuples of source audios and target audios, each tuple including a source audio and a corresponding target audio. During training, the prediction model generates predicted audios based on the source audios. Training further includes applying a loss function to the predicted audios and the target audios, where the loss function incorporates a combination of a spectrogram loss and an adversarial loss. The prediction model is updated to optimize that loss function. After training, based on the new source audio, the prediction model generates a new predicted audio as an enhanced version of the new source audio.
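The combined loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption chosen for illustration: the log-magnitude L1 spectrogram distance, the non-saturating generator loss, the weighting factor `adv_weight`, the frame parameters, and all function names. The discriminator is stood in for by a single score in (0, 1].

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=8, hop=4):
    """Naive short-time DFT magnitudes (hypothetical helper).

    Returns one list of frame_size // 2 + 1 magnitude bins per frame.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        bins = []
        for k in range(frame_size // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def spectrogram_loss(predicted, target, eps=1e-6):
    """Mean absolute difference of log-magnitude spectrograms (assumed form)."""
    p = magnitude_spectrogram(predicted)
    t = magnitude_spectrogram(target)
    diffs = [abs(math.log(a + eps) - math.log(b + eps))
             for pf, tf in zip(p, t) for a, b in zip(pf, tf)]
    return sum(diffs) / len(diffs)


def adversarial_loss(discriminator_score, eps=1e-6):
    """Generator-side loss: small when the discriminator rates the
    predicted audio as real (score near 1). Assumed form."""
    return -math.log(max(discriminator_score, eps))


def combined_loss(predicted, target, discriminator_score, adv_weight=0.1):
    """Spectrogram loss plus a weighted adversarial loss (assumed weighting)."""
    return (spectrogram_loss(predicted, target)
            + adv_weight * adversarial_loss(discriminator_score))


if __name__ == "__main__":
    target = [math.sin(0.3 * n) for n in range(32)]
    predicted = [s + 0.5 for s in target]  # stand-in for a model output
    print(combined_loss(predicted, target, discriminator_score=0.4))
```

In an actual training loop the prediction model's parameters would be updated to reduce this value, while the discriminator is trained separately to tell predicted audio from target audio; a deep learning framework with automatic differentiation would replace the hand-written DFT.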
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.