Patent · US Active

Speech denoising via discrete representation learning

US11875809B2 · kind B2 · utility

1Cited by
1References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateOct 1, 2020
Grant dateJan 16, 2024
Priority date
Expiry dateJun 23, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N5/01
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Developed and presented herein are embodiments of a new end-to-end approach for audio denoising, from a synthesis perspective. Instead of explicitly modelling the noise component in the input signal, embodiments directly synthesize the denoised audio from a generative model (or vocoder), as in text-to-speech systems. In one or more embodiments, to generate the phonetic contents for the autoregressive generative model, it is learned via a variational autoencoder with discrete latent representations. Furthermore, in one or more embodiments, a new matching loss is presented for the denoising purpose, which is masked on when the corresponding latent codes differ. As compared against other method on test datasets, embodiments achieve competitive performance and can be trained from scratch.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.