Patent · US Active

Speech denoising via discrete representation learning

US11875809B2 · kind B2 · utility

Cited by	1

References	1

Claims	20

Family size	0

Assignee

BAIDU USA LLC · US

Inventors

Zhao Song · Princeton, US
Wei Ping · Sunnyvale, US

Key dates

Filing date	Oct 1, 2020
Grant date	Jan 16, 2024
Priority date	—
Expiry date	Jun 23, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N5/01
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Presented herein are embodiments of a new end-to-end approach to audio denoising from a synthesis perspective. Instead of explicitly modelling the noise component in the input signal, embodiments directly synthesize the denoised audio from a generative model (or vocoder), as in text-to-speech systems. In one or more embodiments, the phonetic content supplied to the autoregressive generative model is learned via a variational autoencoder with discrete latent representations. Furthermore, in one or more embodiments, a new matching loss is presented for denoising; it is masked so that it is active only when the corresponding latent codes differ. Compared with other methods on test datasets, embodiments achieve competitive performance and can be trained from scratch.
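
The masked matching loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the patented implementation: the abstract does not give the form of the loss, so the sketch assumes that the "corresponding latent codes" come from a clean and a noisy version of the same utterance, that codes are assigned by nearest-neighbour lookup in a shared codebook, and that the penalty is a squared distance between encoder outputs. The names `quantize`, `masked_matching_loss`, `z_noisy`, `z_clean`, and `codebook` are hypothetical.

```python
import numpy as np


def quantize(z, codebook):
    """Return, for each frame in z, the index of the nearest codebook vector.

    z has shape (T, D); codebook has shape (K, D).
    """
    # Squared Euclidean distance from every frame to every codebook entry: (T, K).
    dist = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)


def masked_matching_loss(z_noisy, z_clean, codebook):
    """Assumed form of the loss: penalize only frames whose discrete codes differ."""
    codes_noisy = quantize(z_noisy, codebook)
    codes_clean = quantize(z_clean, codebook)
    # Mask is 1 where the two code sequences disagree, 0 where they already match.
    mask = (codes_noisy != codes_clean).astype(float)
    # Assumption: squared distance between noisy and clean encoder outputs per frame.
    per_frame = ((z_noisy - z_clean) ** 2).sum(axis=-1)
    # Average over the active frames only; guard against an all-zero mask.
    loss = (mask * per_frame).sum() / max(mask.sum(), 1.0)
    return loss, mask


# Toy example: two frames, a two-entry codebook.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z_clean = np.array([[0.1, 0.0], [0.9, 1.0]])
z_noisy = np.array([[0.2, 0.1], [0.1, 0.2]])
loss, mask = masked_matching_loss(z_noisy, z_clean, codebook)
```

In the toy example the first frame maps to the same code in both versions and contributes nothing, while the second frame maps to different codes and is the only one penalized. In a trained model the encoder outputs would come from a neural network and the loss would be differentiated with an autodiff framework; plain NumPy is used here only to show the masking logic.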

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.