Patent · US Active

Techniques for disentangled variational speech representation learning for zero-shot voice conversion

US12354594B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 19, 2022
Grant date: Jul 8, 2025
Priority date
Expiry date: May 13, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L17/18
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for disentangled variational speech representation learning for voice conversion, performed by at least one processor, is provided. The method includes receiving input speech segments, encoding the input speech segments via a shared encoder to generate a speaker embedding and a content embedding, encoding the posterior distributions of the speaker embedding via a speaker encoder and encoding the posterior distributions of the content embedding via a content encoder to obtain encoded results, and decoding the encoded results by concatenating the encoded results to obtain a reconstructed speech output.
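The data flow in the abstract can be illustrated with a minimal sketch. This is one possible reading, not the patented implementation. The layer sizes, the random untrained weights, the time-averaging in the speaker branch, and all function names (`encode`, `decode`, `convert`) are assumptions made for illustration. The shared encoder feeds two branches that each output a diagonal Gaussian posterior. A sample from the speaker posterior is concatenated with a sample from each per-frame content posterior, and the result is decoded back to speech features.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def linear(in_dim, out_dim):
    """Random, untrained weight matrix standing in for a learned layer."""
    return [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def apply(weights, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def sample(mean, logvar):
    """Reparameterized draw z = mean + sigma * eps from a diagonal Gaussian."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, logvar)]

# Hypothetical dimensions: input features, shared hidden, speaker and content latents.
FEAT, HID, SPK, CON = 8, 6, 3, 4

W_shared = linear(FEAT, HID)     # shared encoder
W_spk = linear(HID, 2 * SPK)     # speaker encoder: mean and log-variance
W_con = linear(HID, 2 * CON)     # content encoder: mean and log-variance
W_dec = linear(SPK + CON, FEAT)  # decoder over the concatenated latents

def encode(frames):
    """Return one speaker posterior per segment and one content posterior per frame."""
    hidden = [apply(W_shared, f) for f in frames]
    # Assumption: speaker identity is segment-level, so average over time.
    pooled = [sum(col) / len(hidden) for col in zip(*hidden)]
    spk = apply(W_spk, pooled)
    spk_post = (spk[:SPK], spk[SPK:])
    con_posts = []
    for h in hidden:
        c = apply(W_con, h)
        con_posts.append((c[:CON], c[CON:]))
    return spk_post, con_posts

def decode(z_spk, z_con_frames):
    """Concatenate the speaker latent with each content latent, then decode."""
    return [apply(W_dec, z_spk + z_c) for z_c in z_con_frames]

def convert(source_frames, target_frames):
    """Take content from the source segment and speaker from the target segment."""
    _, con_posts = encode(source_frames)
    spk_post, _ = encode(target_frames)
    z_spk = sample(*spk_post)
    z_con = [sample(mean, logvar) for mean, logvar in con_posts]
    return decode(z_spk, z_con)

# Toy inputs: 5 source frames and 7 target frames of random features.
source = [[rng.gauss(0.0, 1.0) for _ in range(FEAT)] for _ in range(5)]
target = [[rng.gauss(0.0, 1.0) for _ in range(FEAT)] for _ in range(7)]

reconstructed = convert(source, source)  # same speaker: plain reconstruction path
converted = convert(source, target)      # different speaker: conversion path
```

Passing the same segment as both source and target follows the reconstruction path described in the abstract. Passing a different target segment swaps in that speaker's latent while keeping the source content. Under this reading, that swap is what would allow conversion to a speaker not seen in training.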

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.