Patent · US Active

Techniques for disentangled variational speech representation learning for zero-shot voice conversion

US12354594B2 · kind B2 · utility

Cited by	0
References	0
Claims	20
Family size	0

Assignee

TENCENT AMERICA LLC · US

Inventors

Chunlei Zhang · Shanghai, CN
Jiachen Lian · Palo Alto, US
Dong Yu · Bellevue, US

Key dates

Filing date	Apr 19, 2022
Grant date	Jul 8, 2025
Priority date	—
Expiry date	May 13, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L17/18
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for disentangled variational speech representation learning for voice conversion, performed by at least one processor, is provided. The method includes receiving input speech segments, encoding the input speech segments via a shared encoder to generate a speaker embedding and a content embedding, encoding the posterior distributions of the speaker embedding via a speaker encoder and encoding the posterior distributions of the content embedding via a content encoder to obtain encoded results, and decoding the encoded results by concatenating the encoded results to obtain a reconstructed speech output.
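
The data flow described in the abstract can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the patented implementation: the dimensions, the untrained random linear layers, the diagonal-Gaussian posteriors, the time-averaging used for the speaker branch, and every function name (`encode`, `decode`, `convert`) are hypothetical choices that do not come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not specify any dimensions.
N_MELS, HIDDEN, SPK_DIM, CONTENT_DIM = 80, 64, 16, 8


def linear(in_dim, out_dim):
    """Return a randomly initialised (untrained) affine map as a closure."""
    w = rng.normal(0.0, 0.1, size=(in_dim, out_dim))
    b = np.zeros(out_dim)
    return lambda x: x @ w + b


shared_encoder = linear(N_MELS, HIDDEN)
speaker_head = linear(HIDDEN, 2 * SPK_DIM)      # outputs mean and log-variance
content_head = linear(HIDDEN, 2 * CONTENT_DIM)  # outputs mean and log-variance
decoder = linear(SPK_DIM + CONTENT_DIM, N_MELS)


def sample(params):
    """Reparameterised draw z = mu + sigma * eps from a diagonal Gaussian."""
    mu, logvar = np.split(params, 2, axis=-1)
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)


def encode(frames):
    """Map (T, N_MELS) frames to a speaker latent and per-frame content latents."""
    h = np.tanh(shared_encoder(frames))            # shared features, (T, HIDDEN)
    # Assumption: one speaker latent per utterance, from time-averaged features.
    z_spk = sample(speaker_head(h.mean(axis=0)))   # (SPK_DIM,)
    z_con = sample(content_head(h))                # (T, CONTENT_DIM)
    return z_spk, z_con


def decode(z_spk, z_con):
    """Concatenate the speaker latent onto every content frame, then decode."""
    spk = np.broadcast_to(z_spk, (z_con.shape[0], z_spk.shape[0]))
    return decoder(np.concatenate([spk, z_con], axis=-1))  # (T, N_MELS)


def convert(source_frames, target_frames):
    """Voice conversion sketch: content from the source, speaker from the target."""
    z_spk_target, _ = encode(target_frames)
    _, z_con_source = encode(source_frames)
    return decode(z_spk_target, z_con_source)
```

Calling `decode(*encode(frames))` corresponds to the reconstruction path in the abstract, while `convert(source, target)` swaps in a speaker latent taken from a different utterance, which is one plausible reading of the zero-shot conversion named in the title. A trained system would also need a loss (for example reconstruction plus KL terms), which the abstract does not detail and which is omitted here.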

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.