Patent · US Active

Multi-speaker neural text-to-speech synthesis

US12266342B2 · kind B2 · utility

Cited by	0
References	2
Claims	19
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Yan Deng · Beijing, CN
Lei He · Moraga, US

Key dates

Filing date	Dec 11, 2018
Grant date	Apr 1, 2025
Priority date	—
Expiry date	Sep 24, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/047
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for generating speech through multi-speaker neural text-to-speech (TTS) synthesis is provided. A text input may be received (1410). Speaker latent space information of a target speaker may be provided through at least one speaker model (1420). At least one acoustic feature may be predicted through an acoustic feature predictor based on the text input and the speaker latent space information (1430). A speech waveform corresponding to the text input may be generated through a neural vocoder based on the at least one acoustic feature and the speaker latent space information (1440).
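
The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names (`SpeakerModel`, `predict_acoustic_features`, `vocode`, `synthesize`) are hypothetical, and the arithmetic inside each stage is a toy placeholder for what would be trained neural networks. The only thing it is meant to show is the wiring: the speaker latent information is passed both to the acoustic feature predictor and to the vocoder.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpeakerModel:
    """Hypothetical speaker model: a lookup from speaker ID to a latent vector."""
    embeddings: Dict[str, List[float]]

    def latent(self, speaker_id: str) -> List[float]:
        return self.embeddings[speaker_id]


def predict_acoustic_features(text: str, speaker_latent: List[float]) -> List[List[float]]:
    """Toy stand-in for the acoustic feature predictor (step 1430).

    Emits one single-value "frame" per character, shifted by the speaker latent.
    A real predictor would output e.g. spectrogram frames from a neural network.
    """
    bias = sum(speaker_latent) / len(speaker_latent)
    return [[(ord(ch) % 32) / 32.0 + bias] for ch in text]


def vocode(features: List[List[float]], speaker_latent: List[float],
           samples_per_frame: int = 4) -> List[float]:
    """Toy stand-in for the neural vocoder (step 1440).

    Conditioned on both the acoustic features and the speaker latent,
    mirroring the abstract. Here the latent only scales the amplitude.
    """
    gain = 1.0 + abs(speaker_latent[0])
    waveform = []
    for frame in features:
        for n in range(samples_per_frame):
            phase = 2 * math.pi * frame[0] * n / samples_per_frame
            waveform.append(gain * math.sin(phase))
    return waveform


def synthesize(text: str, speaker_id: str, speaker_model: SpeakerModel) -> List[float]:
    """Run the pipeline: text input (1410) -> latent (1420) -> features (1430) -> waveform (1440)."""
    latent = speaker_model.latent(speaker_id)
    features = predict_acoustic_features(text, latent)
    return vocode(features, latent)


if __name__ == "__main__":
    # Two made-up speakers with made-up latent vectors.
    model = SpeakerModel({"speaker_a": [0.1, 0.2], "speaker_b": [0.5, -0.3]})
    wav_a = synthesize("hi", "speaker_a", model)
    wav_b = synthesize("hi", "speaker_b", model)
    print(len(wav_a), wav_a != wav_b)
```

In this sketch the same text yields a different waveform for each speaker because the latent vector conditions both stages, which is the dependency the abstract describes for steps 1430 and 1440.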

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.