Patent · US Active

Multi-speaker neural text-to-speech synthesis

US12266342B2 · kind B2 · utility

0Cited by
2References
19Claims
0Family size

Assignee

Inventors

Key dates

Filing dateDec 11, 2018
Grant dateApr 1, 2025
Priority date
Expiry dateSep 24, 2039

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L13/047
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method for generating speech through multi-speaker neural text-to-speech (TTS) synthesis is provided. A text input may be received (1410). Speaker latent space information of a target speaker may be provided through at least one speaker model (1420). At least one acoustic feature may be predicted through an acoustic feature predictor based on the text input and the speaker latent space information (1430). A speech waveform corresponding to the text input may be generated through a neural vocoder based on the at least one acoustic feature and the speaker latent space information (1440).

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.