Patent · US Active

Text-to-speech (TTS) processing with transfer of vocal characteristics

US11410684B1 · kind B1 · utility

Cited by	18
References	6
Claims	20
Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Viacheslav Klimkov · Gdańsk, PL
Thomas Renaud Drugman · Carnières, BE
Alexander Galkin · East Brunswick Township, US
Srikanth Ronanki · Bellevue, US

Key dates

Filing date	Jun 4, 2019
Grant date	Aug 9, 2022
Priority date	—
Expiry date	Nov 26, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/26
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Audio data from a first, source speaker is received and processed to determine linguistic units and vocal characteristics corresponding to those linguistic units. The linguistic units may either be determined from received text data or may be determined from the audio data using automatic speech recognition. A model is trained using training data from a second, target speaker. The trained model concatenates the linguistic units with the vocal characteristics to produce output speech that has the “voice” of the target speaker and the vocal characteristics of the source speaker.
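
The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the phoneme inventory, the choice of pitch, energy, and duration as "vocal characteristics", and the names `VocalCharacteristics`, `one_hot`, and `build_model_input` are all assumptions made for illustration. The sketch only shows how per-unit features from the source speaker might be joined to an encoding of each linguistic unit before being passed to a model trained on the target speaker; the model itself is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical phoneme inventory (assumption); a real system would use a far larger one.
PHONEMES = ["sil", "HH", "AH", "L", "OW"]


@dataclass
class VocalCharacteristics:
    """Per-unit features measured on the SOURCE speaker's audio (assumed feature set)."""
    pitch_hz: float
    energy: float
    duration_s: float


def one_hot(unit: str, inventory: Sequence[str] = PHONEMES) -> List[float]:
    """Encode one linguistic unit as a one-hot vector over the inventory."""
    vec = [0.0] * len(inventory)
    vec[inventory.index(unit)] = 1.0  # raises ValueError for an unknown unit
    return vec


def build_model_input(
    units: Sequence[str],
    characteristics: Sequence[VocalCharacteristics],
) -> List[List[float]]:
    """Concatenate each unit's encoding with the source speaker's features for that unit."""
    if len(units) != len(characteristics):
        raise ValueError("need exactly one set of vocal characteristics per linguistic unit")
    rows = []
    for unit, vc in zip(units, characteristics):
        rows.append(one_hot(unit) + [vc.pitch_hz, vc.energy, vc.duration_s])
    return rows


# Example: linguistic units taken from text or ASR output, aligned to source-speaker features.
units = ["HH", "AH", "L", "OW"]
source_features = [
    VocalCharacteristics(pitch_hz=120.0, energy=0.4, duration_s=0.05),
    VocalCharacteristics(pitch_hz=135.0, energy=0.7, duration_s=0.09),
    VocalCharacteristics(pitch_hz=128.0, energy=0.6, duration_s=0.07),
    VocalCharacteristics(pitch_hz=110.0, energy=0.8, duration_s=0.15),
]
model_input = build_model_input(units, source_features)
```

In this sketch each row of `model_input` would be consumed by a model trained on the target speaker's data, so the output keeps the target's voice while following the source speaker's pitch, energy, and timing.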

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.