Patent · US Active

Text-to-speech (TTS) processing

US11410639B2 · kind B2 · utility

Cited by	3
References	1
Claims	17
Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Jaime Lorenzo Trueba · Cambridge, GB
Thomas Renaud Drugman · Carnières, BE
Viacheslav Klimkov · Gdańsk, PL
Srikanth Ronanki · Bellevue, US
Thomas Edward Merritt · Cambridge, GB
Andrew Paul Breen · Norwich, GB
Roberto Barra-Chicote · Cambridge, GB

Key dates

Filing date	Jul 7, 2020
Grant date	Aug 9, 2022
Priority date	—
Expiry date	Jul 30, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/08
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.
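
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (embed, encode_level, estimate_spectrogram), the mean-pooling encoder, the random untrained weights, and the example phoneme, syllable, and word sequences are all assumptions made for illustration. The sketch only shows the shape of the idea: each feature level is encoded into its own context vector, and a spectrogram estimator consumes the separate vectors together.

```python
import random


def embed(token, dim):
    # Hypothetical stand-in for a learned embedding: a fixed pseudo-random
    # vector derived deterministically from the token string.
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def encode_level(tokens, dim):
    # Toy "encoder": mean-pool the token embeddings into one context vector.
    vectors = [embed(tok, dim) for tok in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def estimate_spectrogram(context_vectors, n_frames, n_bins):
    # Toy "spectrogram estimator": concatenate the separate context vectors,
    # then project them to non-negative magnitudes per frequency bin.
    combined = [x for vec in context_vectors for x in vec]
    rng = random.Random(0)  # fixed seed so the untrained weights are repeatable
    weights = [[rng.uniform(-1.0, 1.0) for _ in combined] for _ in range(n_bins)]
    base = [abs(sum(w * x for w, x in zip(row, combined))) for row in weights]
    # Scale by frame position so frames differ; a trained decoder would
    # instead predict each frame from the context (assumption, not from the source).
    return [[b * (t + 1) / n_frames for b in base] for t in range(n_frames)]


# Assumed example input for the text "hello world" at three feature levels.
phonemes = ["HH", "AH", "L", "OW", "W", "ER", "L", "D"]
syllables = ["hel", "lo", "world"]
words = ["hello", "world"]

DIM = 4
contexts = [encode_level(level, DIM) for level in (phonemes, syllables, words)]
spectrogram = estimate_spectrogram(contexts, n_frames=5, n_bins=8)
print(len(spectrogram), "frames x", len(spectrogram[0]), "bins")
```

In the arrangement the abstract describes, the estimated spectrogram would then condition the speech model that generates the output audio; that final step is omitted here.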

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.