Patent · US Active

Text-to-speech (TTS) processing

US10741169B1 · kind B1 · utility

Cited by	14
References	1
Claims	20
Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Jaime Lorenzo Trueba · Cambridge, GB
Thomas Renaud Drugman · Carnières, BE
Viacheslav Klimkov · Gdańsk, PL
Srikanth Ronanki · Bellevue, US
Thomas Edward Merritt · Cambridge, GB
Andrew Paul Breen · Norwich, GB
Roberto Barra-Chicote · Cambridge, GB

Key dates

Filing date	Sep 25, 2018
Grant date	Aug 11, 2020
Priority date	—
Expiry date	Feb 5, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/08
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.
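The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the random untrained weights, the feature dimensions, the mean-pooling encoders, the frame-position input, and the names `make_encoder` and `estimate_spectrogram` are all assumptions chosen for illustration. It only shows the shape of the idea: phoneme-, syllable-, and word-level features are encoded separately into context vectors, and a spectrogram estimator combines those vectors to produce spectrogram frames. A real system would presumably use trained neural networks for both stages, and the resulting spectrogram would then condition a separate speech model.

```python
import numpy as np

rng = np.random.default_rng(0)

CTX_DIM = 16   # size of each context vector (assumed)
N_BINS = 80    # frequency bins per spectrogram frame (assumed)


def make_encoder(in_dim, ctx_dim):
    """Build a toy encoder: one linear layer + tanh, mean-pooled over segments."""
    weights = rng.standard_normal((in_dim, ctx_dim)) * 0.1

    def encode(features):
        # features: (num_segments, in_dim) -> context vector: (ctx_dim,)
        return np.tanh(features @ weights).mean(axis=0)

    return encode


# One separate encoder per segment level; input sizes are hypothetical.
encode_phonemes = make_encoder(in_dim=40, ctx_dim=CTX_DIM)
encode_syllables = make_encoder(in_dim=12, ctx_dim=CTX_DIM)
encode_words = make_encoder(in_dim=30, ctx_dim=CTX_DIM)

# Toy estimator weights: three context vectors plus one frame-position scalar.
W_out = rng.standard_normal((3 * CTX_DIM + 1, N_BINS)) * 0.1


def estimate_spectrogram(phonemes, syllables, words, n_frames):
    """Combine the separate context vectors into an (n_frames, N_BINS) array."""
    context = np.concatenate([
        encode_phonemes(phonemes),
        encode_syllables(syllables),
        encode_words(words),
    ])
    # Give every frame the same context plus its relative position in time.
    positions = np.linspace(0.0, 1.0, n_frames)[:, None]
    decoder_in = np.concatenate(
        [np.tile(context, (n_frames, 1)), positions], axis=1
    )
    return decoder_in @ W_out


# Usage with random placeholder features for a short utterance.
spec = estimate_spectrogram(
    phonemes=rng.standard_normal((9, 40)),
    syllables=rng.standard_normal((4, 12)),
    words=rng.standard_normal((2, 30)),
    n_frames=50,
)
print(spec.shape)  # (50, 80)
```

Keeping one encoder per segment level mirrors the abstract's point that the features are "separately encoded"; the concatenation step is just one simple way to let the estimator see all of the context vectors at once.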

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.