Patent · US Active

Duration informed attention network for text-to-speech analysis

US11468879B2 · kind B2 · utility

Cited by	0
References	5
Claims	20
Family size	0

Assignee

TENCENT AMERICA LLC · US

Inventors

Chengzhu Yu · Bellevue, US
Heng Lu · Sammamish, US
Dong Yu · Bellevue, US

Key dates

Filing date	Apr 29, 2019
Grant date	Oct 11, 2022
Priority date	—
Expiry date	Jun 10, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2013/105
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method and apparatus include receiving a text input that includes a sequence of text components. Respective temporal durations of the text components are determined using a duration model. A first set of spectra is generated based on the sequence of text components. A second set of spectra is generated based on the first set of spectra and the respective temporal durations of the sequence of text components. A spectrogram frame is generated based on the second set of spectra. An audio waveform is generated based on the spectrogram frame. The audio waveform is provided as an output.
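The step in which the second set of spectra is produced from the first set and the per-component durations can be illustrated with a toy sketch. This is not the patented implementation: the helper names (`toy_duration_model`, `toy_first_spectra`, `expand_by_duration`), the fixed vowel/consonant frame counts, and the simple frame-repetition rule are all assumptions made for illustration only. The later stages (spectrogram frame and waveform generation) are omitted.

```python
from typing import List, Sequence

VOWELS = {"a", "e", "i", "o", "u"}


def toy_duration_model(components: Sequence[str]) -> List[int]:
    """Hypothetical stand-in for a duration model.

    Assigns 3 frames to vowels and 2 frames to everything else.
    A real duration model would be learned from data.
    """
    return [3 if c.lower() in VOWELS else 2 for c in components]


def toy_first_spectra(components: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for the first set of spectra.

    Produces one small feature vector per text component.
    """
    return [[float(ord(c[0]) % 7), float(len(c))] for c in components]


def expand_by_duration(
    spectra: Sequence[Sequence[float]], durations: Sequence[int]
) -> List[List[float]]:
    """Repeat each per-component vector for its number of frames.

    This is one assumed way to combine the first set of spectra with
    the temporal durations; the output has sum(durations) frames.
    """
    if len(spectra) != len(durations):
        raise ValueError("need exactly one duration per text component")
    expanded: List[List[float]] = []
    for vector, n_frames in zip(spectra, durations):
        if n_frames < 0:
            raise ValueError("durations must be non-negative")
        expanded.extend(list(vector) for _ in range(n_frames))
    return expanded


# Usage: three text components, here single characters.
components = list("cat")
durations = toy_duration_model(components)         # [2, 3, 2]
first = toy_first_spectra(components)              # one vector per component
second = expand_by_duration(first, durations)      # 7 frames in total
```

In this sketch the output length is fixed by the durations rather than discovered during decoding, so every text component contributes at least the number of frames the duration model assigns to it.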

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.