Patent · US Active

Duration informed attention network for text-to-speech analysis

US11468879B2 · kind B2 · utility

Cited by: 0
References: 5
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 29, 2019
Grant date: Oct 11, 2022
Priority date:
Expiry date: Jun 10, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2013/105
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and apparatus include receiving a text input that includes a sequence of text components. Respective temporal durations of the text components are determined using a duration model. A first set of spectra is generated based on the sequence of text components. A second set of spectra is generated based on the first set of spectra and the respective temporal durations of the sequence of text components. A spectrogram frame is generated based on the second set of spectra. An audio waveform is generated based on the spectrogram frame. The audio waveform is provided as an output.
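
The data flow described in the abstract can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the rule-based duration model, the character-hash "spectra", the averaging step, and the sinusoidal synthesizer are hypothetical stand-ins for the trained components a real system would use. The sketch only shows the order of the stages: text components, per-component durations, a first set of spectra, a duration-expanded second set, spectrogram frames, and a waveform.

```python
import math

# All constants and functions here are hypothetical toy stand-ins.
N_BINS = 4             # spectral bins per vector (assumed)
SAMPLES_PER_FRAME = 8  # waveform samples produced per frame (assumed)

def predict_durations(components):
    """Toy duration model: vowels last 3 frames, everything else 2."""
    return [3 if c in "aeiou" else 2 for c in components]

def encode_spectra(components):
    """First set of spectra: one vector per text component."""
    return [[((ord(c) + k) % 7) / 7.0 for k in range(N_BINS)]
            for c in components]

def expand_by_duration(spectra, durations):
    """Second set of spectra: repeat each vector once per frame of its duration."""
    expanded = []
    for vec, d in zip(spectra, durations):
        expanded.extend(list(vec) for _ in range(d))
    return expanded

def smooth_frames(expanded):
    """Spectrogram frames: average each vector with its predecessor."""
    frames = []
    prev = expanded[0] if expanded else None
    for vec in expanded:
        frames.append([(a + b) / 2 for a, b in zip(prev, vec)])
        prev = vec
    return frames

def synthesize(frames):
    """Toy waveform: per-frame sum of sinusoids weighted by bin magnitude."""
    samples = []
    for i, frame in enumerate(frames):
        for n in range(SAMPLES_PER_FRAME):
            t = i * SAMPLES_PER_FRAME + n
            s = sum(m * math.sin(2 * math.pi * (k + 1) * t / 32)
                    for k, m in enumerate(frame))
            samples.append(s / N_BINS)
    return samples

def text_to_waveform(text):
    components = list(text)                        # sequence of text components
    durations = predict_durations(components)      # temporal durations
    first = encode_spectra(components)             # first set of spectra
    second = expand_by_duration(first, durations)  # second set of spectra
    frames = smooth_frames(second)                 # spectrogram frames
    return durations, frames, synthesize(frames)   # audio waveform

durations, frames, waveform = text_to_waveform("data")
print(durations, len(frames), len(waveform))
```

In this sketch the duration step fixes how many frames each text component occupies before any waveform is generated, so the number of frames always equals the sum of the predicted durations.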

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.