Patent · US Active

Unsupervised alignment for text to speech synthesis using neural networks

US11869483B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 7, 2021
Grant date: Jan 9, 2024
Priority date:
Expiry date: Oct 7, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2013/105
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
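
The idea of sampling phoneme durations at inference can be illustrated with a minimal sketch. It is not the patented method: the log-normal distribution, the function names `sample_durations` and `expand_to_frames`, and the numeric parameters are all assumptions chosen for illustration, standing in for whatever learned generative distribution the claims describe. The sketch only shows how drawing a duration per phoneme, then repeating each phoneme for that many frames, yields a different rhythm on each draw.

```python
import math
import random


def sample_durations(phonemes, mean_log_frames, std_log_frames, rng):
    """Draw one integer frame count (at least 1) per phoneme.

    Assumption: a log-normal stands in for the learned duration distribution.
    """
    durations = []
    for _ in phonemes:
        frames = math.exp(rng.gauss(mean_log_frames, std_log_frames))
        durations.append(max(1, round(frames)))
    return durations


def expand_to_frames(phonemes, durations):
    """Repeat each phoneme for its sampled number of frames."""
    frames = []
    for phoneme, count in zip(phonemes, durations):
        frames.extend([phoneme] * count)
    return frames


# Hypothetical phoneme sequence for the word "hello".
phonemes = ["HH", "AH", "L", "OW"]
rng = random.Random(0)  # seeded so the draws are reproducible

# Two independent draws from the same distribution give two candidate rhythms.
take_a = sample_durations(phonemes, 1.5, 0.4, rng)
take_b = sample_durations(phonemes, 1.5, 0.4, rng)
frames_a = expand_to_frames(phonemes, take_a)
```

In the same spirit, per-phoneme pitch or energy values could be drawn from their own distributions and attached to the expanded frames, which is one way to read the abstract's remark about improved diversity.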

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.