Unsupervised alignment for text to speech synthesis using neural networks
US11769481B2 · kind B2 · utility
Cited by: 0
References: 1
Claims: 18
Family size: 0
Assignee
Inventors
Key dates
| Filing date | Oct 7, 2021 |
| Grant date | Sep 26, 2023 |
| Priority date | — |
| Expiry date | Oct 7, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2013/105
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
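As a rough illustration of the idea in the abstract (sampling a duration for each phoneme at inference rather than fixing it in advance), the following is a minimal sketch and not the patented method. The log-normal distribution, its parameters, the example phoneme symbols, and the function names `sample_durations` and `expand_by_duration` are all assumptions made for this example; the abstract does not specify the form of the generative distribution.

```python
import math
import random


def sample_durations(phonemes, mean_log_frames, std_log_frames, rng):
    """Draw one integer frame count per phoneme.

    Assumption: a log-normal stand-in for the learned duration
    distribution, with the same parameters for every phoneme.
    """
    durations = []
    for _ in phonemes:
        frames = math.exp(rng.gauss(mean_log_frames, std_log_frames))
        durations.append(max(1, round(frames)))  # at least one frame each
    return durations


def expand_by_duration(phonemes, durations):
    """Repeat each phoneme for its sampled number of frames."""
    frame_sequence = []
    for phoneme, duration in zip(phonemes, durations):
        frame_sequence.extend([phoneme] * duration)
    return frame_sequence


# Hypothetical input: phoneme symbols chosen only for illustration.
phonemes = ["HH", "AH", "L", "OW"]
rng = random.Random(0)  # seeded so a given run is reproducible
durations = sample_durations(phonemes, 1.5, 0.3, rng)
frames = expand_by_duration(phonemes, durations)
```

Drawing again with a different seed would generally give a different rhythm for the same text, which is the kind of diversity the abstract refers to. Pitch or energy could be sampled per phoneme in the same way under similar assumptions.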
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.