Text-to-speech (TTS) processing
US10706837B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 13, 2018 |
| Grant date | Jul 7, 2020 |
| Priority date | — |
| Expiry date | Dec 25, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A speech model includes a sub-model corresponding to a vocal attribute. The speech model generates an output waveform using a sample model, which receives text data, and a conditioning model, which receives text metadata and produces a prosody output for use by the sample model. If, during training or runtime, a different vocal attribute is desired or needed, the sub-model is re-trained or switched to a different sub-model corresponding to the different vocal attribute.
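The arrangement in the abstract can be pictured with a small Python sketch. This is an illustration only, not the patented implementation: every class and method name is hypothetical, the "models" are toy arithmetic instead of trained networks, and placing the sub-model inside the conditioning path is an assumption the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubModel:
    """Hypothetical stand-in for a sub-model tied to one vocal attribute."""
    attribute: str
    gain: float  # toy parameter in place of learned weights

    def apply(self, features: List[float]) -> List[float]:
        return [f * self.gain for f in features]


class ConditioningModel:
    """Receives text metadata and produces a prosody output."""

    def __init__(self, sub_model: SubModel) -> None:
        # Assumption: the sub-model sits in the conditioning path.
        self.sub_model = sub_model

    def prosody(self, metadata: List[float]) -> List[float]:
        if not metadata:
            raise ValueError("metadata must not be empty")
        return self.sub_model.apply(metadata)


class SampleModel:
    """Receives text data plus the prosody output and emits waveform samples."""

    def generate(self, text: str, prosody: List[float]) -> List[float]:
        # Toy rule: one sample per character, scaled by a prosody value.
        return [
            (ord(ch) % 32) / 32.0 * prosody[i % len(prosody)]
            for i, ch in enumerate(text)
        ]


class SpeechModel:
    """Combines the sample model and conditioning model around a sub-model."""

    def __init__(self, sub_models: Dict[str, SubModel], active: str) -> None:
        self.sub_models = sub_models
        self.conditioning = ConditioningModel(sub_models[active])
        self.sample = SampleModel()

    def switch_attribute(self, attribute: str) -> None:
        # Swap only the sub-model; the rest of the speech model is untouched.
        self.conditioning.sub_model = self.sub_models[attribute]

    def synthesize(self, text: str, metadata: List[float]) -> List[float]:
        return self.sample.generate(text, self.conditioning.prosody(metadata))


model = SpeechModel(
    {"neutral": SubModel("neutral", 1.0), "whisper": SubModel("whisper", 0.5)},
    active="neutral",
)
neutral_wave = model.synthesize("hello", [1.0, 0.8])
model.switch_attribute("whisper")
whisper_wave = model.synthesize("hello", [1.0, 0.8])
```

Switching the vocal attribute here replaces a single component while the sample and conditioning models stay in place, which mirrors the "switched to a different sub-model" path in the abstract. The re-training path is not modeled.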
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.