Text-to-speech with emotional content
US9824681B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 11, 2014 |
| Grant date | Nov 21, 2017 |
| Priority date | — |
| Expiry date | Jan 5, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/033
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques for converting text to speech having emotional content. In an aspect, an emotionally neutral acoustic trajectory is predicted for a script using a neutral model, and an emotion-specific acoustic trajectory adjustment is independently predicted using an emotion-specific model. The neutral trajectory and emotion-specific adjustments are combined to generate a transformed speech output having emotional content. In another aspect, state parameters of a statistical parametric model for neutral voice are transformed by emotion-specific factors that vary across contexts and states. The emotion-dependent adjustment factors may be clustered and stored using an emotion-specific decision tree or other clustering scheme distinct from a decision tree used for the neutral voice model.
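As a rough illustration of the two aspects in the abstract, the following is a minimal Python sketch, not the patented implementation. It assumes the combination of neutral trajectory and emotion-specific adjustment is additive (the abstract only says "combined") and that the state-parameter transformation is a per-dimension affine factor. The names `combine_trajectories`, `transform_state_mean`, and `HAPPY_FACTORS`, the cluster label, and all numeric values are hypothetical and chosen only for the example.

```python
from typing import Dict, List, Tuple

Frame = List[float]        # one acoustic feature vector (e.g. F0 and one spectral term)
Trajectory = List[Frame]   # one frame per time step


def combine_trajectories(neutral: Trajectory, adjustment: Trajectory) -> Trajectory:
    """Add a per-frame emotion-specific adjustment to a neutral trajectory.

    Assumption: additive combination; the abstract does not fix the operator.
    """
    if len(neutral) != len(adjustment):
        raise ValueError("trajectories must have the same number of frames")
    return [
        [n + a for n, a in zip(n_frame, a_frame)]
        for n_frame, a_frame in zip(neutral, adjustment)
    ]


def transform_state_mean(mean: Frame, scale: Frame, shift: Frame) -> Frame:
    """Apply a hypothetical per-dimension affine factor to a neutral state mean."""
    return [s * m + b for m, s, b in zip(mean, scale, shift)]


# Hypothetical emotion-specific factors keyed by (context cluster, state index).
# A plain dict stands in for the emotion-specific decision tree, which the
# abstract describes as distinct from the tree used for the neutral voice model.
HAPPY_FACTORS: Dict[Tuple[str, int], Tuple[Frame, Frame]] = {
    ("stressed_vowel", 0): ([1.25, 1.0], [10.0, 0.0]),
    ("stressed_vowel", 1): ([1.5, 1.0], [5.0, 0.0]),
}

# Aspect 1: neutral trajectory plus an independently predicted adjustment.
neutral = [[100.0, 0.5], [110.0, 0.75]]
adjustment = [[20.0, 0.0], [15.0, -0.25]]
emotional = combine_trajectories(neutral, adjustment)

# Aspect 2: transform a neutral state mean with a context- and state-dependent factor.
scale, shift = HAPPY_FACTORS[("stressed_vowel", 0)]
happy_mean = transform_state_mean([100.0, 0.5], scale, shift)
```

Keeping the factor table separate from the neutral model mirrors the abstract's point that the emotion-dependent adjustments are clustered independently, so the neutral voice can be reused across emotions.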
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.