Voice font speaker and prosody interpolation
US9472182B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 26, 2014 |
| Grant date | Oct 18, 2016 |
| Priority date | — |
| Expiry date | May 26, 2034 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/08
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Multi-voice font interpolation is provided. A multi-voice font interpolation engine allows the production of computer-generated speech with a wide variety of speaker characteristics and/or prosody by interpolating speaker characteristics and prosody from existing fonts. Using prediction models from multiple voice fonts, the multi-voice font interpolation engine predicts values for the parameters that influence speaker characteristics and/or prosody for the phoneme sequence obtained from the text to be spoken. For each parameter, additional parameter values are generated by a weighted interpolation from the predicted values. Modifying an existing voice font with the interpolated parameters changes the style and/or emotion of the speech while retaining the base sound qualities of the original voice. The multi-voice font interpolation engine allows the speaker characteristics and/or prosody to be transplanted from one voice font to another, or entirely new speaker characteristics and/or prosody to be generated for an existing voice font.
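The weighted interpolation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so this assumes a simple linear combination with weights normalized to sum to one; the function name `interpolate_parameters` and the sample pitch values are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence


def interpolate_parameters(
    predictions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Blend per-phoneme parameter values predicted by several voice fonts.

    predictions[k][i] is the value that voice font k predicts for phoneme i
    (for example a pitch target in Hz). weights[k] is the contribution of
    voice font k. Assumption: a linear weighted average; the patent abstract
    only says "weighted interpolation" without specifying the form.
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per voice font")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all fonts must predict the same phoneme sequence")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Normalize so the result stays within the range spanned by the fonts
    # when all weights are non-negative.
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(length)
    ]


# Hypothetical pitch predictions (Hz) from two fonts for a 3-phoneme sequence.
font_a_pitch = [100.0, 120.0, 110.0]
font_b_pitch = [200.0, 180.0, 210.0]
blended = interpolate_parameters([font_a_pitch, font_b_pitch], [0.75, 0.25])
print(blended)  # [125.0, 135.0, 135.0]
```

Under this assumption, a weight vector of `[1.0, 0.0]` reproduces the first font's values unchanged, and intermediate weights move the parameter track toward the second font, which matches the abstract's description of transplanting or blending prosody between fonts.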
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.