Statistical enhancement of speech output from a statistical text-to-speech synthesis system
US8682670B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 7, 2011 |
| Grant date | Mar 25, 2014 |
| Priority date | — |
| Expiry date | Jan 26, 2032 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/06
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, system and computer program product are provided for enhancement of speech synthesized by a statistical text-to-speech (TTS) system employing a parametric representation of speech in a space of acoustic feature vectors. The method includes: defining a parametric family of corrective transformations operating in the space of the acoustic feature vectors and dependent on a set of enhancing parameters; and defining a distortion indicator of a feature vector or a plurality of feature vectors. The method further includes: receiving a feature vector output by the system; and generating an instance of the corrective transformation by: calculating a reference value of the distortion indicator attributed to a statistical model of the phonetic unit emitting the feature vector; calculating an actual value of the distortion indicator attributed to feature vectors emitted by the statistical model of the phonetic unit emitting the feature vector; calculating the enhancing parameter values depending on the reference value of the distortion indicator, the actual value of the distortion indicator and the parametric corrective transformation; and deriving an instance of the corrective trans…
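The flow the abstract describes (reference value, actual value, enhancing parameter, transformation instance) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the distortion indicator is chosen as the mean squared deviation from the model mean, the corrective family is a single-parameter scaling around that mean, and the names `distortion_indicator`, `corrective_transform`, `enhancing_parameter` and `enhance` are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence

Vector = List[float]


def distortion_indicator(vectors: Sequence[Vector], center: Vector) -> float:
    """Assumed indicator: mean squared deviation of the vectors from `center`."""
    total = sum((x - c) ** 2 for v in vectors for x, c in zip(v, center))
    return total / (len(vectors) * len(center))


def corrective_transform(vector: Vector, center: Vector, alpha: float) -> Vector:
    """Assumed parametric family: scale the deviation from `center` by `alpha`."""
    return [c + alpha * (x - c) for x, c in zip(vector, center)]


def enhancing_parameter(reference: float, actual: float) -> float:
    """Pick alpha so the transformed vectors reach the reference indicator value."""
    if actual <= 0.0:
        return 1.0  # nothing to rescale; fall back to the identity transformation
    return (reference / actual) ** 0.5


def enhance(emitted: Sequence[Vector], model_mean: Vector,
            model_variance: Vector) -> List[Vector]:
    # Reference value: what the statistical model of the unit predicts.
    reference = fmean(model_variance)
    # Actual value: what the emitted (typically over-smoothed) vectors show.
    actual = distortion_indicator(emitted, model_mean)
    alpha = enhancing_parameter(reference, actual)
    # Apply the derived instance of the transformation to each emitted vector.
    return [corrective_transform(v, model_mean, alpha) for v in emitted]


if __name__ == "__main__":
    mean = [2.0, 2.0]
    variance = [4.0, 4.0]
    emitted = [[1.0, 2.0], [3.0, 2.0]]
    enhanced = enhance(emitted, mean, variance)
    print(distortion_indicator(emitted, mean), distortion_indicator(enhanced, mean))
```

With this particular choice of family and indicator, the indicator computed on the enhanced vectors equals the reference value by construction. A different indicator or transformation family would need its own rule for solving for the enhancing parameters.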
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.