Patent · US Active

Statistical enhancement of speech output from a statistical text-to-speech synthesis system

US8682670B2 · kind B2 · utility

Cited by	2
References	9
Claims	25
Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Slava Shechtman · Haifa, IL
Alexander Sorin · Scotts Valley, US

Key dates

Filing date	Jul 7, 2011
Grant date	Mar 25, 2014
Priority date	—
Expiry date	Jan 26, 2032

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/06
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method, system and computer program product are provided for enhancement of speech synthesized by a statistical text-to-speech (TTS) system employing a parametric representation of speech in a space of acoustic feature vectors. The method includes: defining a parametric family of corrective transformations operating in the space of the acoustic feature vectors and dependent on a set of enhancing parameters; and defining a distortion indicator of a feature vector or a plurality of feature vectors. The method further includes: receiving a feature vector output by the system; and generating an instance of the corrective transformation by: calculating a reference value of the distortion indicator attributed to a statistical model of the phonetic unit emitting the feature vector; calculating an actual value of the distortion indicator attributed to feature vectors emitted by the statistical model of the phonetic unit emitting the feature vector; calculating the enhancing parameter values depending on the reference value of the distortion indicator, the actual value of the distortion indicator and the parametric corrective transformation; and deriving an instance of the corrective trans…
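The steps named in the abstract (a parametric corrective transformation, a distortion indicator, a reference value taken from the statistical model, an actual value taken from the emitted vectors, and enhancing parameters derived from both) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the indicator is chosen to be the average per-dimension variance, the transformation is a single-parameter scaling about the model mean, and the names `distortion_indicator`, `corrective_transform`, `enhance`, and `alpha` are hypothetical.

```python
from statistics import fmean


def distortion_indicator(vectors):
    """Assumed indicator: mean per-dimension variance of a set of feature vectors."""
    dim = len(vectors[0])
    means = [fmean(v[d] for v in vectors) for d in range(dim)]
    return fmean(
        fmean((v[d] - means[d]) ** 2 for v in vectors) for d in range(dim)
    )


def corrective_transform(vector, center, alpha):
    """Assumed one-parameter family: scale the deviation from `center` by `alpha`."""
    return [c + alpha * (x - c) for x, c in zip(vector, center)]


def enhance(emitted, model_mean, model_variance):
    """Derive `alpha` from the reference and actual indicator values, then apply it.

    `model_mean` and `model_variance` stand in for the statistical model of the
    phonetic unit; `emitted` stands in for the feature vectors it emitted.
    """
    reference = fmean(model_variance)        # reference value, from the model
    actual = distortion_indicator(emitted)   # actual value, from emitted vectors
    # Scaling deviations by alpha multiplies the variance by alpha squared,
    # so this choice makes the corrected indicator equal the reference.
    alpha = (reference / actual) ** 0.5 if actual > 0 else 1.0
    corrected = [corrective_transform(v, model_mean, alpha) for v in emitted]
    return alpha, corrected


# Over-smoothed output: the emitted vectors vary less than the model predicts.
emitted = [[0.9, 1.9], [1.0, 2.0], [1.1, 2.1]]
alpha, corrected = enhance(emitted, model_mean=[1.0, 2.0], model_variance=[0.04, 0.04])
```

In this toy setup the emitted vectors are smoother than the model variance suggests, so `alpha` comes out greater than 1 and the corrected vectors are spread further from the mean. The patent's actual indicator and transformation family may differ; the sketch only shows how a reference value and an actual value can jointly determine an enhancing parameter.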

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.