Patent · US Active

System and method for automatic prediction of speech suitability for statistical modeling

US9484045B2 · kind B2 · utility

0 Cited by

5 References

17 Claims

0 Family size

Assignee

Nuance Communications, Inc. · US

Inventors

Alexander Sorin · Scotts Valley, US
Slava Shechtman · Haifa, IL
Vincent Pollet · Astene, BE

Key dates

Filing date	Sep 7, 2012
Grant date	Nov 1, 2016
Priority date	—
Expiry date	Apr 12, 2034

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/04
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses this prediction capability to provide an automatic, acoustic-driven template-versus-model decision maker whose output quality is high and stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech (TTS) synthesis system build, as another example context, an embodiment according to the invention enables fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.