System and method for automatic prediction of speech suitability for statistical modeling
US9484045B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 7, 2012 |
| Grant date | Nov 1, 2016 |
| Priority date | — |
| Expiry date | Apr 12, 2034 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/04
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An embodiment according to the invention automatically predicts how favorable a given speech signal is for statistical modeling, a capability that is useful in a variety of contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment uses this prediction to provide an automatic, acoustically driven template-versus-model decision maker whose output quality is high and stable and depends gradually on the system footprint. In speaker selection for building a statistical Text-to-Speech (TTS) synthesis system, as another example, an embodiment enables fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.
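The two uses named in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the suitability score is computed or how decisions are made from it. The sketch assumes a hypothetical per-segment score in [0, 1] is already available (higher meaning more favorable for statistical modeling), and the names `Segment`, `choose_forms`, `select_speaker`, the byte-based footprint budget, and the greedy rule are all illustrative assumptions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Segment:
    name: str
    size_bytes: int     # assumed storage cost if the segment is kept as a template
    suitability: float  # hypothetical predicted score in [0, 1]; higher = better for modeling


def choose_forms(segments, footprint_bytes):
    """Assign "template" or "model" to each segment under a storage budget.

    Assumed greedy rule: segments least suitable for modeling are stored as
    templates first; a segment that does not fit in the remaining budget is
    skipped and stays as "model".
    """
    forms = {seg.name: "model" for seg in segments}
    remaining = footprint_bytes
    for seg in sorted(segments, key=lambda s: s.suitability):
        if seg.size_bytes <= remaining:
            forms[seg.name] = "template"
            remaining -= seg.size_bytes
    return forms


def select_speaker(scores_by_speaker):
    """Pick the speaker whose sample recordings have the highest mean score (assumed criterion)."""
    return max(scores_by_speaker, key=lambda spk: mean(scores_by_speaker[spk]))
```

Under these assumptions, enlarging `footprint_bytes` moves segments from "model" to "template" one at a time, starting with those predicted to be hardest to model. That is one way the system's behavior could change gradually with footprint.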
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.