Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets
US8019605B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 14, 2007 |
| Grant date | Sep 13, 2011 |
| Priority date | — |
| Expiry date | Jul 13, 2030 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/04
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present invention discloses a system and a method for creating a reduced script, which is read by a voice talent to create a concatenative text-to-speech (TTS) voice. The method can automatically process pre-recorded audio to derive speech assets for a concatenative TTS voice. The pre-recorded audio can include sets of recorded phrases used by a speech user interface (SUI). A set of unfulfilled speech assets needed for full phonetic coverage of the concatenative TTS voice can be determined. A reduced script can be constructed that includes a set of phrases which, when read by a voice talent, result in a reduced corpus. When the reduced corpus is automatically processed, a reduced set of speech assets results. The reduced set includes each of the unfulfilled speech assets. When this reduced corpus is combined with the existing speech assets, the result is a voice with a complete set of speech assets.
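The two steps named in the abstract, finding the unfulfilled speech assets and building a reduced script that covers them, can be illustrated with a minimal sketch. This is not the patented method: it assumes speech assets can be represented as string labels (for example diphones), that each candidate phrase is already annotated with the assets it would yield, and it uses a plain greedy set-cover heuristic as a stand-in for script construction. The function names and the sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def unfulfilled_assets(required: Set[str], existing: Set[str]) -> Set[str]:
    """Assets needed for full coverage that the pre-recorded audio lacks."""
    return required - existing


def build_reduced_script(
    candidates: Dict[str, Set[str]], missing: Set[str]
) -> Tuple[List[str], Set[str]]:
    """Greedy set cover (an assumed heuristic, not taken from the patent).

    Repeatedly picks the candidate phrase that covers the most assets
    still missing. Returns the chosen phrases and any assets that no
    candidate phrase could cover.
    """
    remaining = set(missing)
    script: List[str] = []
    while remaining:
        best = max(
            candidates,
            key=lambda phrase: len(candidates[phrase] & remaining),
            default=None,
        )
        # Stop when there are no candidates or none adds new coverage.
        if best is None or not (candidates[best] & remaining):
            break
        script.append(best)
        remaining -= candidates[best]
    return script, remaining


# Hypothetical data: asset labels and phrase annotations are made up.
required = {"a-b", "b-c", "c-d", "d-e", "e-f"}
existing = {"a-b", "b-c"}  # derived from pre-recorded SUI phrases
candidates = {
    "phrase one": {"c-d"},
    "phrase two": {"c-d", "d-e"},
    "phrase three": {"e-f", "a-b"},
}

missing = unfulfilled_assets(required, existing)
script, uncovered = build_reduced_script(candidates, missing)
print(script)  # ['phrase two', 'phrase three']
```

Under these assumptions the voice talent would read two phrases instead of three, and the assets from that reduced corpus together with the existing ones give full coverage of the required set.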
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.