Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis
US7716052B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 7, 2005 |
| Grant date | May 11, 2010 |
| Priority date | — |
| Expiry date | May 16, 2027 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2021/0135
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
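The selection step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `Segment` fields other than the speaker identity, the pitch-based `target_cost` and `join_cost` formulas, the speaker-switch penalty, and the toy database `DEMO_DB` are all hypothetical. The sketch only shows the general idea of a multi-speaker segment database where each segment carries an attribute vector naming its speaker, and where a cost function decides which segments to concatenate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One pre-recorded unit plus a (hypothetical) attribute vector."""
    phone: str       # unit label this segment realizes
    speaker: str     # attribute element identifying the source speaker
    pitch_hz: float  # hypothetical extra attribute used by the costs
    audio_id: str    # hypothetical handle to the stored waveform


def target_cost(seg, want_pitch):
    # Assumed cost: distance of the segment's pitch from the desired pitch.
    return abs(seg.pitch_hz - want_pitch) / 100.0


def join_cost(prev, cur, speaker_switch_penalty):
    # Assumed cost: pitch discontinuity, plus a penalty for changing speaker.
    cost = abs(prev.pitch_hz - cur.pitch_hz) / 100.0
    if prev.speaker != cur.speaker:
        cost += speaker_switch_penalty
    return cost


def select_segments(phones, database, want_pitch=120.0,
                    speaker_switch_penalty=1.0):
    """Dynamic-programming search for the lowest-cost segment sequence."""
    candidates = [[s for s in database if s.phone == p] for p in phones]
    if any(not c for c in candidates):
        raise ValueError("no candidate segment for at least one phone")

    # best[i][j] = (cheapest cost ending in candidates[i][j], backpointer)
    best = [[(target_cost(s, want_pitch), None) for s in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for cur in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0]
                 + join_cost(prev, cur, speaker_switch_penalty), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(cur, want_pitch), back))
        best.append(row)

    # Trace the backpointers from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


# Hypothetical toy database: two speakers, three phones each.
DEMO_DB = [
    Segment("k", "A", 120.0, "A_k"),
    Segment("k", "B", 118.0, "B_k"),
    Segment("ae", "A", 150.0, "A_ae"),
    Segment("ae", "B", 121.0, "B_ae"),
    Segment("t", "A", 119.0, "A_t"),
    Segment("t", "B", 160.0, "B_t"),
]

if __name__ == "__main__":
    for penalty in (0.0, 1.0):
        path, total = select_segments(["k", "ae", "t"], DEMO_DB,
                                      speaker_switch_penalty=penalty)
        print(penalty, [(s.speaker, s.audio_id) for s in path], round(total, 2))
```

With this toy data, a zero penalty yields a path that mixes speakers (A, B, A), while a penalty of 1.0 keeps the whole word on speaker B. The penalty is just one way the speaker-identifying attribute could enter a cost function; the patent text quoted here does not specify the form of the cost.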
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.