Patent · US Active

Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis

US7716052B2 · kind B2 · utility

Cited by: 6
References: 12
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 7, 2005
Grant date: May 11, 2010
Priority date
Expiry date: May 16, 2027

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2021/0135
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
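
The idea in the abstract can be illustrated with a small Python sketch. It is not the patented implementation: the field names, the pitch attribute, the speaker-switch penalty value and the dynamic-programming search are all assumptions chosen for illustration. It shows a segment record whose attribute vector carries a speaker identifier, and a cost-based selection that may draw segments from several speakers while penalizing speaker changes at joins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    unit: str         # phonetic unit this pre-recorded segment realizes
    speaker: str      # attribute vector element identifying the source speaker
    pitch_hz: float   # hypothetical extra attribute used by the cost functions
    audio: tuple = () # placeholder for the recorded samples


def join_cost(prev, cur, speaker_switch_penalty=50.0):
    """Hypothetical concatenation cost: pitch mismatch plus a speaker-change penalty."""
    cost = abs(prev.pitch_hz - cur.pitch_hz)
    if prev.speaker != cur.speaker:
        cost += speaker_switch_penalty
    return cost


def select_segments(units, database, target_pitch_hz=120.0):
    """Pick one segment per unit, minimizing total target + join cost (Viterbi-style)."""
    candidates = [[s for s in database if s.unit == u] for u in units]
    if any(not c for c in candidates):
        raise ValueError("no recorded segment for at least one unit")

    # best[i][j] = (lowest cost of any path ending in candidates[i][j], backpointer)
    best = [[(abs(s.pitch_hz - target_pitch_hz), None) for s in candidates[0]]]
    for i in range(1, len(units)):
        row = []
        for cur in candidates[i]:
            target = abs(cur.pitch_hz - target_pitch_hz)
            row.append(min(
                (best[i - 1][k][0] + join_cost(prev, cur) + target, k)
                for k, prev in enumerate(candidates[i - 1])
            ))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(units) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total
```

With a penalty of this kind, the search prefers a single speaker when that speaker covers every unit well, and falls back to mixing speakers only when another speaker's segment is missing or costs more than the switch.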

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.