Patent · US · Expired

Coarticulation method for audio-visual text-to-speech synthesis

US6662161B1 · kind B1 · utility

Cited by	9
References	13
Claims	15
Family size	0

Assignee

AT&T CORP. · US

Inventors

Eric Cosatto · Highlands, US
Hans Peter Graf · South Amboy, US
Juergen Schroeter · New Providence, US

Key dates

Filing date	Sep 7, 1999
Grant date	Dec 9, 2003
Priority date	—
Expiry date	Sep 7, 2019

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/105
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
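The pipeline in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the two-number mouth parameterization (width, height), the library contents, the use of phoneme pairs as the multiphones, and the nearest-neighbor rule standing in for "selecting appropriate image samples". Sound output is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSample:
    """One stored image sample with its extracted mouth parameters (hypothetical)."""
    sample_id: str
    mouth_width: float
    mouth_height: float


# Hypothetical animation library: image samples indexed by extracted parameters.
ANIMATION_LIBRARY = [
    ImageSample("closed", 0.40, 0.00),
    ImageSample("half_open", 0.50, 0.30),
    ImageSample("wide_open", 0.60, 0.80),
    ImageSample("rounded", 0.25, 0.35),
]

# Hypothetical coarticulation library: each multiphone (here a phoneme pair)
# maps to a short trajectory of target (width, height) mouth parameters.
COARTICULATION_LIBRARY = {
    ("m", "a"): [(0.40, 0.05), (0.55, 0.70)],
    ("a", "u"): [(0.50, 0.40), (0.28, 0.30)],
}


def nearest_sample(target, library):
    """Pick the sample whose parameters are closest to the target (squared distance)."""
    width, height = target
    return min(
        library,
        key=lambda s: (s.mouth_width - width) ** 2 + (s.mouth_height - height) ** 2,
    )


def synthesize(phonemes):
    """Return the concatenated sequence of sample ids for a phoneme sequence."""
    frames = []
    # Walk the input as overlapping pairs so each lookup sees its neighbor.
    for pair in zip(phonemes, phonemes[1:]):
        targets = COARTICULATION_LIBRARY.get(pair)
        if targets is None:
            raise KeyError(f"no coarticulation entry for multiphone {pair}")
        for target in targets:
            frames.append(nearest_sample(target, ANIMATION_LIBRARY).sample_id)
    return frames


if __name__ == "__main__":
    print(synthesize(["m", "a", "u"]))
    # prints ['closed', 'wide_open', 'half_open', 'rounded']
```

Keeping the two libraries separate mirrors the split described in the abstract: the coarticulation library determines which mouth shapes a phoneme context calls for, while the animation library supplies the recorded images that realize them.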

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.