Method and apparatus for producing audio-visual synthetic speech
US5657426A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Jun 10, 1994 |
| Grant date | Aug 12, 1997 |
| Priority date | — |
| Expiry date | Jun 10, 2014 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2021/105
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and apparatus provide a video image of facial features synchronized with synthetic speech. Text input is transformed into a string of phonemes and timing data, which are transmitted to an image generation unit. At the same time, a string of synthetic speech samples is transmitted to an audio server. The audio server produces signals for an audio speaker, causing the audio signals to be continuously audibilized; additionally, the audio server initializes a timer. The image generation unit reads the timing data from the timer and, by consulting the phoneme and timing data, determines the position of the phoneme currently being audibilized. The image generation unit then calculates the facial configuration corresponding to that position in the string of phonemes and causes the facial configuration to be displayed on a video device.
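The synchronization step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the phoneme list, the millisecond durations, the `MOUTH_SHAPES` table, and all function names are hypothetical, and the elapsed time is passed in directly rather than read from an audio server's timer. The sketch only shows one plausible way to map a timer reading to the phoneme currently being spoken and then to a facial configuration.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical phoneme string with per-phoneme durations in milliseconds.
PHONEMES = [("h", 60), ("eh", 90), ("l", 70), ("ow", 140)]

# Hypothetical phoneme-to-mouth-shape table (illustrative labels only).
MOUTH_SHAPES = {
    "h": "open-slight",
    "eh": "open-mid",
    "l": "tongue-up",
    "ow": "rounded",
}


def end_times(phonemes):
    """Return the cumulative end time (ms) of each phoneme."""
    return list(accumulate(duration for _, duration in phonemes))


def current_phoneme_index(ends, elapsed_ms):
    """Return the index of the phoneme playing at elapsed_ms, or None.

    None is returned before playback starts and after the last phoneme ends.
    """
    if elapsed_ms < 0 or not ends or elapsed_ms >= ends[-1]:
        return None
    # The first phoneme whose end time is strictly greater than elapsed_ms
    # is the one currently being played.
    return bisect_right(ends, elapsed_ms)


def facial_configuration(phonemes, ends, elapsed_ms):
    """Map a timer reading to a mouth-shape label, defaulting to neutral."""
    index = current_phoneme_index(ends, elapsed_ms)
    if index is None:
        return "neutral"
    return MOUTH_SHAPES[phonemes[index][0]]


if __name__ == "__main__":
    ends = end_times(PHONEMES)
    for t in (0, 100, 200, 400):
        print(t, facial_configuration(PHONEMES, ends, t))
```

Because the lookup depends only on the timer reading, a display loop that runs slower than the audio would skip intermediate mouth shapes instead of drifting out of sync, which appears to be the motivation for having the image generation unit consult a timer tied to audio playback.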
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.