Method and apparatus for converting text into audible signals using a neural network
US5668926A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Mar 22, 1996 |
| Grant date | Sep 16, 1997 |
| Priority date | — |
| Expiry date | Mar 22, 2016 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG10L25/30
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Text may be converted to audible signals, such as speech, by first training a neural network 106 using recorded audio messages 204. To begin the training, the recorded audio messages are converted into a series of audio frames 205 having a fixed duration 213. Then, each audio frame is assigned a phonetic representation 203 and a target acoustic representation 208, where the phonetic representation 203 is a binary word that represents the phone and articulation characteristics of the audio frame, while the target acoustic representation 208 is a vector of audio information such as pitch and energy. After training, the neural network 106 is used in conversion of text into speech. First, text that is to be convened is translated to a series of phonetic frames 401 of the same form as the phonetic representations 208 and having the fixed duration 213. Then the neural network produces acoustic representations in response to context descriptions 207 that include some of the phonetic frames 401. The acoustic representations are then converted into a speech wave form by a synthesizer 107.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.