Patent · US Active

Speech synthesis using deep neural networks

US8527276B1 · kind B1 · utility

334Cited by
6References
26Claims
0Family size

Assignee

Inventors

Key dates

Filing dateOct 25, 2012
Grant dateSep 3, 2013
Priority date
Expiry dateOct 25, 2032

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L25/30
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method and system for is disclosed for speech synthesis using deep neural networks. A neural network may be trained to map input phonetic transcriptions of training-time text strings into sequences of acoustic feature vectors, which yield predefined speech waveforms when processed by a signal generation module. The training-time text strings may correspond to written transcriptions of speech carried in the predefined speech waveforms. Subsequent to training, a run-time text string may be translated to a run-time phonetic transcription, which may include a run-time sequence of phonetic-context descriptors, each of which contains a phonetic speech unit, data indicating phonetic context, and data indicating time duration of the respective phonetic speech unit. The trained neural network may then map the run-time sequence of the phonetic-context descriptors to run-time predicted feature vectors, which may in turn be translated into synthesized speech by the signal generation module.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.