Patent · US Active

Speech synthesis using deep neural networks

US8527276B1 · kind B1 · utility

Cited by	334

References	6

Claims	26

Family size	0

Assignee

Google LLC · US

Inventors

Andrew W. Senior · New York, US
Byungha Chun · Warrington, GB
Michael Schuster · Saratoga, US

Key dates

Filing date	Oct 25, 2012
Grant date	Sep 3, 2013
Priority date	—
Expiry date	Oct 25, 2032

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/30
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method and system are disclosed for speech synthesis using deep neural networks. A neural network may be trained to map input phonetic transcriptions of training-time text strings into sequences of acoustic feature vectors, which yield predefined speech waveforms when processed by a signal generation module. The training-time text strings may correspond to written transcriptions of speech carried in the predefined speech waveforms. Subsequent to training, a run-time text string may be translated to a run-time phonetic transcription, which may include a run-time sequence of phonetic-context descriptors, each of which contains a phonetic speech unit, data indicating phonetic context, and data indicating time duration of the respective phonetic speech unit. The trained neural network may then map the run-time sequence of the phonetic-context descriptors to run-time predicted feature vectors, which may in turn be translated into synthesized speech by the signal generation module.
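
The run-time data flow described in the abstract can be illustrated with a small Python sketch. It is not the patented implementation: the toy phone inventory, the one-hot encoding, the single left/right neighbor used as "phonetic context", the network size, and every name below (`PhoneticContextDescriptor`, `to_descriptors`, `encode`, `TinyMLP`) are assumptions made for illustration. The network weights are random rather than trained, and the signal generation module that would turn feature vectors into a waveform is left out.

```python
from dataclasses import dataclass
import math
import random

# Hypothetical toy phone inventory; a real system would use a full phone set.
PHONES = ["sil", "k", "ae", "t"]


@dataclass(frozen=True)
class PhoneticContextDescriptor:
    """One phonetic speech unit plus its context and duration (assumed layout)."""
    phone: str          # the phonetic speech unit itself
    left: str           # preceding unit (context)
    right: str          # following unit (context)
    duration_ms: float  # time duration of the unit


def to_descriptors(phones, durations_ms):
    """Build one descriptor per phone, padding the edges with silence."""
    if len(phones) != len(durations_ms):
        raise ValueError("need exactly one duration per phone")
    padded = ["sil"] + list(phones) + ["sil"]
    return [
        PhoneticContextDescriptor(padded[i + 1], padded[i], padded[i + 2], dur)
        for i, dur in enumerate(durations_ms)
    ]


def one_hot(phone):
    return [1.0 if p == phone else 0.0 for p in PHONES]


def encode(d):
    """Flatten a descriptor into a numeric input vector (assumed encoding)."""
    return one_hot(d.left) + one_hot(d.phone) + one_hot(d.right) + [d.duration_ms / 1000.0]


class TinyMLP:
    """Single-hidden-layer network with random weights, standing in for a trained DNN."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]

    def forward(self, x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.w1]
        return [sum(w * v for w, v in zip(row, hidden)) for row in self.w2]


# Phonetic transcription of "cat" with made-up durations.
descriptors = to_descriptors(["k", "ae", "t"], [60.0, 120.0, 80.0])
net = TinyMLP(n_in=len(encode(descriptors[0])), n_hidden=8, n_out=5)

# One predicted acoustic feature vector per descriptor.
features = [net.forward(encode(d)) for d in descriptors]
```

In this sketch each descriptor yields exactly one feature vector. A real synthesizer would typically need a sequence of vectors per unit, with the duration data governing how many, and would pass them to a signal generation module to produce audio.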

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.