Patent · US Active

Artificial intelligence-based text-to-speech system and method

US10373605B2 · kind B2 · utility

Cited by	4
References	3
Claims	20
Family size	0

Assignee

Telepathy Labs, Inc. · US

Inventors

Martin Reber · Zürich, CH
Vijeta Avijeet · Zürich, CH

Key dates

Filing date	Jun 29, 2018
Grant date	Aug 6, 2019
Priority date	—
Expiry date	Jun 29, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/30
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A technique improves training and speech quality of a text-to-speech (TTS) system having an artificial intelligence, such as a neural network. The TTS system is organized as a front-end subsystem and a back-end subsystem. The front-end subsystem is configured to provide analysis and conversion of text into input vectors, each having at least a base frequency, f0, a phoneme duration, and a phoneme sequence that is processed by a signal generation unit of the back-end subsystem. The signal generation unit includes the neural network interacting with a pre-existing knowledgebase of phonemes to generate audible speech from the input vectors. The technique applies an error signal from the neural network to correct imperfections of the pre-existing knowledgebase of phonemes to generate audible speech signals. A back-end training system is configured to train the signal generation unit by applying psychoacoustic principles to improve quality of the generated audible speech signals.
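
As an illustration only (not taken from the patent), the input vector described in the abstract and the idea of correcting knowledgebase entries with an error signal could be sketched in Python roughly as follows. The names InputVector, toy_front_end, and apply_error_signal are hypothetical, letters stand in for phonemes, the default f0 and duration values are arbitrary, and the additive form of the correction is an assumption; the abstract does not say how the error signal is applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InputVector:
    """Hypothetical container for the three fields named in the abstract."""
    f0_hz: float          # base frequency, f0
    duration_ms: float    # phoneme duration
    phonemes: List[str]   # phoneme sequence


def toy_front_end(text: str, f0_hz: float = 120.0,
                  duration_ms: float = 80.0) -> List[InputVector]:
    """Toy stand-in for the front-end subsystem.

    Emits one input vector per whitespace-separated word and uses the
    lowercase letters of the word as placeholder "phonemes". A real front
    end would perform linguistic analysis instead.
    """
    return [InputVector(f0_hz, duration_ms, list(word.lower()))
            for word in text.split()]


def apply_error_signal(stored: List[float], error: List[float]) -> List[float]:
    """Assumed additive correction of stored knowledgebase parameters.

    `stored` represents parameters of a pre-existing phoneme entry and
    `error` represents an error signal produced by the neural network.
    """
    if len(stored) != len(error):
        raise ValueError("stored and error must have the same length")
    return [s + e for s, e in zip(stored, error)]


if __name__ == "__main__":
    for vector in toy_front_end("Hi there"):
        print(vector)
    print(apply_error_signal([1.0, 2.0], [0.5, -0.5]))
```

In this sketch the front end and the correction step are kept as separate functions to mirror the front-end/back-end split the abstract describes; the psychoacoustic training step is not modeled.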

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.