Patent · US Active

Blending recorded speech with text-to-speech output for specific domains

US8996377B2 · kind B2 · utility

0 Cited by

3 References

20 Claims

0 Family size

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Sheng Zhao · Qingdao, CN
Peng Wang · Carlsbad, US
Difei Gao · Beijing, CN
Yijian Wu · Beijing, CN
Binggong Ding · Beijing, CN
Shenghua Ye · Sammamish, US
Max Leung · Beijing, CN

Key dates

Filing date	Jul 12, 2012
Grant date	Mar 31, 2015
Priority date	—
Expiry date	Jun 7, 2033

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/08
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g., navigation, dialing, ...). The identified domain is used to select domain-specific speech recordings (e.g., pre-recorded static phrases such as “turn left”, “turn right”, ...) for the input text. The speech recordings are obtained based on the static phrases for the domain that are identified in the input text. The TTS engine blends the static phrases with the TTS output to smooth the acoustic trajectory of the input text. The prosody of the static phrases is used to create similar prosody in the TTS output.
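
The front half of the flow described in the abstract (identify a domain, find that domain's static phrases in the input text, and route each span to either a recording or the synthesizer) can be sketched as follows. This is only an illustrative sketch, not the patented method: the `STATIC_PHRASES` table, the keyword-count domain heuristic, and the names `identify_domain`, `split_segments`, and `Segment` are all assumptions made for the example. The acoustic blending and prosody matching steps are not modeled here.

```python
import re
from dataclasses import dataclass

# Hypothetical per-domain inventory of phrases that have pre-recorded audio.
STATIC_PHRASES = {
    "navigation": ["turn left", "turn right"],
    "dialing": ["calling", "dialing"],
}


@dataclass
class Segment:
    text: str
    source: str  # "recorded" for a static phrase, "tts" for synthesized text


def _count(phrase, lowered_text):
    """Count whole-word occurrences of a phrase in already-lowercased text."""
    return len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered_text))


def identify_domain(text):
    """Toy heuristic (an assumption): pick the domain with the most phrase hits."""
    lowered = text.lower()
    scores = {
        domain: sum(_count(p, lowered) for p in phrases)
        for domain, phrases in STATIC_PHRASES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def split_segments(text, domain):
    """Split text into recorded static phrases and spans left for the synthesizer."""
    phrases = STATIC_PHRASES.get(domain, [])
    if not phrases:
        return [Segment(text, "tts")]
    # Try longer phrases first so they win over shorter overlapping ones.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in ordered) + r")\b",
        re.IGNORECASE,
    )
    segments = []
    pos = 0
    for match in pattern.finditer(text):
        gap = text[pos:match.start()].strip()
        if gap:
            segments.append(Segment(gap, "tts"))
        segments.append(Segment(match.group(0), "recorded"))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(Segment(tail, "tts"))
    return segments
```

For example, `split_segments("Turn left onto Main Street then turn right", "navigation")` marks "Turn left" and "turn right" as recorded and leaves "onto Main Street then" for the synthesizer. A real engine would then need to join the audio at the segment boundaries and carry the recordings' prosody into the synthesized spans, which is the part the abstract describes as blending.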

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.