Patent · US Active

Training method and apparatus for a speech synthesis model, and storage medium

US11488577B2 · kind B2 · utility

Cited by	0
References	0
Claims	12
Family size	0

Assignee

BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. · CN

Inventors

Zhipeng Chen · Lo Wu, CN
Jinfeng Bai · Beijing, CN
Lei Jia · Beijing, CN

Key dates

Filing date	Jun 19, 2020
Grant date	Nov 1, 2022
Priority date	—
Expiry date	Mar 12, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/30
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The present application discloses a training method and an apparatus for a speech synthesis model, an electronic device, and a storage medium. The method includes: taking a syllable input sequence, a phoneme input sequence, and a Chinese character input sequence of a current sample as inputs of an encoder of a model to be trained, to obtain encoded representations of the three sequences at an output end of the encoder; fusing the encoded representations of the three sequences, to obtain a weighted combination of the three sequences; taking the weighted combination as an input of an attention module, to obtain a weighted average of the weighted combination at each moment at an output end of the attention module; and taking the weighted average as an input of a decoder of the model to be trained, to obtain a speech Mel spectrum of the current sample at an output end of the decoder.
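
The data flow described in the abstract (encode three sequences, fuse them, attend, decode) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the three sequences are already aligned to the same length, uses random embedding tables in place of trained encoders, fixes the fusion weights to be equal, uses plain dot-product attention, and replaces the decoder with a single linear projection. All function and parameter names (`encode`, `fuse`, `attend`, `decode_frame`, `forward_one_moment`, `dim`, `n_mels`) are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def encode(token_ids, table):
    """Stand-in encoder (assumption): one embedding vector per input token."""
    return [table[i] for i in token_ids]


def fuse(encodings, fusion_weights):
    """Weighted combination of the encoded sequences, position by position."""
    steps, dim = len(encodings[0]), len(encodings[0][0])
    return [
        [sum(w * enc[t][d] for w, enc in zip(fusion_weights, encodings))
         for d in range(dim)]
        for t in range(steps)
    ]


def attend(query, memory):
    """Dot-product attention: weighted average of `memory` for one moment."""
    scores = [sum(q * m for q, m in zip(query, vec)) for vec in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    context = [sum(w * vec[d] for w, vec in zip(weights, memory))
               for d in range(dim)]
    return context, weights


def decode_frame(context, projection):
    """Stand-in decoder (assumption): linear map from context to one Mel frame."""
    return [sum(c * p for c, p in zip(context, row)) for row in projection]


def random_matrix(rows, cols, rng):
    """Random values standing in for trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def forward_one_moment(syllable_ids, phoneme_ids, char_ids,
                       dim=4, n_mels=3, seed=0):
    """Run encoder -> fusion -> attention -> decoder for a single moment."""
    rng = random.Random(seed)
    vocab = 1 + max(syllable_ids + phoneme_ids + char_ids)
    tables = [random_matrix(vocab, dim, rng) for _ in range(3)]
    projection = random_matrix(n_mels, dim, rng)

    inputs = (syllable_ids, phoneme_ids, char_ids)
    encodings = [encode(ids, table) for ids, table in zip(inputs, tables)]
    # Equal fusion weights (assumption); a trained model would learn these.
    fused = fuse(encodings, softmax([0.0, 0.0, 0.0]))
    # A zero query gives uniform attention; a real decoder state would not.
    query = [0.0] * dim
    context, attention = attend(query, fused)
    return fused, attention, decode_frame(context, projection)
```

Calling `forward_one_moment([1, 2, 3], [4, 5, 6], [7, 8, 9])` returns the fused sequence, the attention weights for that moment, and one toy Mel frame. In training, the decoder output would be compared against the reference Mel spectrum of the sample to update the parameters, a step this sketch omits.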

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.