Patent · US Active

Voice synthesis method, model training method, device and computer device

US12014720B2 · kind B2 · utility

Cited by: 0
References: 3
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 21, 2020
Grant date: Jun 18, 2024
Priority date
Expiry date: Jan 29, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L19/02
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

This application relates to a speech synthesis method and apparatus, a model training method and apparatus, and a computer device. The method includes: obtaining to-be-processed linguistic data; encoding the linguistic data, to obtain encoded linguistic data; obtaining an embedded vector for speech feature conversion, the embedded vector being generated according to a residual between synthesized reference speech data and reference speech data that correspond to the same reference linguistic data; and decoding the encoded linguistic data according to the embedded vector, to obtain target synthesized speech data on which the speech feature conversion is performed. The solution provided in this application can prevent quality of a synthesized speech from being affected by a semantic feature in a mel-frequency cepstrum.
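The data flow described in the abstract can be illustrated with a toy sketch. This is not the patented implementation: the function names (`encode`, `residual`, `embed`, `decode`), the 4-dimensional frames, the mean-pooling embedding, the additive conditioning, and the sign of the residual (reference minus synthesized) are all assumptions made for illustration. A real system would presumably use trained neural networks for each stage.

```python
import random

DIM = 4  # hypothetical feature size per frame


def encode(linguistic_ids):
    """Toy encoder (assumption): map each token id to a fixed pseudo-random vector."""
    encoded = []
    for tok in linguistic_ids:
        rng = random.Random(tok)  # seeded per token, so the mapping is deterministic
        encoded.append([rng.uniform(-1.0, 1.0) for _ in range(DIM)])
    return encoded


def decode(encoded, embedding):
    """Toy decoder (assumption): condition each encoded frame on the embedding by addition."""
    return [[e + v for e, v in zip(frame, embedding)] for frame in encoded]


def residual(reference, synthesized_reference):
    """Frame-wise residual between reference speech and synthesized reference speech.

    Both inputs correspond to the same reference linguistic data.
    The sign convention (reference minus synthesized) is an assumption.
    """
    return [
        [r - s for r, s in zip(ref_frame, syn_frame)]
        for ref_frame, syn_frame in zip(reference, synthesized_reference)
    ]


def embed(res):
    """Toy embedding (assumption): average the residual over time into one vector."""
    n_frames = len(res)
    return [sum(frame[d] for frame in res) / n_frames for d in range(DIM)]


# Reference linguistic data and its synthesized (style-free) rendering.
reference_linguistic = [1, 2, 3]
synthesized_reference = decode(encode(reference_linguistic), [0.0] * DIM)

# Stand-in for real reference speech: the synthesized frames plus a constant offset.
reference_speech = [[x + 0.5 for x in frame] for frame in synthesized_reference]

# Embedded vector derived from the residual, then used to decode new linguistic data.
embedding = embed(residual(reference_speech, synthesized_reference))
target_speech = decode(encode([4, 5]), embedding)
```

In this toy setup the embedding recovers the constant offset that separates the reference speech from its synthesized counterpart, and the decoder applies that offset to unrelated linguistic data. Because the embedding is computed from a residual over the same linguistic content, it is meant to carry speaker or style information rather than the content itself.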

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.