Patent · US Active

Pitch-based speech conversion model training method and speech conversion system

US12300220B1 · kind B1 · utility

Cited by: 0
References: 0
Claims: 5
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 23, 2024
Grant date: May 13, 2025
Priority date
Expiry date: Dec 23, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2021/0135
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present disclosure provides a pitch-based speech conversion model training method and a speech conversion system. In the method, an audio feature code is output by a priori encoder, and a pitch feature is extracted by a pitch extraction module. A linear spectrum corresponding to a reference speech is input into a posteriori encoder to obtain an audio latent variable. The audio feature code, a speech concatenation feature obtained by concatenating the audio feature code with the pitch feature, and the audio latent variable are then input into a temporal alignment module to obtain a converted speech code, which is decoded by a decoder to obtain a converted speech. Finally, a training loss of the converted speech is calculated to determine the degree of convergence of the speech conversion model.
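
The abstract names two steps that are easy to illustrate in isolation: building the speech concatenation feature and using a training loss to judge convergence. The Python sketch below is only an illustration under stated assumptions, not the patented implementation. The per-frame list layout, the function names (concat_pitch, l1_loss, has_converged), the L1 loss, and the tolerance-based convergence test are all hypothetical; the abstract does not specify the loss function, the convergence criterion, or the tensor shapes.

```python
from typing import List

Frame = List[float]  # assumed layout: one feature vector per time frame


def concat_pitch(audio_code: List[Frame], pitch: List[float]) -> List[Frame]:
    """Append each frame's pitch value to that frame's audio feature vector.

    Stands in for the "speech concatenation feature"; the frame-wise
    layout is an assumption, not taken from the patent.
    """
    if len(audio_code) != len(pitch):
        raise ValueError("audio feature code and pitch need the same frame count")
    return [frame + [f0] for frame, f0 in zip(audio_code, pitch)]


def l1_loss(predicted: List[float], target: List[float]) -> float:
    """Mean absolute error, used here only as a stand-in training loss."""
    if len(predicted) != len(target) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def has_converged(loss_history: List[float], tol: float = 1e-3) -> bool:
    """Hypothetical criterion: the last two losses differ by less than tol."""
    return len(loss_history) >= 2 and abs(loss_history[-1] - loss_history[-2]) < tol


# Toy data: three frames of a 2-dimensional audio feature code plus pitch in Hz
# (0.0 marks an unvoiced frame by assumption).
audio_code = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
pitch = [110.0, 112.5, 0.0]
speech_concat = concat_pitch(audio_code, pitch)
```

In this sketch each frame grows by one dimension after concatenation, so a downstream module (the temporal alignment module in the abstract) would receive 3-dimensional frames alongside the original 2-dimensional audio feature code.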

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.