Patent · US Active

Pitch-based speech conversion model training method and speech conversion system

US12300220B1 · kind B1 · utility

Cited by	0

References	0

Claims	5

Family size	0

Assignee

NANJING SILICON INTELLIGENCE TECHNOLOGY CO., LTD. · CN

Inventors

Huapeng Sima · Anfeng Town, CN
Ran Xu · Beijing, CN

Key dates

Filing date	Dec 23, 2024
Grant date	May 13, 2025
Priority date	—
Expiry date	Dec 23, 2044

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/0135
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The present disclosure provides a pitch-based speech conversion model training method and a speech conversion system. An audio feature code is output by a priori encoder, and a pitch feature is extracted by a pitch extraction module. A linear spectrum corresponding to a reference speech is input into a posteriori encoder to obtain an audio latent variable. The audio feature code, the audio latent variable, and a speech concatenation feature (obtained by concatenating the audio feature code with the pitch feature) are then input into a temporal alignment module to obtain a converted speech code, which a decoder decodes into converted speech. Finally, the training loss of the converted speech is calculated to determine the degree of convergence of the speech conversion model.
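
The data flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: every module is replaced by a hypothetical random linear projection, all dimensions and input features are invented, and the L1 loss is an assumption, since the abstract does not say which training loss is used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these values come from the patent.
T, D_IN, D_CODE, D_PITCH, D_LATENT, N_BINS = 50, 80, 16, 1, 16, 513


def placeholder_module(x, out_dim):
    """Stand-in for a learned network: a fixed random linear projection."""
    w = rng.standard_normal((x.shape[-1], out_dim)) / np.sqrt(x.shape[-1])
    return x @ w


# Random stand-ins for real inputs (assumed shapes: frames x features).
source_features = rng.standard_normal((T, D_IN))
pitch_feature = rng.uniform(80.0, 400.0, (T, D_PITCH))      # per-frame F0, Hz
linear_spectrum = np.abs(rng.standard_normal((T, N_BINS)))  # reference speech

# Priori encoder -> audio feature code.
audio_feature_code = placeholder_module(source_features, D_CODE)

# Concatenate the audio feature code with the pitch feature.
speech_concat_feature = np.concatenate(
    [audio_feature_code, pitch_feature], axis=-1
)

# Posteriori encoder: linear spectrum -> audio latent variable.
audio_latent = placeholder_module(linear_spectrum, D_LATENT)

# Temporal alignment module (placeholder): takes all three inputs.
alignment_input = np.concatenate(
    [audio_feature_code, speech_concat_feature, audio_latent], axis=-1
)
converted_speech_code = placeholder_module(alignment_input, D_LATENT)

# Decoder -> converted speech (represented here as a spectrum-shaped array).
converted_speech = placeholder_module(converted_speech_code, N_BINS)

# Assumed L1 loss against the reference spectrum, used to monitor convergence.
training_loss = float(np.mean(np.abs(converted_speech - linear_spectrum)))
print(speech_concat_feature.shape, converted_speech.shape)
```

In a real system each placeholder would be a trained neural network, and the loss would be tracked across training steps to judge convergence.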

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.