Patent · US Active

Synthetic audio output method and apparatus, storage medium, and electronic device

US12051400B1 · kind B1 · utility

Cited by: 0
References: 3
Claims: 6
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 7, 2024
Grant date: Jul 30, 2024
Priority date
Expiry date: Feb 7, 2044

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y02T10/40
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

This application provides a synthetic audio output method and apparatus, a storage medium, and an electronic device. The method includes: inputting input text and a specified target identity identifier into an audio output model; extracting an identity feature sequence of the target identity by an identity recognition model; extracting a phoneme feature sequence corresponding to the input text by an encoding layer of a speech synthesis model; superimposing the identity feature sequence of the target identity and the phoneme feature sequence and inputting the result into a variable adapter of the speech synthesis model; after the variable adapter performs duration prediction and alignment, energy prediction, and pitch prediction on the phoneme feature sequence, outputting a target Mel-frequency spectrum feature corresponding to the input text through a decoding layer of the speech synthesis model; and inputting the target Mel-frequency spectrum feature into a vocoder to output synthetic audio.
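
The data flow described in the abstract can be traced with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: every function name, the constants DIM, N_MELS, and HOP, and all of the arithmetic are hypothetical stand-ins for trained models. In particular, it assumes that "superimposing" means element-wise addition of the identity vector to each phoneme vector, and it treats each character of the input text as one phoneme.

```python
import math

DIM = 4      # feature width (assumed for illustration)
N_MELS = 3   # mel bins per frame (assumed)
HOP = 2      # waveform samples emitted per mel frame (assumed)

def identity_features(target_id):
    """Stand-in for the identity recognition model: one vector per identity."""
    return [math.sin(target_id + k) for k in range(DIM)]

def encode_phonemes(text):
    """Stand-in for the encoding layer: one vector per character (toy phoneme)."""
    return [[math.cos(ord(ch) * (k + 1)) for k in range(DIM)] for ch in text]

def superimpose(phonemes, identity):
    """Add the identity vector to every phoneme vector (assumed reading)."""
    return [[p + s for p, s in zip(vec, identity)] for vec in phonemes]

def variable_adapter(seq):
    """Toy duration prediction and alignment, energy prediction, pitch prediction."""
    frames = []
    for vec in seq:
        duration = 1 + int(abs(vec[0]) * 2) % 3       # toy rule: 1 to 3 frames
        energy = sum(v * v for v in vec) / len(vec)   # toy energy value
        pitch = sum(vec) / len(vec)                   # toy pitch value
        adapted = [v + energy + pitch for v in vec]
        frames.extend([adapted] * duration)           # repeat to align with frames
    return frames

def decode(frames):
    """Stand-in for the decoding layer: project each frame to N_MELS values."""
    return [[sum(f) * (m + 1) / DIM for m in range(N_MELS)] for f in frames]

def vocoder(mel):
    """Stand-in vocoder: HOP samples per mel frame, squashed into [-1, 1]."""
    return [math.tanh(frame[0]) for frame in mel for _ in range(HOP)]

def synthesize(text, target_id):
    """Run the full chain: text and identity in, mel frames and audio out."""
    identity = identity_features(target_id)
    phonemes = encode_phonemes(text)
    mel = decode(variable_adapter(superimpose(phonemes, identity)))
    return mel, vocoder(mel)

mel, audio = synthesize("hello", target_id=7)
print(len(mel), "mel frames,", len(audio), "audio samples")
```

The point of the sketch is the ordering of the stages: identity and phoneme features are combined before the adapter, so the duration, energy, and pitch steps all see the target identity, and the vocoder only ever consumes the decoded mel frames.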

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.