Synthetic audio output method and apparatus, storage medium, and electronic device
US12051400B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Feb 7, 2024 |
| Grant date | Jul 30, 2024 |
| Priority date | — |
| Expiry date | Feb 7, 2044 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y02T10/40
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
This application provides a synthetic audio output method and apparatus, a storage medium, and an electronic device. The method includes: inputting input text and a specified target identity identifier into an audio output model; extracting an identity feature sequence of a target identity by an identity recognition model; extracting a phoneme feature sequence corresponding to the input text by an encoding layer of a speech synthesis model; superimposing the identity feature sequence of the target identity and the phoneme feature sequence, and inputting the result into a variable adapter of the speech synthesis model; after the variable adapter performs duration prediction and alignment, energy prediction, and pitch prediction on the phoneme feature sequence, outputting a target Mel-frequency spectrum feature corresponding to the input text through a decoding layer of the speech synthesis model; and inputting the target Mel-frequency spectrum feature into a vocoder to output synthetic audio.
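The superimposing and duration-alignment steps in the abstract can be illustrated with a toy sketch. This is not the patented implementation: it assumes "superimposing" means element-wise addition of one identity vector to every phoneme vector, and that alignment repeats each phoneme vector for its predicted number of frames. The function names `superimpose` and `length_regulate`, the feature values, and the durations are all hypothetical.

```python
from typing import List

Vector = List[float]


def superimpose(phoneme_feats: List[Vector], identity_feat: Vector) -> List[Vector]:
    """Add the identity vector to every phoneme vector (assumed element-wise sum)."""
    return [[p + s for p, s in zip(vec, identity_feat)] for vec in phoneme_feats]


def length_regulate(feats: List[Vector], durations: List[int]) -> List[Vector]:
    """Repeat each phoneme vector for its predicted duration, in frames."""
    if len(feats) != len(durations):
        raise ValueError("one duration per phoneme is required")
    frames: List[Vector] = []
    for vec, n_frames in zip(feats, durations):
        frames.extend(list(vec) for _ in range(n_frames))
    return frames


# Hypothetical toy inputs: three phonemes with 2-dimensional features.
phoneme_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
identity_feat = [0.1, 0.2]   # stand-in for the identity recognition model output
durations = [2, 1, 3]        # stand-in for the duration predictor output

mixed = superimpose(phoneme_feats, identity_feat)
frames = length_regulate(mixed, durations)
print(len(frames))  # 6 frames: 2 + 1 + 3
```

In a full system of the kind the abstract describes, energy and pitch predictions would also be applied in the variable adapter, and the frame-level sequence would then pass through the decoding layer to produce the Mel-frequency spectrum feature and on to the vocoder. Those stages are omitted here.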
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.