Method for audio-driven character lip sync, model for audio-driven character lip sync and training method therefor
US11928767B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 21, 2023 |
| Grant date | Mar 12, 2024 |
| Priority date | — |
| Expiry date | Jun 21, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2021/105
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Embodiments of the present disclosure provide a method for audio-driven character lip sync, a model for audio-driven character lip sync, and a training method therefor. A target dynamic image is obtained by acquiring a character image of a target character and the speech from which the target dynamic image is to be generated, processing the character image and the speech into trainable image-audio data, and mixing the image-audio data with auxiliary data for training. When a large amount of sample data would otherwise need to be obtained for training in different scenarios, a video of another character speaking is processed as an auxiliary video to obtain the auxiliary data. The auxiliary data, which replaces non-general sample data, is input into the model together with the other data in a preset ratio for training. The auxiliary data may improve how the model is trained to synthesize lip sync actions, so that the training process contains no parts unrelated to the synthetic lip sync action. In this way, the problem of requiring a large amount of sample data during training is resolved.
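The "preset ratio" mixing described in the abstract can be illustrated with a minimal sketch. The abstract does not state the ratio, the batch size, or the sampling scheme, so everything here is an assumption for illustration only: the function name `mix_batch`, the `aux_fraction` parameter, sampling with replacement, and the `(label, id)` tuples that stand in for processed image-audio features.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical stand-in for one processed training sample:
# (source label, sample id). Real samples would hold image-audio features.
Sample = Tuple[str, int]


def mix_batch(
    target: Sequence[Sample],
    auxiliary: Sequence[Sample],
    aux_fraction: float,
    batch_size: int,
    rng: random.Random,
) -> List[Sample]:
    """Build one training batch containing a fixed share of auxiliary data.

    aux_fraction is the assumed "preset ratio": the share of the batch
    drawn from the auxiliary set. The rest comes from the target set.
    """
    if not 0.0 <= aux_fraction <= 1.0:
        raise ValueError("aux_fraction must be between 0 and 1")
    n_aux = round(batch_size * aux_fraction)
    n_target = batch_size - n_aux
    # Sample with replacement so small datasets can still fill a batch.
    batch = rng.choices(auxiliary, k=n_aux) + rng.choices(target, k=n_target)
    rng.shuffle(batch)  # avoid a fixed auxiliary-then-target ordering
    return batch


if __name__ == "__main__":
    target_data = [("target", i) for i in range(5)]
    auxiliary_data = [("aux", i) for i in range(5)]
    batch = mix_batch(target_data, auxiliary_data, 0.25, 8, random.Random(0))
    print(sum(1 for label, _ in batch if label == "aux"), "of", len(batch))
```

Fixing the auxiliary count per batch, rather than sampling from a pooled dataset, keeps the ratio exact in every training step; whether the patented method does this per batch or over the whole dataset is not stated in the abstract.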
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.