Patent · US Active

Photorealistic talking faces from audio

US12033259B2 · kind B2 · utility

Cited by	1
References	0
Claims	20
Family size	0

Assignee

Google LLC · US

Inventors

Vivek Kwatra · Santa Clara, US
Christian Frueh · Mountain View, US
Avisek Lahiri · Gadh Bengal, IN
John Lewis · Sunnyvale, US

Key dates

Filing date	Jan 29, 2021
Grant date	Jul 9, 2024
Priority date	—
Expiry date	Jul 7, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/105
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Provided is a framework for generating photorealistic 3D talking faces conditioned only on audio input. In addition, the present disclosure provides associated methods to insert generated faces into existing videos or virtual environments. We decompose faces from video into a normalized space that decouples 3D geometry, head pose, and texture. This allows separating the prediction problem into regressions over the 3D face shape and the corresponding 2D texture atlas. To stabilize temporal dynamics, we propose an auto-regressive approach that conditions the model on its previous visual state. We also capture face illumination in our model using audio-independent 3D texture normalization.
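
The auto-regressive conditioning mentioned in the abstract can be illustrated with a minimal sketch. This is not the patented method. It only shows the general pattern of feeding each prediction back in as the "previous visual state" for the next audio frame. The names `toy_predictor` and `generate_autoregressive`, the vector representation, and the blending weight are all hypothetical stand-ins for a learned model, which the abstract does not specify.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def toy_predictor(audio_frame: Sequence[float], prev_state: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a learned network.

    Blends the current audio features with the previous output so that
    consecutive outputs change smoothly. The weight is an assumption
    chosen for illustration, not a value from the patent.
    """
    alpha = 0.5  # assumed blending weight
    return [alpha * a + (1.0 - alpha) * p for a, p in zip(audio_frame, prev_state)]


def generate_autoregressive(
    audio_frames: Sequence[Sequence[float]],
    predict: Callable[[Sequence[float], Sequence[float]], Vector],
    initial_state: Sequence[float],
) -> List[Vector]:
    """Run the predictor frame by frame, conditioning on its own last output."""
    state = list(initial_state)
    outputs: List[Vector] = []
    for frame in audio_frames:
        # The previous output is fed back in alongside the new audio frame.
        state = predict(frame, state)
        outputs.append(state)
    return outputs


if __name__ == "__main__":
    frames = [[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    for step, out in enumerate(generate_autoregressive(frames, toy_predictor, [0.0, 0.0])):
        print(step, out)
```

In this toy setup the output drifts toward the audio input instead of jumping to it, which is the intuition behind conditioning on the previous state to stabilize temporal dynamics.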

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.