Patent · US Active

Systems and methods for reconstructing video data using contextually-aware multi-modal generation during signal loss

US12394405B2 · kind B2 · utility

Cited by	0
References	5
Claims	20
Family size	0

Assignee

Verizon Patent and Licensing Inc. · US

Inventors

Subham Biswas · Ashti, IN
Saurabh Tahiliani · Noida, IN

Key dates

Filing date	Mar 24, 2023
Grant date	Aug 19, 2025
Priority date	—
Expiry date	Mar 19, 2044

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/60
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A device may receive video data that includes a text transcript, audio sequences, and image frames, and may detect a network fluctuation. The device may process the text transcript to generate a new phrase, and may generate a response phoneme based on the new phrase. The device may generate a text embedding based on the response phoneme, and may process the audio sequences to generate a target voice sequence. The device may generate an audio embedding based on the target voice sequence, and may process the image frames to generate a target image sequence. The device may generate an image embedding based on the target image sequence, and may combine the embeddings to generate an embedding input vector. The device may generate a final voice response and a final video based on the embedding input vector, and may provide the video data, the final voice response, and the final video.
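The abstract says the text, audio, and image embeddings are combined into an embedding input vector, but it does not say how. The sketch below illustrates one possible reading, assuming each embedding is L2-normalized and the three are then concatenated. The function names `l2_normalize` and `combine_embeddings` and the normalization step are hypothetical and are not taken from the patent.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0] * len(vec)
    return [x / norm for x in vec]


def combine_embeddings(
    text_emb: Sequence[float],
    audio_emb: Sequence[float],
    image_emb: Sequence[float],
) -> List[float]:
    """Build one input vector from three per-modality embeddings.

    Assumption (not stated in the abstract): combination is plain
    concatenation after per-modality L2 normalization, so that no single
    modality dominates purely because of its scale.
    """
    combined: List[float] = []
    for emb in (text_emb, audio_emb, image_emb):
        combined.extend(l2_normalize(emb))
    return combined


if __name__ == "__main__":
    # Toy embeddings; real ones would come from the per-modality models.
    vector = combine_embeddings([3.0, 4.0], [0.0, 0.0], [2.0])
    print(len(vector), vector)
```

Concatenation keeps each modality's contribution in a fixed slice of the vector, which makes the result easy to inspect. Other fusion schemes, such as weighted sums or learned projections, would fit the abstract's wording equally well.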

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.