Image diffusion framework for text-guided video editing
US12334116B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 21, 2024 |
| Grant date | Jun 17, 2025 |
| Priority date | — |
| Expiry date | Nov 21, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/40
- WIPO field: Audio-visual technology
- WIPO sector: Electrical engineering
Abstract
The invention provides a method for adapting a text-to-image (T2I) diffusion model for video editing by applying spectral decomposition to the model's weights and shifting their spectrum in a controlled way. The adaptation keeps the singular vectors constant and selectively adjusts the singular values in response to a text prompt. A spectral shift regularizer constrains these adjustments, in particular restricting changes to the larger singular values so that the adapted model deviates minimally from the original model's structure. This allows efficient, prompt-driven video editing: specific elements are modified according to the prompt while the original video context is preserved. Because only selected spectral components are adjusted, the method reduces adaptation time and computational demands, making it suitable for real-time and resource-sensitive applications such as dynamic video editing for streaming services.
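The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names (`decompose`, `shifted_weight`, `spectral_shift_penalty`), the `strength` parameter, and the specific penalty form (squared shift weighted by relative singular-value size) are assumptions chosen only to show the idea of frozen singular vectors, adjustable singular values, and a regularizer that is stricter on larger singular values.

```python
import numpy as np

def decompose(weight):
    """Split a weight matrix into singular vectors (kept fixed) and singular values."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u, s, vt

def shifted_weight(u, s, vt, delta):
    """Rebuild the weight from fixed singular vectors and shifted singular values.

    `delta` is the only adjustable quantity; shifted values are clipped at zero
    so they remain valid singular values.
    """
    return (u * np.maximum(s + delta, 0.0)) @ vt

def spectral_shift_penalty(s, delta, strength=1.0):
    """Hypothetical regularizer: penalize shifts on larger singular values more.

    Each squared shift is weighted by the singular value's size relative to the
    largest one (assumes the matrix is non-zero).
    """
    scale = s / s.max()
    return strength * float(np.sum(scale * delta ** 2))

# Toy weight matrix standing in for one layer of a T2I model.
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))
u, s, vt = decompose(W)

# The same shift costs more on the largest singular value than on the smallest.
shift_largest = np.array([0.1, 0.0, 0.0, 0.0])
shift_smallest = np.array([0.0, 0.0, 0.0, 0.1])
print(spectral_shift_penalty(s, shift_largest) > spectral_shift_penalty(s, shift_smallest))
```

In a sketch like this, only the vector `delta` (one value per singular value) would be tuned per prompt, which is far fewer parameters than the full weight matrix and is consistent with the reduced adaptation cost the abstract claims.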
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.