Patent · US Active

Utilizing cross-attention guidance to preserve content in diffusion-based image modifications

US12333636B2 · kind B2 · utility

Cited by: 0
References: 4
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 3, 2023
Grant date: Jun 17, 2025
Priority date
Expiry date: Aug 12, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T2207/20182
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present disclosure relates to systems, non-transitory computer-readable media, and methods for utilizing machine learning models to generate modified digital images. In particular, in some embodiments, the disclosed systems generate image editing directions between textual identifiers of two visual features utilizing a language prediction machine learning model and a text encoder. In some embodiments, the disclosed systems generate an inversion of a digital image utilizing a regularized inversion model to guide forward diffusion of the digital image. In some embodiments, the disclosed systems utilize cross-attention guidance to preserve structural details of a source digital image when generating a modified digital image with a diffusion neural network.
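
To illustrate the first idea in the abstract, the sketch below shows one plausible way an image editing direction between two textual identifiers could be computed: encode several phrases for each identifier and take the difference of the mean embeddings. This is an assumption made for illustration, not the claimed method. The names `toy_text_encoder`, `mean_embedding`, and `edit_direction` are hypothetical. The hard-coded phrase lists stand in for the output of a language prediction model, and the hash-based encoder stands in for a real text encoder.

```python
import hashlib
import math


def toy_text_encoder(phrase, dim=8):
    """Hypothetical stand-in for a text encoder.

    Maps a phrase to a deterministic unit-length pseudo-embedding derived
    from its SHA-256 digest. A real system would use a learned encoder.
    """
    digest = hashlib.sha256(phrase.encode("utf-8")).digest()
    vec = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def mean_embedding(phrases, encoder):
    """Average the embeddings of a list of phrases, component by component."""
    vecs = [encoder(p) for p in phrases]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def edit_direction(source_phrases, target_phrases, encoder=toy_text_encoder):
    """Assumed formulation: target mean embedding minus source mean embedding."""
    src = mean_embedding(source_phrases, encoder)
    tgt = mean_embedding(target_phrases, encoder)
    return [t - s for s, t in zip(src, tgt)]


# Assumption: in practice these phrases would be generated by a language
# prediction model from the two textual identifiers ("cat" and "dog").
source_phrases = ["a photo of a cat", "a cat sitting on a sofa"]
target_phrases = ["a photo of a dog", "a dog sitting on a sofa"]

direction = edit_direction(source_phrases, target_phrases)
print(len(direction))
```

Under this formulation the direction from a set of phrases to itself is the zero vector, and swapping source and target negates the direction, which is the behavior one would expect from a difference of means.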

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.