Patent · US Active

System and method for training a model using localized textual supervision

US11663294B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 18, 2021
Grant date: May 30, 2023
Priority date
Expiry date: Dec 8, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/153
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods for training a model are described herein. In one example, a system for training the model includes a processor and a memory that is in communication with the processor and stores a training module. The training module has instructions that cause the processor to determine a contrastive loss using a self-supervised contrastive loss function and to adjust, based on the contrastive loss, the model weights of a visual backbone that generated feature maps and/or a textual backbone that generated feature vectors. The training module also has instructions that cause the processor to determine a localized loss using a supervised loss function that compares an image-caption attention map with visual identifiers and to adjust, based on the localized loss, the model weights of the visual backbone and/or the textual backbone.
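
The two losses named in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of either loss, so everything below is an assumption made for illustration: the contrastive loss is taken to be a symmetric InfoNCE loss over matched image and caption embeddings, the image-caption attention map is taken to be a softmax over spatial locations, the visual identifiers are taken to be a binary region mask, and the localized loss is taken to be a cross-entropy between that mask and the attention map. All function names are hypothetical and are not drawn from the patent.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(image_vecs, text_vecs, temperature=0.1):
    """Assumed form: symmetric InfoNCE over a batch in which
    image_vecs[k] and text_vecs[k] are a matched pair."""
    n = len(image_vecs)
    logits = [[dot(i, t) / temperature for t in text_vecs] for i in image_vecs]
    loss = 0.0
    for k in range(n):
        row = softmax(logits[k])                          # image k against all captions
        col = softmax([logits[r][k] for r in range(n)])   # caption k against all images
        loss -= 0.5 * (math.log(row[k]) + math.log(col[k]))
    return loss / n

def attention_map(feature_map, caption_vec):
    """Assumed form: softmax over spatial locations of the similarity
    between each location's feature vector and the caption vector."""
    return softmax([dot(f, caption_vec) for f in feature_map])

def localized_loss(attention, mask, eps=1e-8):
    """Assumed form: cross-entropy between a normalized binary region
    mask (standing in for the visual identifiers) and the attention map."""
    total = sum(mask)
    target = [m / total for m in mask]
    return -sum(t * math.log(a + eps) for t, a in zip(target, attention))

# Toy usage: two image/caption pairs and a three-location feature map.
images = [[1.0, 0.0], [0.0, 1.0]]
captions = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
attn = attention_map(features, [2.0, 0.0])
print(contrastive_loss(images, captions))
print(localized_loss(attn, [1, 0, 0]))
```

The sketch computes loss values only. In a training loop, each scalar would be differentiated by an automatic-differentiation framework to update the weights of the visual and/or textual backbone, as the abstract describes.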

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.