Patent · US Active

Adaptive cycle consistency multimodal image captioning

US11651522B2 · kind B2 · utility

Cited by: 5
References: 2
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 8, 2020
Grant date: May 16, 2023
Priority date
Expiry date: Sep 8, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N5/01
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In an approach to improving the image captioning performance of low-resource languages by leveraging multimodal inputs, one or more computer processors encode an image utilizing an image encoder, wherein the image is contained within a triplet comprising the image, one or more high-resource captions, and one or more low-resource captions. The one or more computer processors generate one or more high-resource captions utilizing the encoded image and the triplet inputted into a high-resource decoder. The one or more computer processors encode the one or more generated high-resource captions utilizing a high-resource encoder. The one or more computer processors add adaptive cycle consistency constraints on a set of calculated attention weights associated with the triplet. The one or more computer processors generate one or more low-resource captions by simultaneously inputting the encoded image, the encoded high-resource caption, and the triplet into a trained low-resource decoder.
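
The abstract does not specify how the cycle consistency constraint on the attention weights is computed. The sketch below is one possible reading, not the claimed method: it assumes three row-normalized attention matrices (low-resource tokens over image regions, low-resource tokens over high-resource tokens, and high-resource tokens over image regions) and penalizes the gap between the direct low-resource-to-image attention and the path routed through the high-resource caption. The function names, the mean-squared penalty, and the scalar `gate` standing in for the "adaptive" weighting are all hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def cycle_consistency_loss(low_to_image, low_to_high, high_to_image, gate=1.0):
    """Hypothetical penalty: direct attention vs. attention composed via the
    high-resource caption.

    low_to_image:  [low tokens x image regions] attention weights
    low_to_high:   [low tokens x high tokens] attention weights
    high_to_image: [high tokens x image regions] attention weights
    gate:          assumed scalar weight; how it would be adapted is not
                   stated in the abstract
    """
    # Route each low-resource token to image regions through the high-resource tokens.
    composed = matmul(low_to_high, high_to_image)
    # Mean squared difference between the direct and the composed attention.
    squared = [
        (direct - via) ** 2
        for direct_row, via_row in zip(low_to_image, composed)
        for direct, via in zip(direct_row, via_row)
    ]
    return gate * sum(squared) / len(squared)


# Toy example: two high-resource tokens, two image regions, one low-resource token.
high_to_image = [[1.0, 0.0], [0.0, 1.0]]
low_to_high = [[0.5, 0.5]]

consistent = cycle_consistency_loss([[0.5, 0.5]], low_to_high, high_to_image)
inconsistent = cycle_consistency_loss([[1.0, 0.0]], low_to_high, high_to_image)
print(consistent, inconsistent)
```

In this toy case the penalty is zero when the direct attention matches the composed path and grows as the two diverge; in a training setup it would presumably be added to the captioning loss, which is again an assumption rather than something the record states.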

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.