Patent · US Active

Align-to-ground, weakly supervised phrase grounding guided by image-caption alignment

US11238631B2 · kind B2 · utility

Cited by	1
References	0
Claims	19
Family size	0

Assignee

SRI International · US

Inventors

Karan Sikka · Lawrenceville, US
Ajay Divakaran · Woburn, US
Samyak Datta · Atlanta, US

Key dates

Filing date	Apr 22, 2020
Grant date	Feb 1, 2022
Priority date	—
Expiry date	Apr 24, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G06V10/764
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method, apparatus and system for visual grounding of a caption in an image include projecting at least two parsed phrases of the caption into a trained semantic embedding space, projecting extracted region proposals of the image into the trained semantic embedding space, aligning the extracted region proposals and the at least two parsed phrases, aggregating the aligned region proposals and the at least two parsed phrases to determine a caption-conditioned image representation, and projecting the caption-conditioned image representation and the caption into a semantic embedding space to align the caption-conditioned image representation and the caption. The method, apparatus and system can further include a parser for parsing the caption into the at least two parsed phrases and a region proposal module for extracting the region proposals from the image.
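
The steps named in the abstract (project, align, aggregate, score) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how alignment or aggregation is computed, so the argmax matching and mean pooling below are assumptions, as are all function names, the hand-set projection matrix IDENTITY, and the toy two-dimensional features. In a trained system the projection weights would be learned.

```python
import math
from typing import List, Sequence

def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def project(v: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Linear map into the shared embedding space, followed by L2 normalisation.
    `weights` stands in for a trained projection (assumption: a plain matrix)."""
    return l2_normalize([sum(w * x for w, x in zip(row, v)) for row in weights])

def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def align(phrases: List[List[float]], regions: List[List[float]]) -> List[int]:
    """For each phrase, return the index of the most similar region.
    Assumption: hard argmax matching on dot-product similarity."""
    return [max(range(len(regions)), key=lambda j: dot(p, regions[j]))
            for p in phrases]

def caption_conditioned_image(phrases: List[List[float]],
                              regions: List[List[float]]) -> List[float]:
    """Aggregate the matched regions into a single vector.
    Assumption: mean pooling over the matched regions."""
    matched = [regions[j] for j in align(phrases, regions)]
    dim = len(matched[0])
    mean = [sum(r[k] for r in matched) / len(matched) for k in range(dim)]
    return l2_normalize(mean)

def caption_image_score(caption: Sequence[float],
                        phrases: List[List[float]],
                        regions: List[List[float]]) -> float:
    """Cosine similarity between the caption embedding and the
    caption-conditioned image representation."""
    return dot(l2_normalize(caption), caption_conditioned_image(phrases, regions))

# Toy data: identity projections and 2-D features, purely illustrative.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
regions = [project(r, IDENTITY) for r in ([1.0, 0.0], [0.0, 1.0], [0.7, 0.7])]
phrases = [project(p, IDENTITY) for p in ([0.9, 0.1], [0.1, 0.9])]
caption = project([1.0, 1.0], IDENTITY)

print(align(phrases, regions))
print(caption_image_score(caption, phrases, regions))
```

Only caption-level image-caption pairs are needed to compute the final score, which is what makes the supervision weak: the phrase-to-region matches returned by align are a by-product and are never labelled directly.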

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.