Patent · US Active

Multi-granularity alignment for visual question answering

US12210835B2 · kind B2 · utility

0Cited by
7References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateSep 16, 2022
Grant dateJan 28, 2025
Priority date
Expiry dateJun 16, 2043

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06V10/82
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

In one embodiment, a method includes accessing an image and a natural-language question regarding the image and extracting, from the image, a first set of image features at a first level of granularity and a second set of image features at a second level of granularity. The method further includes extracting, from the question, a first set of text features at the first level of granularity and a second set of text features at the second level of granularity; generating a first output representing an alignment between the first set of image features and the first set of text features; generating a second output representing an alignment between the second set of image features and the second set of text features; and determining an answer to the question based on the first output and the second output.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.