Patent · US Active

Systems and methods for attention-based configurable convolutional neural networks (ABC-CNN) for visual question answering

US9965705B2 · kind B2 · utility

Cited by: 28
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 16, 2016
Grant date: May 8, 2018
Priority date
Expiry date: Aug 10, 2036

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/044
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Described herein are systems and methods for generating and using attention-based deep learning architectures for the visual question answering (VQA) task, to automatically generate answers to image-related questions (about still or video images). To generate correct answers, it is important for a model's attention to focus on the regions of an image that are relevant to the question, because different questions may ask about the attributes of different image regions. In embodiments, such question-guided attention is learned with an attention-based configurable convolutional neural network (ABC-CNN). Embodiments of the ABC-CNN models determine the attention maps by convolving the image feature map with configurable convolutional kernels determined by the question's semantics. In embodiments, the question-guided attention maps focus on the question-related regions and filter out noise in the unrelated regions.
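
The mechanism described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. It assumes (none of this is stated in the abstract) that the kernel is produced by a linear projection of a question embedding followed by a sigmoid, that the convolution uses zero padding with an odd kernel size, and that the attention map is normalized with a softmax over spatial positions. The names `configure_kernel`, `attention_map`, `apply_attention`, and `proj` are hypothetical.

```python
import math
import random


def configure_kernel(question_vec, proj, channels, size):
    """Map a question embedding to a (channels x size x size) kernel.

    Assumption: linear projection + sigmoid. The abstract only says the
    kernel is determined by the question's semantics.
    """
    flat = [1.0 / (1.0 + math.exp(-sum(w * q for w, q in zip(row, question_vec))))
            for row in proj]
    return [[[flat[(c * size + i) * size + j] for j in range(size)]
             for i in range(size)] for c in range(channels)]


def attention_map(features, kernel):
    """Slide the kernel over a (C x H x W) feature map with zero padding
    (cross-correlation, as in most deep learning libraries; odd kernel size
    assumed), then softmax over all H x W positions."""
    channels, height, width = len(features), len(features[0]), len(features[0][0])
    size = len(kernel[0])
    pad = size // 2
    scores = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for c in range(channels):
                for i in range(size):
                    for j in range(size):
                        yy, xx = y + i - pad, x + j - pad
                        if 0 <= yy < height and 0 <= xx < width:
                            total += features[c][yy][xx] * kernel[c][i][j]
            scores[y][x] = total
    # Numerically stable softmax over spatial positions.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    norm = sum(sum(row) for row in exps)
    return [[e / norm for e in row] for row in exps]


def apply_attention(features, attn):
    """Weight every channel of the feature map by the attention map."""
    return [[[v * a for v, a in zip(f_row, a_row)]
             for f_row, a_row in zip(chan, attn)] for chan in features]


# Toy inputs: a 2-channel 4x4 feature map with one active location and a
# 3-dimensional question embedding. All values are made up for illustration.
rng = random.Random(0)
features = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
features[0][1][2] = 5.0
question = [1.0, -1.0, 0.5]
proj = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2 * 3 * 3)]

kernel = configure_kernel(question, proj, channels=2, size=3)
attn = attention_map(features, kernel)
weighted = apply_attention(features, attn)
```

In this sketch, changing `question` changes `kernel`, which in turn shifts where `attn` places its mass; that dependence is the sense in which the attention is question-guided. A real system would learn `proj` jointly with the image and question encoders rather than sampling it at random.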

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.