Patent · US Active

Systems and methods for attention-based configurable convolutional neural networks (ABC-CNN) for visual question answering

US9965705B2 · kind B2 · utility

Cited by	28
References	0
Claims	20
Family size	0

Assignee

BAIDU USA LLC · US

Inventors

Kan-Yueh Chen · Daxi District, TW
Jiang Wang · Shanghai, CN
Wei Xu · Santa Clara, US

Key dates

Filing date	Jun 16, 2016
Grant date	May 8, 2018
Priority date	—
Expiry date	Aug 10, 2036

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/044
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Described herein are systems and methods for generating and using attention-based deep learning architectures for the visual question answering (VQA) task, to automatically generate answers to questions about images (still or video). To generate correct answers, it is important for a model's attention to focus on the regions of an image that are relevant to the question, because different questions may ask about the attributes of different image regions. In embodiments, such question-guided attention is learned with a configurable convolutional neural network (ABC-CNN). Embodiments of the ABC-CNN models determine the attention maps by convolving the image feature map with configurable convolutional kernels determined by the question's semantics. In embodiments, the question-guided attention maps focus on the question-related regions and filter out noise in the unrelated regions.
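
The core step in the abstract (a kernel configured by the question, convolved with the image feature map to produce an attention map) can be illustrated with a minimal sketch. This is not the patented implementation. The function names `configure_kernel`, `attention_map`, and `apply_attention`, the projection matrix `proj`, the sigmoid on the kernel, the zero padding, and the softmax normalization over spatial positions are all assumptions made for illustration; the abstract itself specifies none of them.

```python
import math


def configure_kernel(question_vec, proj, k, channels):
    """Map a question embedding to a k x k x channels kernel.

    Assumption: a linear projection followed by a sigmoid; `proj` is a
    hypothetical weight matrix with k*k*channels rows.
    """
    flat = [
        1.0 / (1.0 + math.exp(-sum(w * q for w, q in zip(row, question_vec))))
        for row in proj
    ]
    assert len(flat) == k * k * channels
    # Reshape the flat vector into kernel[dy][dx][channel].
    return [
        [flat[(dy * k + dx) * channels:(dy * k + dx + 1) * channels]
         for dx in range(k)]
        for dy in range(k)
    ]


def attention_map(features, kernel):
    """Convolve features[y][x][c] with the kernel (stride 1, zero padding),
    then normalize the scores with a softmax over all spatial positions."""
    h, w = len(features), len(features[0])
    k = len(kernel)
    r = k // 2
    scores = []
    for y in range(h):
        for x in range(w):
            s = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:
                        s += sum(f * g for f, g in
                                 zip(features[yy][xx], kernel[dy][dx]))
            scores.append(s)
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [[exps[y * w + x] / z for x in range(w)] for y in range(h)]


def apply_attention(features, attn):
    """Scale every feature vector by the attention weight at its position."""
    return [
        [[a * f for f in cell] for cell, a in zip(frow, arow)]
        for frow, arow in zip(features, attn)
    ]


# Toy usage: a 2 x 2 feature map with 2 channels and a 1 x 1 kernel.
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[0.0, 0.0], [0.0, 3.0]]]
kernel = configure_kernel([1.0], [[-5.0], [5.0]], k=1, channels=2)
attn = attention_map(features, kernel)
weighted = apply_attention(features, attn)
```

In this toy setup the question-derived kernel responds mainly to the second channel, so the largest attention weight falls on the position whose second-channel activation is strongest, and positions with weak responses are suppressed in `weighted`.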

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.