Patent · US Active

Multilingual image question answering

US10909329B2 · kind B2 · utility

5Cited by
4References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateApr 25, 2016
Grant dateFeb 2, 2021
Priority date
Expiry dateSep 11, 2038

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/08
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Embodiments of a multimodal question answering (mQA) system are presented to answer a question about the content of an image. In embodiments, the model comprises four components: a Long Short-Term Memory (LSTM) component to extract the question representation; a Convolutional Neural Network (CNN) component to extract the visual representation; an LSTM component for storing the linguistic context in an answer, and a fusing component to combine the information from the first three components and generate the answer. A Freestyle Multilingual Image Question Answering (FM-IQA) dataset was constructed to train and evaluate embodiments of the mQA model. The quality of the generated answers of the mQA model on this dataset is evaluated by human judges through a Turing Test.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.