Method and apparatus for visual question answering, computer device and medium
US11854283B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 5, 2021 |
| Grant date | Dec 26, 2023 |
| Priority date | — |
| Expiry date | May 20, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06T2207/30176
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present disclosure provides a method for visual question answering, which relates to the fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device and a medium.
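The data flow recited in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every function name, the `TextRegion` type, the toy image dictionary, and all numeric computations are hypothetical placeholders chosen only to show how the steps feed into one another. A real system would presumably use a text detector, a text recognizer, and learned encoders where the sketch uses simple arithmetic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


@dataclass
class TextRegion:
    visual: Vector                                # visual information
    position: Tuple[float, float, float, float]   # box (x1, y1, x2, y2)
    semantic: Vector = field(default_factory=list)
    attribute: Vector = field(default_factory=list)


def detect_text_regions(image: Dict) -> List[TextRegion]:
    """Detect visual and position information of each text region.
    Placeholder: reads regions already stored in a toy image dictionary."""
    return [TextRegion(list(r["visual"]), tuple(r["box"]))
            for r in image["regions"]]


def add_semantic_and_attribute(region: TextRegion) -> None:
    """Determine semantic and attribute information from visual + position.
    Placeholder arithmetic stands in for recognition/attribute models."""
    x1, y1, x2, y2 = region.position
    region.semantic = [sum(region.visual) / len(region.visual)]
    region.attribute = [(x2 - x1) * (y2 - y1)]    # assumed attribute: box area


def global_feature(regions: Sequence[TextRegion]) -> Vector:
    """Fuse the four kinds of information per region, then mean-pool."""
    fused = [r.visual + list(r.position) + r.semantic + r.attribute
             for r in regions]
    return [sum(column) / len(fused) for column in zip(*fused)]


def question_feature(text: str, dim: int) -> Vector:
    """Placeholder encoder: bag of words bucketed by character-code sum."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(map(ord, token)) % dim] += 1.0
    return vec


def predict_answer(global_feat: Vector, question_feat: Vector,
                   candidates: Sequence[str]) -> str:
    """Combine both features and pick the best-scoring candidate answer."""
    joint = [g * q for g, q in zip(global_feat, question_feat)]

    def score(candidate: str) -> float:
        encoded = question_feature(candidate, len(joint))
        return sum(j * c for j, c in zip(joint, encoded))

    return max(candidates, key=score)


# Toy inputs (assumed data, for illustration only).
image = {"regions": [
    {"visual": [0.2, 0.8], "box": (0.1, 0.1, 0.5, 0.3)},
    {"visual": [0.6, 0.4], "box": (0.2, 0.6, 0.9, 0.8)},
]}
question = "What does the sign say?"
candidates = ["stop", "exit"]

regions = detect_text_regions(image)
for region in regions:
    add_semantic_and_attribute(region)
g_feat = global_feature(regions)
q_feat = question_feature(question, len(g_feat))
answer = predict_answer(g_feat, q_feat, candidates)
```

The sketch keeps the global feature and the question feature separate until the final step, mirroring the order in the abstract; how the two are actually fused and decoded is not specified in this record.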
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.