Visual dialog method and apparatus, method and apparatus for training visual dialog model, electronic device, and computer-readable storage medium
US12361036B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 17, 2022 |
| Grant date | Jul 15, 2025 |
| Priority date | — |
| Expiry date | Nov 16, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/583
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Disclosed in this application is a visual content dialog method performed by an electronic device. The method includes: acquiring an image feature of an input image and state vectors corresponding to the first n rounds of historical question-answering dialog, n being a positive integer; acquiring a question feature of a current round of questioning related to the input image; performing multimodal encoding on the image feature of the input image, the state vectors corresponding to the first n rounds of historical question-answering dialog, and the question feature of the current round of questioning, to obtain a state vector corresponding to the current round of questioning; and performing multimodal decoding on the state vector corresponding to the current round of questioning and the image feature of the input image, to obtain an actual output answer corresponding to the current round of questioning.
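The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how encoding or decoding is performed, so the dot-product attention, the tanh fusion, the representation of the image feature as a list of region vectors, and the ranking of a fixed set of candidate answers are all assumptions made here for illustration. Every function and variable name is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, keys: List[Vector]) -> Vector:
    """Dot-product attention: average of `keys`, weighted by similarity to `query`."""
    weights = softmax([dot(query, k) for k in keys])
    return [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(len(query))]


def multimodal_encode(image_feature: List[Vector],
                      history_states: List[Vector],
                      question_feature: Vector) -> Vector:
    """Fuse image, dialog history, and question into the current-round state vector.

    Assumption: attention plus tanh stands in for the unspecified encoder.
    """
    history_ctx = attend(question_feature, history_states)
    image_ctx = attend(question_feature, image_feature)
    return [math.tanh(q + h + v)
            for q, h, v in zip(question_feature, history_ctx, image_ctx)]


def multimodal_decode(state: Vector,
                      image_feature: List[Vector],
                      candidates: Dict[str, Vector]) -> str:
    """Combine the state vector with the image feature and pick an answer.

    Assumption: answers are ranked from a fixed candidate set, not generated.
    """
    image_ctx = attend(state, image_feature)
    fused = [s + v for s, v in zip(state, image_ctx)]
    return max(candidates, key=lambda name: dot(fused, candidates[name]))


def dialog_round(image_feature: List[Vector],
                 history_states: List[Vector],
                 question_feature: Vector,
                 candidates: Dict[str, Vector]) -> Tuple[str, List[Vector]]:
    """Run one round and return the answer plus the extended list of state vectors."""
    state = multimodal_encode(image_feature, history_states, question_feature)
    answer = multimodal_decode(state, image_feature, candidates)
    return answer, history_states + [state]


# Toy inputs: two image regions, one prior round (n = 1), one question.
regions = [[1.0, 0.0], [0.0, 1.0]]
history = [[0.5, 0.5]]
question = [2.0, 0.0]
answers = {"yes": [1.0, 0.0], "no": [0.0, 1.0]}

answer, history = dialog_round(regions, history, question, answers)
print(answer, len(history))
```

In this sketch each round appends its state vector to the history, so the next round can condition on n + 1 state vectors instead of re-encoding the earlier questions and answers.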
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.