Aligning symbols and objects using co-attention for understanding visual content
US11210572B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Dec 17, 2019 |
| Grant date | Dec 28, 2021 |
| Priority date | — |
| Expiry date | Feb 17, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V2201/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, apparatus and system for understanding visual content includes determining at least one region proposal for an image, attending at least one symbol of the proposed image region, attending a portion of the proposed image region using information regarding the attended symbol, extracting appearance features of the attended portion of the proposed image region, fusing the appearance features of the attended image region and features of the attended symbol, projecting the fused features into a semantic embedding space having been trained using fused attended appearance features and attended symbol features of images having known descriptive messages, computing a similarity measure between the projected, fused features and fused attended appearance features and attended symbol features embedded in the semantic embedding space having at least one associated descriptive message and predicting a descriptive message for an image associated with the projected, fused features.
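The pipeline in the abstract (attend a symbol, attend an image portion using that symbol, fuse, project, compare, predict) can be illustrated with a minimal pure-Python sketch. This is not the patented implementation. The dot-product attention, the mean-of-portions query used to attend symbols, fusion by concatenation, the linear projection, cosine similarity, and every function name and feature value here are assumptions chosen for illustration. In the claimed method the embedding space is trained; here the projection is a fixed identity matrix.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean(items):
    dim = len(items[0])
    return [sum(item[i] for item in items) / len(items) for i in range(dim)]

def attend(query, items):
    # Assumed dot-product attention: weight each item by its similarity
    # to the query and return the weighted sum of the items.
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    return [sum(w * item[i] for w, item in zip(weights, items)) for i in range(dim)]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def embed(symbol_feats, portion_feats, projection):
    # 1. Attend symbols (assumption: the query is the mean portion feature).
    attended_symbol = attend(mean(portion_feats), symbol_feats)
    # 2. Attend image portions using the attended symbol as the query.
    attended_appearance = attend(attended_symbol, portion_feats)
    # 3. Fuse by concatenation (assumption), then project linearly.
    fused = attended_appearance + attended_symbol
    return [dot(row, fused) for row in projection]

def predict_message(query_embedding, gallery):
    # Return the message of the most similar embedded example.
    return max(gallery, key=lambda entry: cosine(query_embedding, entry[0]))[1]

# Hypothetical toy data: 2-D symbol and portion features per region proposal.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
gallery = [
    (embed([[1, 0], [0.9, 0.1]], [[1, 0], [0.8, 0.2]], identity), "message A"),
    (embed([[0, 1], [0.1, 0.9]], [[0, 1], [0.2, 0.8]], identity), "message B"),
]
query = embed([[1, 0], [0.8, 0.2]], [[0.9, 0.1], [1, 0]], identity)
predicted = predict_message(query, gallery)
print(predicted)
```

The gallery stands in for the embedded training examples that have known descriptive messages. Prediction is a nearest-neighbour lookup by cosine similarity, which is one simple choice of similarity measure among many.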
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.