Patent · US Active

Regional-to-local attention for vision transformers

US11915474B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 31, 2022
Grant date: Feb 27, 2024
Priority date
Expiry date: May 31, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V10/7715
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Techniques and apparatus for analyzing visual content using a visual transformer are described. An example technique includes generating a first set of tokens based on a visual content item. Each token in the first set of tokens is associated with a regional feature from a different region of a plurality of regions of the visual content item. A second set of tokens is generated based on the visual content item. Each token in the second set of tokens is associated with a local feature from one of the plurality of regions of the visual content item. At least one feature map is generated for the visual content item, based on analyzing the first set of tokens and the second set of tokens separately using a hierarchical vision transformer. At least one vision task is performed based on the at least one feature map.
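The two token sets described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function name make_tokens, the region and patch sizes, the grayscale list-of-rows image format, and the use of plain averaging and flattening in place of learned embedding layers are all assumptions made for illustration. The sketch covers token generation only and omits the hierarchical transformer and the downstream vision task.

```python
def make_tokens(image, region=4, patch=2):
    """Split a 2-D grayscale image (list of rows) into regional and local tokens.

    Hypothetical helper: sizes and pooling are illustrative assumptions,
    not values taken from the patent.
    """
    h, w = len(image), len(image[0])
    if h % region or w % region or region % patch:
        raise ValueError("image and region sizes must divide evenly")

    regional, local = [], []
    for top in range(0, h, region):
        for left in range(0, w, region):
            # Regional token: one summary value for the whole region
            # (a mean here, standing in for a learned regional embedding).
            pixels = [image[y][x]
                      for y in range(top, top + region)
                      for x in range(left, left + region)]
            regional.append(sum(pixels) / len(pixels))

            # Local tokens: one flattened patch per sub-window of this region.
            patches = []
            for py in range(top, top + region, patch):
                for px in range(left, left + region, patch):
                    patches.append([image[y][x]
                                    for y in range(py, py + patch)
                                    for x in range(px, px + patch)])
            local.append(patches)
    return regional, local


# Usage: an 8x8 image yields 4 regions, each with 4 local patch tokens.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
regional, local = make_tokens(image)
```

Each entry of regional corresponds to a different region, and local[i] holds the local tokens drawn from that same region. This pairing is what would let a model process the two sets separately and then relate each regional token to the local tokens of its own region.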

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.