Methods and apparatus to perform dense prediction using transformer blocks
US12380714B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 25, 2021 |
| Grant date | Aug 5, 2025 |
| Priority date | — |
| Expiry date | Mar 11, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06T2207/20221
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods, apparatus, systems and articles of manufacture disclosed herein perform dense prediction of an input image using transformers at an encoder stage and at a reassembly stage of an image processing system. A disclosed apparatus includes an encoder with an embedder to convert an input image to a plurality of tokens representing features extracted from the input image. The tokens are embedded with a learnable position embedding. The encoder also includes one or more transformers configured in a sequence of stages to relate the tokens to each other. The apparatus further includes a decoder that includes one or more reassemblers to assemble the tokens into feature representations, one or more fusion blocks to combine the feature representations to generate a final feature representation, and an output head to generate a dense prediction based on the final feature representation and based on an output task.
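The data flow described in the abstract (embedder, transformer stages, reassemblers, fusion blocks, output head) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation, and every function name, parameter, and default here is hypothetical. The sketch makes several simplifying assumptions: the image is a single-channel grid whose sides are multiples of the patch size, each token is a flattened patch, random values stand in for the learnable position embedding, each transformer stage is single-head self-attention with identity projections and a residual connection, reassembly keeps one resolution, fusion is an element-wise sum, and the output head averages features and upsamples by nearest neighbor.

```python
import math
import random


def embed(image, patch, pos_embedding):
    """Embedder: flatten each patch x patch tile into a token and add its position embedding."""
    grid_h, grid_w = len(image) // patch, len(image[0]) // patch
    tokens = []
    for gr in range(grid_h):
        for gc in range(grid_w):
            pixels = [image[gr * patch + i][gc * patch + j]
                      for i in range(patch) for j in range(patch)]
            pos = pos_embedding[len(tokens)]
            tokens.append([x + p for x, p in zip(pixels, pos)])
    return tokens


def transformer_stage(tokens):
    """Simplified stage: every token attends to every other token (identity projections)."""
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        mixed = [sum(w * tok[d] for w, tok in zip(weights, tokens)) / total
                 for d in range(dim)]
        out.append([x + m for x, m in zip(query, mixed)])  # residual connection
    return out


def reassemble(tokens, grid_w):
    """Reassembler: lay the token sequence back out as a 2D grid of feature vectors."""
    return [tokens[i:i + grid_w] for i in range(0, len(tokens), grid_w)]


def fuse(feature_maps):
    """Fusion block (assumed here to be an element-wise sum of same-shape maps)."""
    first = feature_maps[0]
    return [[[sum(m[r][c][d] for m in feature_maps)
              for d in range(len(first[r][c]))]
             for c in range(len(first[r]))]
            for r in range(len(first))]


def output_head(feature_map, patch):
    """Output head: one value per grid cell, upsampled to one value per pixel."""
    coarse = [[sum(cell) / len(cell) for cell in row] for row in feature_map]
    return [[coarse[r // patch][c // patch]
             for c in range(len(coarse[0]) * patch)]
            for r in range(len(coarse) * patch)]


def dense_predict(image, patch=2, stages=2, seed=0):
    rng = random.Random(seed)
    grid_h, grid_w = len(image) // patch, len(image[0]) // patch
    # Random values stand in for a position embedding that would be learned in training.
    pos_embedding = [[rng.uniform(-0.1, 0.1) for _ in range(patch * patch)]
                     for _ in range(grid_h * grid_w)]
    tokens = embed(image, patch, pos_embedding)
    feature_maps = []
    for _ in range(stages):
        tokens = transformer_stage(tokens)
        feature_maps.append(reassemble(tokens, grid_w))  # tap each stage's tokens
    return output_head(fuse(feature_maps), patch)


image = [[(r * 4 + c) / 16 for c in range(4)] for r in range(4)]
prediction = dense_predict(image)  # 4 x 4 grid, one predicted value per pixel
```

The result has the same height and width as the input, which is what makes the prediction dense. In this sketch the output head is the only task-specific part, so a different output task (for example, per-pixel class scores instead of a single value) would swap that function while leaving the encoder and reassembly steps unchanged.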
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.