System and method for self-distilled vision transformer for domain generalization
US12288384B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 19, 2022 |
| Grant date | Apr 29, 2025 |
| Priority date | — |
| Expiry date | Jan 4, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V2201/03
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An apparatus and method for a machine learning engine for domain generalization which trains a vision transformer neural network using a training dataset including at least two domains for diagnosis of a medical condition. Image patches and class tokens are processed through a sequence of feature extraction transformer blocks to obtain a predicted class token. In parallel, intermediate class tokens are extracted as outputs of each of the feature extraction transformer blocks, where each transformer block is a sub-model. One sub-model is randomly sampled from the sub-models to obtain a sampled intermediate class token. The intermediate class token is used to make a sub-model prediction. The vision transformer neural network is optimized based on a difference between the predicted class token and the sub-model prediction. Inferencing is performed for a target medical image in a target domain that is different from the at least two domains.
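The training step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy residual block, the shared classification head, the temperature-softened softmax, the KL divergence as the "difference" measure, and the choice to sample only from blocks before the last one are all assumptions made for illustration, and every function name below is hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def linear(weights, vec):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def make_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights for the toy model."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def toy_block(weights, cls_token):
    """Stand-in for a transformer block (assumption): residual + tanh(linear)."""
    update = linear(weights, cls_token)
    return [c + math.tanh(u) for c, u in zip(cls_token, update)]


def self_distillation_step(blocks, head, cls_token, rng, temperature=3.0):
    """Return (sampled block index, distillation loss) for one input.

    Requires at least two blocks, because the sampled sub-model is drawn
    from the blocks before the final one (an assumption of this sketch).
    """
    # Run the full sequence of blocks, keeping every intermediate class token.
    intermediates = []
    token = cls_token
    for weights in blocks:
        token = toy_block(weights, token)
        intermediates.append(token)

    # Prediction of the full model, from the final class token.
    final_probs = softmax(linear(head, intermediates[-1]), temperature)

    # Randomly sample one sub-model and predict from its class token,
    # reusing the same classification head (assumption).
    idx = rng.randrange(len(blocks) - 1)
    sub_probs = softmax(linear(head, intermediates[idx]), temperature)

    # The "difference" to minimize, here taken to be KL(final || sub-model).
    loss = kl_divergence(final_probs, sub_probs)
    return idx, loss


if __name__ == "__main__":
    rng = random.Random(0)
    blocks = [make_matrix(4, 4, rng) for _ in range(6)]
    head = make_matrix(3, 4, rng)
    cls_token = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    idx, loss = self_distillation_step(blocks, head, cls_token, rng)
    print(f"sampled block {idx}, distillation loss {loss:.6f}")
```

In an actual training loop this loss would be backpropagated through the network, typically alongside a supervised classification loss on the final prediction; the abstract does not specify those details, so they are omitted here.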
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.