Patent · US Active

System and method for self-distilled vision transformer for domain generalization

US12288384B2 · kind B2 · utility

Cited by: 0
References: 6
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 19, 2022
Grant date: Apr 29, 2025
Priority date
Expiry date: Jan 4, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V2201/03
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An apparatus and method for a machine learning engine for domain generalization which trains a vision transformer neural network using a training dataset including at least two domains for diagnosis of a medical condition. Image patches and class tokens are processed through a sequence of feature extraction transformer blocks to obtain a predicted class token. In parallel, intermediate class tokens are extracted as outputs of each of the feature extraction transformer blocks, where each transformer block is a sub-model. One sub-model is randomly sampled from the sub-models to obtain a sampled intermediate class token. The intermediate class token is used to make a sub-model prediction. The vision transformer neural network is optimized based on a difference between the predicted class token and the sub-model prediction. Inferencing is performed for a target medical image in a target domain that is different from the at least two domains.
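
The training objective described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract only states that optimization is based on a difference between the final prediction and a randomly sampled sub-model prediction. The shared linear classifier head, the cross-entropy term, the KL divergence as the "difference", and the names `temperature` and `lam` are all assumptions made for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature (assumed)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def linear_head(token, weights, bias):
    """Hypothetical shared classifier: one weight row per class."""
    return [sum(w * x for w, x in zip(row, token)) + b
            for row, b in zip(weights, bias)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), used here as the assumed 'difference' measure."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(class_tokens, weights, bias, label,
                           temperature=3.0, lam=0.5, rng=random):
    """class_tokens holds one class-token vector per transformer block;
    the last entry is the final (predicted) class token."""
    final_logits = linear_head(class_tokens[-1], weights, bias)

    # Randomly sample one intermediate block (sub-model), excluding the last.
    idx = rng.randrange(len(class_tokens) - 1)
    sub_logits = linear_head(class_tokens[idx], weights, bias)

    # Assumed supervised term on the final prediction.
    ce = -math.log(softmax(final_logits)[label] + 1e-12)

    # Assumed distillation term: final prediction guides the sub-model.
    teacher = softmax(final_logits, temperature)
    student = softmax(sub_logits, temperature)
    distill = kl_divergence(teacher, student)

    return ce + lam * distill, idx
```

In this sketch the final block acts as the teacher and the sampled block as the student, so when the sampled intermediate token already yields the same prediction as the final token, the distillation term is zero and only the supervised term remains.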

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.