Patent · US Active

Masked autoencoders for computer vision

US12266160B2 · kind B2 · utility

Cited by: 0
References: 5
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 27, 2022
Grant date: Apr 1, 2025
Priority date:
Expiry date: Jul 22, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V10/774
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In particular embodiments, a computing system may access a plurality of images for pre-training a first machine-learning model that includes an encoder and a decoder. Using each image, the system may pre-train the model by dividing the image into a set of patches, selecting a first subset of the patches to be visible and a second subset of the patches to be masked during the pre-training, processing, using the encoder, the first subset of patches to generate corresponding first latent representations, processing, using the decoder, the first latent representations corresponding to the first subset of patches and mask tokens corresponding to the second subset of patches to generate reconstructed patches corresponding to the second subset of patches, the reconstructed patches and the first subset of patches being used to generate a reconstructed image, and updating the model based on comparisons between the image and the reconstructed image.
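
The data flow described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the function names, the 2x2 patch size, the 0.75 mask ratio, the identity "encoder", and the mean-filling "decoder" are all assumptions chosen only to keep the example self-contained. A real system would use learned networks and would apply a gradient update from the loss, which this sketch only computes.

```python
import random

MASK_TOKEN = object()  # hypothetical stand-in for a learned mask-token vector


def patchify(image, patch):
    """Split an H x W image (list of rows) into flattened patch x patch tiles, row-major."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image sides must be multiples of the patch size")
    return [
        [image[top + r][left + c] for r in range(patch) for c in range(patch)]
        for top in range(0, h, patch)
        for left in range(0, w, patch)
    ]


def split_patches(num_patches, mask_ratio, rng):
    """Randomly pick masked patch indices; always keep at least one patch visible."""
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = min(int(num_patches * mask_ratio), num_patches - 1)
    return sorted(order[num_masked:]), sorted(order[:num_masked])


def encode(visible_patches):
    """Placeholder encoder (assumption): identity map from patches to latents."""
    return [list(p) for p in visible_patches]


def decode(sequence, patch_len):
    """Placeholder decoder (assumption): fill each mask-token position with the
    mean of all visible latent values. Returns {patch index: predicted patch}."""
    values = [v for item in sequence if item is not MASK_TOKEN for v in item]
    mean = sum(values) / len(values)
    return {
        i: [mean] * patch_len
        for i, item in enumerate(sequence)
        if item is MASK_TOKEN
    }


def pretrain_step(image, patch=2, mask_ratio=0.75, rng=None):
    """One forward pass: mask, encode visible patches, decode with mask tokens,
    assemble the reconstructed image, and compare it with the original."""
    rng = rng or random.Random()
    patches = patchify(image, patch)
    visible, masked = split_patches(len(patches), mask_ratio, rng)

    # The encoder sees only the visible subset.
    latents = dict(zip(visible, encode([patches[i] for i in visible])))

    # The decoder sees latents at visible positions and mask tokens elsewhere.
    sequence = [latents.get(i, MASK_TOKEN) for i in range(len(patches))]
    predicted = decode(sequence, patch * patch)

    # Reconstructed image = original visible patches + predicted masked patches.
    reconstructed = [
        patches[i] if i in latents else predicted[i] for i in range(len(patches))
    ]

    # Mean squared error between the image and its reconstruction.
    errors = [
        (a - b) ** 2
        for p, q in zip(patches, reconstructed)
        for a, b in zip(p, q)
    ]
    return sum(errors) / len(errors), reconstructed, visible, masked
```

Because the visible patches are copied into the reconstruction unchanged, the error in this sketch comes entirely from the masked positions.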

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.