Patent · US Active

Masked autoencoders for computer vision

US12266160B2 · kind B2 · utility

Cited by	0
References	5
Claims	20
Family size	0

Assignee

Meta Platforms, Inc. · US

Inventors

Kaiming He · Beijing, CN
Piotr Dollar · San Mateo, US
Ross Girshick · Seattle, US
Saining Xie · Sunnyvale, US
Xinlei Chen · Belmont, US
Yanghao Li · Sunnyvale, US

Key dates

Filing date	Jul 27, 2022
Grant date	Apr 1, 2025
Priority date	—
Expiry date	Jul 22, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06V10/774
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

In particular embodiments, a computing system may access a plurality of images for pre-training a first machine-learning model that includes an encoder and a decoder. Using each image, the system may pre-train the model by dividing the image into a set of patches, selecting a first subset of the patches to be visible and a second subset of the patches to be masked during the pre-training, processing, using the encoder, the first subset of patches to generate corresponding first latent representations, processing, using the decoder, the first latent representations corresponding to the first subset of patches and mask tokens corresponding to the second subset of patches to generate reconstructed patches corresponding to the second subset of patches, the reconstructed patches and the first subset of patches being used to generate a reconstructed image, and updating the model based on comparisons between the image and the reconstructed image.
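
The pre-training step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The single linear layers standing in for the encoder and decoder, the 75% mask ratio, the positional embedding, the mean-squared-error comparison, and all function and parameter names (`patchify`, `split_patches`, `forward`, `params`) are assumptions made for illustration. The final "updating the model" step is not implemented here; a real training loop would backpropagate the loss into the parameters.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into (N, patch*patch*C) flattened patches."""
    h, w, c = image.shape
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(gh * gw, patch * patch * c)

def unpatchify(patches, patch, h, w, c):
    """Inverse of patchify: rebuild the (H, W, C) image from flattened patches."""
    gh, gw = h // patch, w // patch
    x = patches.reshape(gh, gw, patch, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h, w, c)

def split_patches(num_patches, mask_ratio, rng):
    """Randomly choose visible and masked patch indices (ratio is an assumption)."""
    perm = rng.permutation(num_patches)
    num_masked = int(num_patches * mask_ratio)
    return np.sort(perm[num_masked:]), np.sort(perm[:num_masked])

def forward(image, patch, mask_ratio, params, rng):
    h, w, c = image.shape
    patches = patchify(image, patch)
    visible_idx, masked_idx = split_patches(len(patches), mask_ratio, rng)

    # Encoder processes only the visible subset (linear layer as a stand-in).
    latents = patches[visible_idx] @ params["enc"]

    # Decoder input: latents at visible positions, a shared mask token elsewhere.
    tokens = np.tile(params["mask_token"], (len(patches), 1))
    tokens[visible_idx] = latents
    tokens = tokens + params["pos"]  # assumed positional embedding
    decoded = tokens @ params["dec"]

    # Reconstructed image: original visible patches plus decoded masked patches.
    recon = patches.copy()
    recon[masked_idx] = decoded[masked_idx]
    recon_image = unpatchify(recon, patch, h, w, c)

    # Comparison between the image and the reconstructed image (MSE assumed).
    loss = float(np.mean((recon_image - image) ** 2))
    return recon_image, loss, visible_idx, masked_idx

# Usage with random data and randomly initialised (untrained) parameters.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
patch, latent_dim = 8, 16
num_patches, patch_dim = (32 // patch) ** 2, patch * patch * 3
params = {
    "enc": rng.normal(scale=0.02, size=(patch_dim, latent_dim)),
    "dec": rng.normal(scale=0.02, size=(latent_dim, patch_dim)),
    "mask_token": np.zeros(latent_dim),
    "pos": rng.normal(scale=0.02, size=(num_patches, latent_dim)),
}
recon_image, loss, visible_idx, masked_idx = forward(image, patch, 0.75, params, rng)
```

Because the visible patches are copied into the reconstruction unchanged, only the masked positions contribute to the loss in this sketch, even though the comparison is written over the whole image.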

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.