Patent · US Active

Method and apparatus for training visual language pre-training model, and device and medium

US12142036B2 · kind B2 · utility

Cited by: 2
References: 0
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 6, 2023
Grant date: Nov 12, 2024
Priority date
Expiry date: Dec 6, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V20/70
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present application provides a method and apparatus for training a visual language pre-training model, and a device and a medium. The method includes: acquiring pairing groups respectively corresponding to N images, where N is an integer greater than 1; and training a visual language pre-training model according to the pairing groups respectively corresponding to the N images. The pairing group of a first image, which is any one of the N images, includes a first pairing group composed of the first image and description text of the first image, and a second pairing group composed of a local image of the first image and description text of the local image.
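
The pairing-group structure described in the abstract can be sketched as a small data model. This is an illustrative sketch only, not the patented implementation: the names `ImageTextPair`, `PairingGroup`, `build_training_set` and `flatten` are hypothetical, and representing the local image as a bounding box within the full image is an assumption, since the abstract does not say how the local image is obtained or how the model consumes the pairs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption: a local image is identified by a (left, top, right, bottom) box.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class ImageTextPair:
    """One image (or local image) paired with its description text."""
    image_id: str
    text: str
    region: Optional[Box] = None  # None means the whole image


@dataclass(frozen=True)
class PairingGroup:
    """The pairing group of one image: a whole-image pair and a local-image pair."""
    first: ImageTextPair   # first image + description text of the first image
    second: ImageTextPair  # local image + description text of the local image


# Hypothetical input record: (image_id, caption, region, region_caption)
Record = Tuple[str, str, Box, str]


def build_training_set(records: List[Record]) -> List[PairingGroup]:
    """Build one pairing group per image; the abstract requires N > 1."""
    if len(records) < 2:
        raise ValueError("N must be an integer greater than 1")
    groups = []
    for image_id, caption, region, region_caption in records:
        first = ImageTextPair(image_id, caption)
        second = ImageTextPair(image_id, region_caption, region)
        groups.append(PairingGroup(first, second))
    return groups


def flatten(groups: List[PairingGroup]) -> List[ImageTextPair]:
    """List every image-text pair, global and local, as training samples."""
    return [pair for g in groups for pair in (g.first, g.second)]


records = [
    ("img_001", "a dog running on a beach", (40, 60, 200, 220), "a brown dog"),
    ("img_002", "a street at night", (10, 10, 120, 90), "a neon sign"),
]
groups = build_training_set(records)
samples = flatten(groups)
```

With N images, this yields N pairing groups and 2N image-text pairs, so each image contributes one pair describing it as a whole and one pair describing a part of it.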

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.