Method and apparatus for training visual language pre-training model, and device and medium
US12142036B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 6, 2023 |
| Grant date | Nov 12, 2024 |
| Priority date | — |
| Expiry date | Dec 6, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V20/70
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Provided in the present application are a method and apparatus for training a visual language pre-training model, and a device and a medium. The method includes: acquiring pairing groups respectively corresponding to N images, wherein the pairing group of a first image includes: a first pairing group which is composed of the first image and description text of the first image, and a second pairing group which is composed of a local image of the first image and description text of the local image, N is an integer greater than 1, and the first image is any one of the N images; and training a visual language pre-training model according to the pairing groups respectively corresponding to the N images.
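The pairing-group structure described in the abstract can be pictured with a small sketch. This is an illustration only, not the patented implementation: the names `Pair`, `PairingGroup`, `build_pairing_groups`, `flatten_pairs`, the record keys, and the bounding-box representation of a "local image" are all assumptions made for the example, and the abstract does not specify the training objective, so none is shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical representation of a local image: a (left, top, right, bottom) crop box.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Pair:
    image_id: str
    region: Optional[Box]  # None means the whole image
    text: str              # description text for the image or the region


@dataclass(frozen=True)
class PairingGroup:
    first: Pair   # whole image + its description text
    second: Pair  # local image + its description text


def build_pairing_groups(records: List[Dict]) -> List[PairingGroup]:
    """Build one pairing group per image record (record keys are assumed)."""
    if len(records) < 2:
        # The abstract requires N to be an integer greater than 1.
        raise ValueError("N must be an integer greater than 1")
    groups = []
    for rec in records:
        first = Pair(rec["image_id"], None, rec["caption"])
        second = Pair(rec["image_id"], tuple(rec["region"]), rec["region_caption"])
        groups.append(PairingGroup(first, second))
    return groups


def flatten_pairs(groups: List[PairingGroup]) -> List[Pair]:
    """Flatten the groups into a list of image-text pairs for a training loop."""
    return [pair for g in groups for pair in (g.first, g.second)]


# Made-up example data, N = 2.
records = [
    {"image_id": "img_0", "caption": "a dog on a beach",
     "region": (10, 20, 110, 120), "region_caption": "a red collar"},
    {"image_id": "img_1", "caption": "a street at night",
     "region": (0, 0, 64, 64), "region_caption": "a neon sign"},
]
groups = build_pairing_groups(records)
pairs = flatten_pairs(groups)
```

Each image contributes two pairs here, one global and one local, so a model trained on `pairs` would see both whole-image and region-level text supervision. How those pairs enter the loss is left open, since the abstract does not say.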
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.