Patent · US Active

System and method for supervised contrastive learning for multi-modal tasks

US12183062B2 · kind B2 · utility

Cited by	0

References	1

Claims	20

Family size	0

Assignee

SAMSUNG ELECTRONICS CO., LTD. · KR

Inventors

Changsheng Zhao · Sichuan, CN
Burak Uzkent · Mountain View, US
Yilin Shen · Mountain View, US
Hongxia Jin · Cupertino, US

Key dates

Filing date	Jan 31, 2022
Grant date	Dec 31, 2024
Priority date	—
Expiry date	Feb 18, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06V10/82
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes obtaining a batch of training data including multiple paired image-text pairs and multiple unpaired image-text pairs, where each paired image-text pair and each unpaired image-text pair includes an image and a text. The method also includes training a machine learning model using the training data based on an optimization of a combination of losses. The losses include, for each paired image-text pair, (i) a first multi-modal representation loss based on the paired image-text pair and (ii) a second multi-modal representation loss based on two or more unpaired image-text pairs, selected from among the multiple unpaired image-text pairs, wherein each of the two or more unpaired image-text pairs includes either the image or the text of the paired image-text pair.
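The loss combination described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not specify the form of either loss, so the dot-product score, the logistic losses, the `weight` parameter, and the function names `select_unpaired` and `combined_loss` are all assumptions made for illustration. The sketch only mirrors the structure stated above: one loss term on each paired image-text pair, plus a second term over two or more unpaired pairs that reuse either the image or the text of that paired pair.

```python
import math


def dot(u, v):
    """Dot product, used here as a hypothetical image-text matching score."""
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def select_unpaired(pair, unpaired_pairs):
    """Keep the unpaired pairs that reuse the image or the text of `pair`."""
    image_id, text_id = pair
    return [(i, t) for (i, t) in unpaired_pairs if i == image_id or t == text_id]


def combined_loss(paired_pairs, unpaired_pairs, image_vecs, text_vecs, weight=1.0):
    """Average, over paired pairs, of first loss + weight * second loss.

    Assumed loss forms (not taken from the patent):
      first  = -log sigmoid(score) on the paired pair
      second = mean of -log sigmoid(-score) over the selected unpaired pairs
    """
    total = 0.0
    for image_id, text_id in paired_pairs:
        # (i) loss based on the paired image-text pair itself
        first = -log_sigmoid(dot(image_vecs[image_id], text_vecs[text_id]))

        # (ii) loss based on unpaired pairs sharing the image or the text
        related = select_unpaired((image_id, text_id), unpaired_pairs)
        if len(related) < 2:
            # The abstract calls for two or more such pairs; this sketch
            # chooses to fail loudly rather than silently skip the term.
            raise ValueError("need at least two related unpaired pairs")
        second = sum(
            -log_sigmoid(-dot(image_vecs[i], text_vecs[t])) for i, t in related
        ) / len(related)

        total += first + weight * second
    return total / len(paired_pairs)
```

In a real training loop the embeddings would come from the model being trained and the combined value would be minimized by an optimizer; here the vectors are plain lists keyed by hypothetical image and text identifiers so that the selection rule and the sum of the two terms are easy to inspect.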

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.