Patent · US Active

Systems and methods for interacting with a multimodal machine learning model

US12039431B1 · kind B1 · utility

Cited by	0
References	6
Claims	20
Family size	0

Assignee

OpenAI OpCo, LLC · US

Inventors

Noah Deutsch · San Francisco, US
Nicholas Turley · San Francisco, US
Benjamin Zweig · San Francisco, US

Key dates

Filing date	Sep 27, 2023
Grant date	Jul 16, 2024
Priority date	—
Expiry date	Sep 27, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/08
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The disclosed embodiments may include a method of interacting with a multimodal machine learning model; the method may include providing a graphical user interface associated with a multimodal machine learning model. The method may further include displaying an image to a user in the graphical user interface. The method may also include receiving a textual prompt from the user and then generating input data using the image and the textual prompt. The method may further include generating an output at least in part by applying the input data to the multimodal machine learning model, the multimodal machine learning model configured using prompt engineering to identify a location in the image conditioned on the image and the textual prompt, wherein the output comprises a first location indication. The method may also include displaying, in the graphical user interface, an emphasis indicator at the indicated first location in the image.
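
The flow described in the abstract (image plus textual prompt, combined into input data, applied to a model that returns a location, then shown as an emphasis indicator) can be illustrated with a minimal Python sketch. This is not the patented implementation. The JSON coordinate format, the instruction wording, the clamping step, and every name here (`build_input_data`, `parse_location`, `locate`, `EmphasisIndicator`, `fake_model`) are assumptions made for illustration, and the model is a stand-in callable rather than a real multimodal model.

```python
import json
import re
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class EmphasisIndicator:
    """Hypothetical marker: a circle drawn over the image at (x, y)."""
    x: int
    y: int
    radius: int


def build_input_data(image_id: str, width: int, height: int, user_prompt: str) -> str:
    """Combine an image reference and the user's text into one model input.

    The instruction string stands in for the prompt-engineering step: it asks
    the model to answer with coordinates in a fixed, parseable format
    (an assumed format, not one stated in the abstract).
    """
    instruction = (
        "Answer the user's request by naming one location in the image. "
        f"The image is {width}x{height} pixels. "
        'Reply only with JSON like {"x": 0, "y": 0}.'
    )
    return json.dumps(
        {"image": image_id, "instruction": instruction, "prompt": user_prompt}
    )


def parse_location(model_output: str, width: int, height: int) -> Tuple[int, int]:
    """Extract the first location indication and clamp it to the image bounds."""
    match = re.search(r"\{[^{}]*\}", model_output)
    if match is None:
        raise ValueError("model output contains no location indication")
    data = json.loads(match.group(0))
    x = min(max(int(data["x"]), 0), width - 1)
    y = min(max(int(data["y"]), 0), height - 1)
    return x, y


def locate(
    model: Callable[[str], str],
    image_id: str,
    width: int,
    height: int,
    user_prompt: str,
    radius: int = 20,
) -> EmphasisIndicator:
    """Run the whole flow: build input, query the model, place the indicator."""
    input_data = build_input_data(image_id, width, height, user_prompt)
    output = model(input_data)
    x, y = parse_location(output, width, height)
    return EmphasisIndicator(x=x, y=y, radius=radius)


def fake_model(input_data: str) -> str:
    """Stand-in for a multimodal model; always returns the same location."""
    return 'Here it is: {"x": 700, "y": 120}'
```

Calling `locate(fake_model, "img-1", 640, 480, "Where is the cat?")` returns an indicator that a user interface could draw over the displayed image. Because the stand-in model answers with an x value outside the 640-pixel width, the sketch clamps it to the last valid column.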

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.