Patent · US Active

Generating modified digital images utilizing a multimodal selection model based on verbal and gesture input

US10817713B2 · kind B2 · utility

Cited by	28
References	0
Claims	20
Family size	0

Assignee

Adobe Inc. · US

Inventors

Trung Bui · San Jose, US
Zhe Lin · Fremont, US
Walter Chang · San Jose, US
Nham Van Le · San Jose, US
Franck Dernoncourt · Sunnyvale, US

Key dates

Filing date	Nov 15, 2018
Grant date	Oct 27, 2020
Priority date	—
Expiry date	Apr 9, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G06V40/28
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The present disclosure relates to systems, methods, and non-transitory computer readable media for generating modified digital images based on verbal and/or gesture input by utilizing a natural language processing neural network and one or more computer vision neural networks. The disclosed systems can receive verbal input together with gesture input. The disclosed systems can further utilize a natural language processing neural network to generate a verbal command based on verbal input. The disclosed systems can select a particular computer vision neural network based on the verbal input and/or the gesture input. The disclosed systems can apply the selected computer vision neural network to identify pixels within a digital image that correspond to an object indicated by the verbal input and/or gesture input. Utilizing the identified pixels, the disclosed systems can generate a modified digital image by performing one or more editing actions indicated by the verbal input and/or gesture input.
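The flow described in the abstract (parse the verbal input into a command, pick a selection model from the verbal and gesture input, identify the object's pixels, then apply the edit) can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (Command, ACTIONS, parse_command, select_by_click, select_salient, choose_selector, edit) is hypothetical, and the neural networks are replaced by trivial stand-ins: a keyword lookup for the language model, a flood fill for the gesture-driven selector, and a brightness threshold for the object selector.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]   # toy grayscale image, values 0-255
Mask = List[List[bool]]   # True where a pixel belongs to the selected object
Point = Tuple[int, int]   # (row, col) of a click gesture


@dataclass
class Command:
    action: str            # e.g. "brighten" or "remove"
    target: Optional[str]  # object word, if the speaker named one


# Hypothetical editing actions, applied to each selected pixel.
ACTIONS: Dict[str, Callable[[int], int]] = {
    "brighten": lambda v: min(255, v + 50),
    "darken": lambda v: max(0, v - 50),
    "remove": lambda v: 0,
}
FILLER = {"the", "this", "that", "a"}


def parse_command(utterance: str) -> Command:
    """Keyword stand-in for the natural language processing network."""
    words = utterance.lower().replace(".", "").split()
    action = next((w for w in words if w in ACTIONS), None)
    if action is None:
        raise ValueError(f"no known action in: {utterance!r}")
    rest = [w for w in words if w not in ACTIONS and w not in FILLER]
    return Command(action, rest[-1] if rest else None)


def select_by_click(image: Image, click: Point) -> Mask:
    """Stand-in for a gesture-driven selector: flood fill from the click."""
    rows, cols = len(image), len(image[0])
    seed = image[click[0]][click[1]]
    mask = [[False] * cols for _ in range(rows)]
    stack = [click]
    while stack:
        r, c = stack.pop()
        inside = 0 <= r < rows and 0 <= c < cols
        if inside and not mask[r][c] and image[r][c] == seed:
            mask[r][c] = True
            stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return mask


def select_salient(image: Image) -> Mask:
    """Stand-in for an object selector: pixels brighter than the mean."""
    flat = [v for row in image for v in row]
    mean = sum(flat) / len(flat)
    return [[v > mean for v in row] for row in image]


def choose_selector(command: Command,
                    click: Optional[Point]) -> Callable[[Image], Mask]:
    """Pick a selection model from the verbal and gesture input."""
    if click is not None:
        return lambda img: select_by_click(img, click)
    if command.target is not None:
        return select_salient
    raise ValueError("neither a gesture nor a named object was given")


def edit(image: Image, utterance: str, click: Optional[Point] = None) -> Image:
    """Parse, select pixels, and return a modified copy of the image."""
    command = parse_command(utterance)
    mask = choose_selector(command, click)(image)
    op = ACTIONS[command.action]
    return [[op(v) if m else v for v, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]
```

As a usage example under the same assumptions, edit(image, "Brighten this", click=(0, 2)) edits only the region connected to the clicked pixel, while edit(image, "remove the dog") falls back to the object selector because no gesture was supplied. The original image is left unchanged in both cases.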

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.