Patent · US Active

Text-based framework for video object selection

US12266181B2 · kind B2 · utility

Cited by	0

References	0

Claims	20

Family size	0

Assignee

Adobe Inc. · US

Inventors

Shivam Nalin Patel · Mountain View, US
Kshitiz Garg · Santa Clara, US
Han Guo · San Jose, US
Ali Aminian · Piedmont, US
Aashish Kumar Misraa · San Jose, US

Key dates

Filing date	Nov 19, 2021
Grant date	Apr 1, 2025
Priority date	—
Expiry date	Nov 28, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06V20/46
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Embodiments are disclosed for receiving a user input and an input video comprising multiple frames. The method may include extracting a text feature from the user input. The method may further include extracting a plurality of image features from the frames. The method may further include identifying one or more keyframes from the frames that include the object. The method may further include clustering one or more groups of the one or more keyframes. The method may further include generating a plurality of segmentation masks for each group. The method may further include determining a set of reference masks corresponding to the user input and the object. The method may further include generating a set of fusion masks by combining the plurality of segmentation masks and the set of reference masks. The method may further include propagating the set of fusion masks and outputting a final set of masks.
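
The abstract names the pipeline stages but not how each one is computed. As a rough illustration only, the Python sketch below shows one way three of those stages (keyframe identification, keyframe clustering, and mask fusion) could fit together. Everything in it is an assumption rather than the claimed method: the function names, the cosine-similarity threshold, the gap-based grouping of frame indices, and the pixelwise AND used for fusion are all hypothetical choices, and the feature vectors and masks are toy values standing in for the outputs of real text, image, and segmentation models.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_keyframes(text_feature, image_features, threshold=0.5):
    """Return indices of frames whose image feature matches the text feature.

    Assumption: "includes the object" is approximated by a similarity threshold.
    """
    return [
        i for i, feature in enumerate(image_features)
        if cosine(text_feature, feature) >= threshold
    ]


def cluster_keyframes(keyframes, max_gap=1):
    """Group sorted keyframe indices into runs separated by at most max_gap.

    Assumption: temporal proximity is the clustering criterion.
    """
    groups = []
    for index in keyframes:
        if groups and index - groups[-1][-1] <= max_gap:
            groups[-1].append(index)
        else:
            groups.append([index])
    return groups


def fuse_masks(segmentation_mask, reference_mask):
    """Combine two binary masks of equal shape with a pixelwise AND.

    Assumption: fusion keeps only pixels present in both masks.
    """
    return [
        [s & r for s, r in zip(seg_row, ref_row)]
        for seg_row, ref_row in zip(segmentation_mask, reference_mask)
    ]


# Toy data: a 2-D text feature and one 2-D image feature per frame.
text_feature = [1.0, 0.0]
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.0, 1.0], [0.8, 0.2]]

keyframes = select_keyframes(text_feature, image_features)
groups = cluster_keyframes(keyframes)
fused = fuse_masks([[1, 1], [0, 1]], [[1, 0], [0, 1]])
```

With the toy data, frames 0, 1, and 4 pass the threshold, and they fall into two groups because frame 4 is not adjacent to the first run. Feature extraction, reference mask selection, and mask propagation are left out, since the abstract gives no detail from which to sketch them.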

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.