Patent · US Active

Adaptive selection of data modalities for efficient video recognition

US12249147B2 · kind B2 · utility

Cited by	0
References	7
Claims	15
Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Rameswar Panda · Medford, US
Richard Chen · Mount Kisco, US
Quanfu Fan · Somerville, US
Rogerio S. Feris · Hartford, US

Key dates

Filing date	Mar 11, 2021
Grant date	Mar 11, 2025
Priority date	—
Expiry date	Aug 8, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06V20/49
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

One embodiment of the invention provides a method for video recognition. The method comprises receiving an input video comprising a sequence of video segments over a plurality of data modalities. The method further comprises, for a video segment of the sequence, selecting one or more data modalities based on data representing the video segment. Each data modality selected is optimal for video recognition of the video segment. The method further comprises, for each data modality selected, providing at least one data input representing the video segment over the data modality selected to a machine learning model corresponding to the data modality selected, and generating a first type of prediction representative of the video segment via the machine learning model. The method further comprises determining a second type of prediction representative of the entire input video by aggregating all first type of predictions generated.
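
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `recognize_video`, `toy_selector`, and `toy_models` are hypothetical, the selection rule is a made-up threshold, and averaging is only one assumed way to aggregate. The abstract specifies neither the selection policy nor the aggregation function.

```python
from typing import Callable, Dict, List, Sequence

# Assumed representation: one segment maps a modality name to its data input.
Segment = Dict[str, Sequence[float]]
Selector = Callable[[Segment], List[str]]
Model = Callable[[Sequence[float]], List[float]]


def recognize_video(
    segments: Sequence[Segment],
    select_modalities: Selector,
    models: Dict[str, Model],
) -> List[float]:
    """Return a video-level prediction from per-segment, per-modality predictions."""
    segment_predictions: List[List[float]] = []
    for segment in segments:
        # Step 1: choose one or more modalities based on the segment's data.
        for modality in select_modalities(segment):
            # Step 2: run the model that corresponds to the chosen modality
            # to get a segment-level ("first type") prediction.
            segment_predictions.append(models[modality](segment[modality]))
    if not segment_predictions:
        raise ValueError("no modality was selected for any segment")
    # Step 3: aggregate into a video-level ("second type") prediction.
    # Averaging class scores is an assumption made for this sketch.
    num_classes = len(segment_predictions[0])
    count = len(segment_predictions)
    return [sum(p[c] for p in segment_predictions) / count for c in range(num_classes)]


def toy_selector(segment: Segment) -> List[str]:
    # Hypothetical policy: always use RGB, add audio only when it is "loud".
    chosen = ["rgb"]
    audio = segment["audio"]
    if sum(audio) / len(audio) > 0.5:
        chosen.append("audio")
    return chosen


# Stand-in models that return fixed two-class scores.
toy_models: Dict[str, Model] = {
    "rgb": lambda data: [1.0, 0.0],
    "audio": lambda data: [0.0, 1.0],
}

video = [
    {"rgb": [0.2, 0.4], "audio": [0.1, 0.1]},  # quiet segment: RGB only
    {"rgb": [0.3, 0.3], "audio": [0.9, 0.7]},  # loud segment: RGB and audio
]
video_prediction = recognize_video(video, toy_selector, toy_models)
```

In this toy run the audio model is skipped for the quiet segment, which is the efficiency idea named in the title: a model runs only for the modalities selected for a given segment.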

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.