Patent · US Active

Neural-symbolic action transformers for video question answering

US12175384B2 · kind B2 · utility

Cited by	0

References	2

Claims	18

Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Bo Wu · Cambridge, US
Chuang Gan · Cambridge, US
Dakuo Wang · Cambridge, US
Zhenfang Chen · Sunnyvale, US

Key dates

Filing date	Jul 21, 2021
Grant date	Dec 24, 2024
Priority date	—
Expiry date	Feb 11, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/0464
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Mechanisms are provided for performing artificial intelligence-based video question answering. A video parser parses an input video data sequence to generate situation data structure(s), each situation data structure comprising data elements corresponding to entities, and first relationships between entities, identified by the video parser as present in images of the input video data sequence. First machine learning computer model(s) operate on the situation data structure(s) to predict second relationship(s) between the situation data structure(s). Second machine learning computer model(s) execute on a received input question to predict an executable program to execute to answer the received question. The program is executed on the situation data structure(s) and predicted second relationship(s). An answer to the question is output based on results of executing the program.
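
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Situation`, `parse_video`, `predict_links`, `question_to_program`, and `execute`, the three-operation program vocabulary, and the sample data are all hypothetical. Both machine learning models and the video parser are replaced by simple hand-written stand-ins so that only the overall pipeline shape is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """One situation data structure: entities plus relationships between them."""
    index: int
    entities: frozenset
    relations: frozenset  # (subject, predicate, object) triples


def parse_video(frames):
    """Stand-in for the video parser (assumption: frames arrive pre-annotated)."""
    situations = []
    for i, triples in enumerate(frames):
        entities = frozenset(e for s, _, o in triples for e in (s, o))
        situations.append(Situation(i, entities, frozenset(triples)))
    return situations


def predict_links(situations):
    """Stand-in for the first model: links consecutive situations as 'before'."""
    return [(a.index, "before", b.index)
            for a, b in zip(situations, situations[1:])]


def question_to_program(question):
    """Stand-in for the second model: a fixed lookup, not a learned predictor."""
    programs = {
        "What did the person do after holding the cup?": [
            ("filter", "person", "hold", "cup"),
            ("next",),
            ("query_action", "person"),
        ],
    }
    return programs[question]


def execute(program, situations, links):
    """Run the program over the situations and the predicted links."""
    by_index = {s.index: s for s in situations}
    current = list(situations)
    result = None
    for op, *args in program:
        if op == "filter":
            # Keep situations containing the given relationship triple.
            current = [s for s in current if tuple(args) in s.relations]
        elif op == "next":
            # Follow predicted 'before' links to the succeeding situations.
            selected = {s.index for s in current}
            current = [by_index[b] for a, rel, b in links
                       if rel == "before" and a in selected]
        elif op == "query_action":
            # Collect (predicate, object) pairs whose subject matches.
            (subject,) = args
            result = sorted({(p, o) for s in current
                             for subj, p, o in s.relations if subj == subject})
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


# Hypothetical annotated frames standing in for a parsed video.
frames = [
    [("person", "hold", "cup")],
    [("person", "drink_from", "cup")],
    [("person", "put_down", "cup"), ("cup", "on", "table")],
]
situations = parse_video(frames)
links = predict_links(situations)
program = question_to_program("What did the person do after holding the cup?")
answer = execute(program, situations, links)
print(answer)
```

In this toy setup the program first selects the situation in which the person holds the cup, follows the predicted link to the next situation, and reads off the person's action there. In the mechanism the abstract describes, the parser and both predictors would be learned components; the sketch only shows how their outputs could feed a program executor.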

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.