Patent · US Active

Neural-symbolic action transformers for video question answering

US12175384B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 21, 2021
Grant date: Dec 24, 2024
Priority date
Expiry date: Feb 11, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/0464
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Mechanisms are provided for performing artificial intelligence-based video question answering. A video parser parses an input video data sequence to generate situation data structure(s), each situation data structure comprising data elements corresponding to entities, and first relationships between entities, identified by the video parser as present in images of the input video data sequence. First machine learning computer model(s) operate on the situation data structure(s) to predict second relationship(s) between the situation data structure(s). Second machine learning computer model(s) execute on a received input question to predict an executable program to execute to answer the received question. The program is executed on the situation data structure(s) and predicted second relationship(s). An answer to the question is output based on results of executing the program.
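
The data flow described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `Situation` class, the function names, the operation names (`filter_relation`, `query_object`), and the hand-written sample data are all hypothetical. Both machine learning stages are replaced by trivial stand-ins (a set difference and a lookup table), so the sketch shows only how situation data structures, predicted second relationships, and a predicted program might fit together.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical situation data structure for one sampled frame."""
    frame: int
    entities: set    # entity labels, e.g. {"person", "cup"}
    relations: set   # first relationships as (subject, predicate, object)


# Hand-written stand-in for video parser output (no real parser is run).
situations = [
    Situation(0, {"person", "cup", "table"}, {("cup", "on", "table")}),
    Situation(1, {"person", "cup", "table"}, {("person", "holding", "cup")}),
]


def predict_transitions(situations):
    """Stand-in for the first ML model(s): relate consecutive situations.

    Assumption: a relation present in the later situation but absent from
    the earlier one is recorded as a transition between the two.
    """
    transitions = []
    for prev, nxt in zip(situations, situations[1:]):
        for rel in sorted(nxt.relations - prev.relations):
            transitions.append((prev.frame, nxt.frame, rel))
    return transitions


def predict_program(question):
    """Stand-in for the second ML model(s): map a question to a program.

    Assumption: a fixed lookup table replaces the learned question parser.
    """
    programs = {
        "What is the person holding?": [
            ("filter_relation", "person", "holding"),
            ("query_object",),
        ],
    }
    return programs[question]


def execute(program, situations, transitions):
    """Run the program over the situations and the predicted transitions."""
    # Pool every known relation: those parsed per frame and those carried
    # by transitions, without duplicates and in a stable order.
    pool = []
    for s in situations:
        for rel in sorted(s.relations):
            if rel not in pool:
                pool.append(rel)
    for _, _, rel in transitions:
        if rel not in pool:
            pool.append(rel)

    state = None
    for op, *args in program:
        if op == "filter_relation":
            subj, pred = args
            state = [r for r in pool if r[0] == subj and r[1] == pred]
        elif op == "query_object":
            state = state[0][2] if state else None
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


transitions = predict_transitions(situations)
program = predict_program("What is the person holding?")
answer = execute(program, situations, transitions)
print(answer)
```

In this sketch the answer is derived entirely by executing the program over structured data, which mirrors the separation the abstract draws between perceiving the video, predicting relationships, and reasoning over the result.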

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.