Patent · US Active

Method and system for multi-modal fusion model

US10417498B2 · kind B2 · utility

Cited by: 4
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 29, 2017
Grant date: Sep 17, 2019
Priority date
Expiry date: Nov 15, 2037

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04N21/8549
  • WIPO field: Audio-visual technology
  • WIPO sector: Electrical engineering

Abstract

A system for generating a word sequence includes one or more processors connected to a memory and one or more storage devices storing instructions that cause the processors to perform operations. The operations include receiving first and second input vectors; extracting first and second feature vectors; estimating a first set of weights and a second set of weights; calculating a first content vector from the first set of weights and the first feature vectors, and calculating a second content vector; transforming the first content vector into a first modal content vector having a predetermined dimension, and the second content vector into a second modal content vector having the same predetermined dimension; estimating a set of modal attention weights; generating a weighted content vector having the predetermined dimension from the set of modal attention weights and the first and second modal content vectors; and generating a predicted word using a sequence generator.
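
The operations in the abstract can be read as a single decoding step, and the following NumPy sketch illustrates that reading rather than the patented implementation. Everything the abstract leaves open is an assumption here: the bilinear scoring used for the two sets of weights, the additive scoring used for the modal attention weights, the tanh projections, the linear output layer standing in for the sequence generator, and all names (`init_params`, `fuse_step`, `U1`, `W1`, `Ws`, `v`, `Wo`). The second content vector is assumed to be computed the same way as the first.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = np.exp(scores - np.max(scores))
    return shifted / shifted.sum()


def init_params(d1, d2, hidden, common, vocab, seed=0):
    """Random illustrative parameters; names and shapes are hypothetical."""
    rng = np.random.default_rng(seed)
    shapes = {
        "U1": (d1, hidden),              # scores modality-1 features against the state
        "U2": (d2, hidden),              # scores modality-2 features against the state
        "W1": (common, d1),              # projects content vector 1 to the common size
        "W2": (common, d2),              # projects content vector 2 to the common size
        "Ws": (common, hidden),          # maps the state into the common space
        "v": (common,),                  # modal attention scoring vector
        "Wo": (vocab, hidden + common),  # stand-in for the sequence generator output
    }
    return {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}


def fuse_step(feats1, feats2, state, p):
    """One decoding step over two modalities.

    feats1: (T1, d1) first feature vectors; feats2: (T2, d2) second feature
    vectors; state: (hidden,) current decoder state (assumed to be given).
    """
    # First and second sets of weights, one weight per feature vector
    # (assumed bilinear scoring against the decoder state).
    alpha1 = softmax(feats1 @ p["U1"] @ state)
    alpha2 = softmax(feats2 @ p["U2"] @ state)

    # Content vectors: weighted sums of each modality's feature vectors.
    c1 = alpha1 @ feats1
    c2 = alpha2 @ feats2

    # Modal content vectors sharing one predetermined dimension.
    m1 = np.tanh(p["W1"] @ c1)
    m2 = np.tanh(p["W2"] @ c2)

    # Modal attention weights: one weight per modality (assumed additive scoring).
    s = p["Ws"] @ state
    beta = softmax(np.array([p["v"] @ np.tanh(s + m1),
                             p["v"] @ np.tanh(s + m2)]))

    # Weighted content vector with the same predetermined dimension.
    g = beta[0] * m1 + beta[1] * m2

    # Predicted word: argmax over vocabulary logits (assumed linear output layer).
    logits = p["Wo"] @ np.concatenate([state, g])
    return int(np.argmax(logits)), alpha1, alpha2, beta, g
```

Called as `fuse_step(feats1, feats2, state, init_params(d1, d2, hidden, common, vocab))`, the function returns a vocabulary index together with the intermediate weights, so the contribution of each modality at that step can be inspected through `beta`. Projecting both content vectors to one dimension is what allows them to be mixed by a plain weighted sum even when the two feature extractors produce vectors of different sizes.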

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.