Method and system for multi-modal fusion model
US10417498B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 29, 2017 |
| Grant date | Sep 17, 2019 |
| Priority date | — |
| Expiry date | Nov 15, 2037 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04N21/8549
- WIPO field: Audio-visual technology
- WIPO sector: Electrical engineering
Abstract
A system for generating a word sequence includes one or more processors connected to a memory and one or more storage devices storing instructions that cause operations including: receiving first and second input vectors; extracting first and second feature vectors; estimating a first set of weights and a second set of weights; calculating a first content vector from the first set of weights and the first feature vectors, and a second content vector from the second set of weights and the second feature vectors; transforming the first content vector into a first modal content vector having a predetermined dimension, and the second content vector into a second modal content vector having the same predetermined dimension; estimating a set of modal attention weights; generating a weighted content vector having the predetermined dimension from the set of modal attention weights and the first and second modal content vectors; and generating a predicted word using a sequence generator.
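The data flow in the abstract can be illustrated with a small NumPy sketch. It is not the patented implementation: the abstract does not say how the weights are scored or how the sequence generator works, so the bilinear scoring, the `tanh` projection, the random parameters, and the final `argmax` over a toy vocabulary are all assumptions, and every name below (`content_vector`, `fuse`, `W_out`, and so on) is hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def content_vector(features, state, W):
    """Weight each feature vector and sum them into one content vector.

    features: (T, d) feature vectors for one modality
    state:    (h,)   current decoder state (assumed input to the scoring)
    W:        (d, h) bilinear scoring matrix (assumption, not from the patent)
    """
    scores = features @ W @ state        # one score per feature vector, shape (T,)
    weights = softmax(scores)            # the "set of weights" for this modality
    return weights @ features, weights   # content vector (d,), weights (T,)

def fuse(contents, projections, state, score_vecs):
    """Project each content vector to a common dimension D, then mix them
    with modal attention weights."""
    modal = [np.tanh(P @ c) for P, c in zip(projections, contents)]  # each (D,)
    scores = np.array([v @ np.concatenate([m, state])
                       for v, m in zip(score_vecs, modal)])
    beta = softmax(scores)               # modal attention weights, one per modality
    fused = sum(b * m for b, m in zip(beta, modal))  # weighted content vector (D,)
    return fused, beta

# Toy sizes: two modalities with different lengths and feature dimensions.
rng = np.random.default_rng(0)
T1, d1, T2, d2, h, D, V = 5, 8, 7, 6, 4, 3, 10
X1 = rng.normal(size=(T1, d1))           # first feature vectors
X2 = rng.normal(size=(T2, d2))           # second feature vectors
state = rng.normal(size=h)

c1, a1 = content_vector(X1, state, rng.normal(size=(d1, h)))
c2, a2 = content_vector(X2, state, rng.normal(size=(d2, h)))

fused, beta = fuse(
    [c1, c2],
    [rng.normal(size=(D, d1)), rng.normal(size=(D, d2))],
    state,
    [rng.normal(size=D + h), rng.normal(size=D + h)],
)

# Stand-in for the sequence generator: pick the highest-scoring word id.
W_out = rng.normal(size=(V, D))
word_id = int(np.argmax(W_out @ fused))
```

Projecting both content vectors to the same dimension `D` is what lets the modal attention weights form a single weighted sum, even though the two modalities start with different feature sizes.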
Source: USPTO / EPO open patent data.