Multimodal dimensional emotion recognition method
US11281945B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 8, 2021 |
| Grant date | Mar 22, 2022 |
| Priority date | — |
| Expiry date | Sep 8, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L25/63
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A multimodal dimensional emotion recognition method includes: acquiring a frame-level audio feature, a frame-level video feature, and a frame-level text feature from an audio, a video, and a corresponding text of a sample to be tested; performing temporal contextual modeling on the frame-level audio feature, the frame-level video feature, and the frame-level text feature respectively by using a temporal convolutional network to obtain a contextual audio feature, a contextual video feature, and a contextual text feature; performing weighted fusion on these three features by using a gated attention mechanism to obtain a multimodal feature; splicing the multimodal feature and these three features together to obtain a spliced feature, and then performing further temporal contextual modeling on the spliced feature by using a temporal convolutional network to obtain a contextual spliced feature; and performing regression prediction on the contextual spliced feature to obtain a final dimensional emotion prediction result.
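The data flow described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the patented implementation. The abstract does not specify the layer design, so the following are all assumptions made for illustration: causal dilated convolutions with ReLU, a sigmoid gate per modality, the hidden size, random weights, and a two-value output per frame. The function names (`temporal_conv`, `tcn`, `gated_fusion`, `recognize`, `make_params`) are hypothetical.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols, scale=0.1):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def temporal_conv(frames, taps, dilation=1):
    """One causal dilated 1-D convolution layer followed by ReLU (assumed design).
    frames: T lists of length d_in; taps: K matrices of shape d_out x d_in."""
    d_out = len(taps[0])
    out = []
    for t in range(len(frames)):
        acc = [0.0] * d_out
        for k, w in enumerate(taps):
            src = t - k * dilation  # only the current and earlier frames are used
            if src >= 0:
                acc = [a + b for a, b in zip(acc, matvec(w, frames[src]))]
        out.append([max(0.0, a) for a in acc])
    return out

def tcn(frames, layers):
    """Stack of temporal_conv layers with dilations 1, 2, 4, ... (assumed schedule)."""
    for i, taps in enumerate(layers):
        frames = temporal_conv(frames, taps, dilation=2 ** i)
    return frames

def gated_fusion(audio, video, text, gate_vectors):
    """Per-frame weighted sum; each weight is sigmoid(gate . feature) (assumed gate form)."""
    fused = []
    for a, v, x in zip(audio, video, text):
        gates = [1.0 / (1.0 + math.exp(-sum(g * f for g, f in zip(gv, feat))))
                 for gv, feat in zip(gate_vectors, (a, v, x))]
        fused.append([gates[0] * ai + gates[1] * vi + gates[2] * xi
                      for ai, vi, xi in zip(a, v, x)])
    return fused

def make_params(in_dims, hidden=4, kernel=2, n_layers=2, n_outputs=2):
    """Random, untrained weights; all sizes are hypothetical."""
    def layers(d_in):
        return [[rand_matrix(hidden, d_in if i == 0 else hidden) for _ in range(kernel)]
                for i in range(n_layers)]
    return {
        "modality_tcns": [layers(d) for d in in_dims],
        "gates": [[random.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in in_dims],
        "splice_tcn": layers(4 * hidden),
        "regressor": rand_matrix(n_outputs, hidden),
    }

def recognize(audio, video, text, p):
    # 1. Temporal contextual modeling of each modality separately.
    ctx = [tcn(m, layers) for m, layers in zip((audio, video, text), p["modality_tcns"])]
    # 2. Gated weighted fusion of the three contextual features.
    fused = gated_fusion(*ctx, p["gates"])
    # 3. Splice the fused feature with the three contextual features, frame by frame.
    spliced = [f + a + v + x for f, a, v, x in zip(fused, *ctx)]
    # 4. Further temporal contextual modeling of the spliced feature.
    ctx_spliced = tcn(spliced, p["splice_tcn"])
    # 5. Frame-wise linear regression to the emotion dimensions.
    return [matvec(p["regressor"], frame) for frame in ctx_spliced]

T = 6  # number of frames, aligned across the three modalities
audio = [[random.random() for _ in range(5)] for _ in range(T)]
video = [[random.random() for _ in range(3)] for _ in range(T)]
text = [[random.random() for _ in range(7)] for _ in range(T)]
params = make_params(in_dims=(5, 3, 7))
pred = recognize(audio, video, text, params)
print(len(pred), len(pred[0]))  # prints: 6 2
```

The sketch returns one prediction vector per frame. Because every convolution here is causal, changing a later input frame does not alter earlier predictions. A trained system would learn the weights from labeled data and would normally use a tensor library in place of nested lists.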
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.