Patent · US Active

Bi-directional recurrent encoders with multi-hop attention for speech emotion recognition

US12236975B2 · kind B2 · utility

Cited by	0
References	2
Claims	20
Family size	0

Assignee

Adobe Inc. · US

Inventors

Trung Bui · San Jose, US
Subhadeep Dey · Martigny, CH
Seunghyun Yoon · Seoul, KR

Key dates

Filing date	Nov 15, 2021
Grant date	Feb 25, 2025
Priority date	—
Expiry date	Dec 4, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/26
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The present disclosure relates to systems, methods, and non-transitory computer readable media for determining speech emotion. In particular, a speech emotion recognition system generates an audio feature vector and a textual feature vector for a sequence of words. Further, the speech emotion recognition system utilizes a neural attention mechanism that intelligently blends together the audio feature vector and the textual feature vector to generate attention output. Using the attention output, which includes consideration of both audio and text modalities for speech corresponding to the sequence of words, the speech emotion recognition system can apply attention methods to one of the feature vectors to generate a hidden feature vector. Based on the hidden feature vector, the speech emotion recognition system can generate a speech emotion probability distribution of emotions among a group of candidate emotions, and then select one of the candidate emotions as corresponding to the sequence of words.
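
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not state the attention scoring function, which vector serves as the query, or the set of candidate emotions. The sketch assumes dot-product attention, uses the last audio state as the first query, substitutes random numbers for the outputs of the recurrent encoders, and uses a hypothetical four-label emotion set. All function and variable names (attend, recognize_emotion, EMOTIONS, W, b) are illustrative.

```python
import math
import random

# Hypothetical candidate emotions; the abstract does not list the actual set.
EMOTIONS = ["angry", "happy", "neutral", "sad"]

def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(query, states):
    """Dot-product attention (an assumption): weighted average of `states`,
    with weights given by the softmax of each state's similarity to `query`."""
    weights = softmax([dot(query, s) for s in states])
    dim = len(states[0])
    return [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]

def recognize_emotion(audio_states, text_states, W, b):
    # Hop 1: an audio summary queries the text states, so the attention
    # output reflects both modalities.
    audio_summary = audio_states[-1]
    attention_output = attend(audio_summary, text_states)
    # Hop 2: the attention output queries one modality (audio here)
    # to produce the hidden feature vector.
    hidden = attend(attention_output, audio_states)
    # Linear layer plus softmax gives a distribution over candidate emotions.
    logits = [dot(row, hidden) + bias for row, bias in zip(W, b)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs

# Random stand-ins for encoder outputs and classifier weights (assumed shapes).
rng = random.Random(0)
dim, steps = 8, 5

def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

audio_states = rand_matrix(steps, dim)   # one vector per audio frame
text_states = rand_matrix(steps, dim)    # one vector per word
W = rand_matrix(len(EMOTIONS), dim)
b = [0.0] * len(EMOTIONS)

label, probs = recognize_emotion(audio_states, text_states, W, b)
```

Here `label` is the selected candidate emotion and `probs` is the probability distribution it was chosen from. Further hops could be added by repeating `attend` while alternating between the text and audio states, which is one way to read "multi-hop" in the title.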

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.