Patent · US Active

Attention mechanism for coping with acoustic-lips timing mismatch in audiovisual processing

US10834295B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 29, 2018
Grant date: Nov 10, 2020
Priority date
Expiry date: Aug 29, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/048
  • WIPO field: Audio-visual technology
  • WIPO sector: Electrical engineering

Abstract

Embodiments of the present systems and methods may provide techniques for handling acoustic-lips timing mismatch in audiovisual processing. In embodiments, the context-dependent time shift between the audio and visual streams may be explicitly modeled using an attention mechanism. For example, in an embodiment, a computer-implemented method for determining a context-dependent time shift of audio and video features in an audiovisual stream or file may comprise receiving audio information and video information of the audiovisual stream or file, processing the audio information and video information separately to generate a new representation of the audio information, including information relating to features of the audio information, and a new representation of the video information, including information relating to features of the video information, and mapping features of the audio information and features of the video information using an attention mechanism to identify synchronized pairs of audio and video features.
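The mapping step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which form of attention is used, so scaled dot-product attention is assumed here purely for illustration. The separate per-stream processing is also not modeled: the function takes already-computed audio and video feature vectors as input. The names softmax and attention_alignment and the toy feature values are hypothetical and do not come from the patent.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_alignment(audio_feats, video_feats):
    """Attend from each audio frame over all video frames.

    Assumption: scaled dot-product attention, chosen for illustration only.
    Returns (weights, pairs), where weights[i][j] is the attention that
    audio frame i places on video frame j, and pairs[i] is a tuple
    (audio_index, best_video_index, shift_in_frames).
    """
    scale = math.sqrt(len(audio_feats[0]))
    weights, pairs = [], []
    for i, audio in enumerate(audio_feats):
        # Similarity of this audio frame to every video frame.
        scores = [
            sum(a * v for a, v in zip(audio, video)) / scale
            for video in video_feats
        ]
        row = softmax(scores)
        # The most-attended video frame is taken as the synchronized partner.
        j = max(range(len(row)), key=row.__getitem__)
        weights.append(row)
        pairs.append((i, j, j - i))
    return weights, pairs


# Toy data (hypothetical): the video features lag the audio by one frame.
audio = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
video = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

weights, pairs = attention_alignment(audio, video)
for audio_idx, video_idx, shift in pairs:
    print(f"audio frame {audio_idx} -> video frame {video_idx} (shift {shift:+d})")
```

In this sketch the shift is read off per audio frame, so it can differ from frame to frame, which is one simple way to picture a context-dependent time shift. A trained system would presumably learn the feature representations and attention parameters rather than use fixed toy vectors.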

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.