Patent · US Active

Multi-modal framework for multi-channel target speech separation

US11688412B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 15, 2020
Grant date: Jun 27, 2023
Priority date
Expiry date: Mar 11, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T2210/22
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, computer program, and computer system for separating a target voice from among a plurality of speakers are provided. Video data associated with the plurality of speakers and audio data associated with each of the one or more speakers are received. Video feature data is extracted from the received video data. The target voice is identified from among the plurality of speakers based on the received audio data and the extracted video feature data.
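
The three steps named in the abstract (receive audio and video, extract video features, identify the target voice from both) can be illustrated with a toy sketch. This is not the patented method, whose details the abstract does not give. It assumes, purely for illustration, that the video feature is frame-to-frame mouth-region motion and that the target is the speaker whose audio energy envelope correlates best with that motion. The function names `extract_video_features`, `pearson`, and `identify_target_voice`, and all of the data, are hypothetical.

```python
from math import sqrt


def extract_video_features(mouth_frames):
    """Toy video feature (assumption): mean absolute pixel change
    between consecutive mouth-region frames."""
    features = []
    for prev, cur in zip(mouth_frames, mouth_frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        features.append(diff)
    return features


def pearson(xs, ys):
    """Pearson correlation of two sequences, truncated to equal length."""
    n = min(len(xs), len(ys))
    xs, ys = xs[:n], ys[:n]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(var_x * var_y)


def identify_target_voice(audio_envelopes, video_features):
    """Pick the speaker whose audio envelope best matches the video feature.

    Correlation is a stand-in chosen for this sketch, not the
    identification technique claimed in the patent.
    """
    scores = {
        name: pearson(envelope, video_features)
        for name, envelope in audio_envelopes.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical inputs: five 2-pixel mouth-region frames of the target face,
# and one per-frame audio energy envelope per candidate speaker.
mouth_frames = [[0, 0], [4, 4], [4, 4], [0, 0], [0, 0]]
audio_envelopes = {
    "speaker_a": [0.9, 0.1, 0.8, 0.2],
    "speaker_b": [0.1, 0.9, 0.2, 0.8],
}

video_features = extract_video_features(mouth_frames)
target, scores = identify_target_voice(audio_envelopes, video_features)
print(target, scores)
```

In this made-up data the energy of `speaker_a` rises and falls with the mouth motion, so that speaker is selected. A real system would operate on multi-channel waveforms and learned visual embeddings, not hand-built envelopes.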

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.