Patent · US Active

Multi-modal framework for multi-channel target speech separation

US11688412B2 · kind B2 · utility

Cited by	1
References	0
Claims	20
Family size	0

Assignee

TENCENT AMERICA LLC · US

Inventors

Shi-Xiong Zhang · Redmond, US
Yong Xu · Brooklyn, US
Meng Yu · Bellevue, US
Dong Yu · Bellevue, US

Key dates

Filing date	Jun 15, 2020
Grant date	Jun 27, 2023
Priority date	—
Expiry date	Mar 11, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06T2210/22
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method, computer program, and computer system for separating a target voice from among a plurality of speakers are provided. Video data associated with the plurality of speakers and audio data associated with each of the one or more speakers are received. Video feature data is extracted from the received video data. The target voice is identified from among the plurality of speakers based on the received audio data and the extracted video feature data.
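
Illustrative sketch (not taken from the patent): the abstract names three steps: receive audio and video, extract video features, and identify the target voice from both. The toy Python below shows one simple way such a flow could be wired together. It assumes a hypothetical video feature (frame-to-frame change in mouth-region pixels) and a hypothetical matching rule (correlating that feature with each audio stream's energy envelope). All function names, the feature, and the scoring rule are assumptions made for illustration. The claimed method is not reproduced in this record and may differ substantially.

```python
from math import sqrt
from statistics import mean


def extract_video_features(mouth_frames):
    """Hypothetical video feature: mean absolute pixel change between
    consecutive mouth-region frames (one value per frame transition)."""
    return [
        mean(abs(c - p) for p, c in zip(prev, cur))
        for prev, cur in zip(mouth_frames, mouth_frames[1:])
    ]


def frame_energy(samples, n_frames):
    """Split audio samples into n_frames equal chunks and return the
    mean squared amplitude of each chunk."""
    hop = len(samples) // n_frames
    return [
        mean(s * s for s in samples[i * hop:(i + 1) * hop])
        for i in range(n_frames)
    ]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def identify_target(audio_by_speaker, target_mouth_frames):
    """Assumed rule: pick the audio stream whose energy envelope best
    tracks the target's mouth motion."""
    video_feats = extract_video_features(target_mouth_frames)
    scores = {
        name: pearson(frame_energy(audio, len(video_feats)), video_feats)
        for name, audio in audio_by_speaker.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: five 2-pixel mouth frames and two 8-sample audio streams.
mouth = [[0, 0], [4, 4], [4, 4], [0, 0], [0, 0]]
audio = {
    "A": [1, 1, 0, 0, 1, 1, 0, 0],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
}
target, scores = identify_target(audio, mouth)
print(target, scores)
```

In this toy input, stream "A" is loud exactly when the mouth region changes, so it scores highest. A practical system would use learned audio and visual embeddings and multi-channel spatial cues rather than a hand-built correlation.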

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.