Patent · US Active

Array geometry agnostic multi-channel personalized speech enhancement

US12230259B2 · kind B2 · utility

Cited by	0
References	2
Claims	18
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Sefik Emre ESKIMEZ · Bellevue, US
Takuya Yoshioka · Bellevue, US
Huaming Wang · Qingdao, CN
Hassan Taherian · Columbus, US
Zhuo Chen · Markham, CA
Xuedong Huang · Bellevue, US

Key dates

Filing date	Dec 17, 2021
Grant date	Feb 18, 2025
Priority date	—
Expiry date	Apr 7, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2021/02087
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Examples of array geometry agnostic multi-channel personalized speech enhancement (PSE) extract speaker embeddings, which represent acoustic characteristics of one or more target speakers, from target speaker enrollment data. Spatial features (e.g., inter-channel phase difference) are extracted from input audio captured by a microphone array. The input audio includes a mixture of speech data of the target speaker(s) and one or more interfering speaker(s). The input audio, the extracted speaker embeddings, and the extracted spatial features are provided to a trained geometry-agnostic PSE model. Output data is produced, which comprises estimated clean speech data of the target speaker(s) that has a reduction (or elimination) of speech data of the interfering speaker(s), without the trained PSE model requiring geometry information for the microphone array.
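
The abstract names inter-channel phase difference (IPD) as an example of a spatial feature extracted from the microphone-array input. The following is a minimal sketch of one common way to compute IPD, not the patented method itself. The function names (`stft`, `ipd_features`), the STFT parameters, the choice of channel 0 as the reference, and the cosine/sine encoding are all assumptions made for illustration. The patent record does not specify them.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT of a 1-D signal; returns (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)

def ipd_features(audio, ref=0, n_fft=512, hop=256):
    """Compute IPD features for multi-channel audio.

    audio: array of shape (channels, samples).
    Returns an array of shape (channels - 1, frames, bins, 2) holding the
    cosine and sine of the phase difference between each non-reference
    channel and the reference channel (hypothetical encoding).
    """
    spec = np.stack([stft(ch, n_fft, hop) for ch in audio])
    others = np.delete(spec, ref, axis=0)
    # The angle of X_c * conj(X_ref) is phase(X_c) - phase(X_ref), wrapped to (-pi, pi].
    ipd = np.angle(others * np.conj(spec[ref]))
    return np.stack([np.cos(ipd), np.sin(ipd)], axis=-1)
```

Because the output has one slice per non-reference channel, its size depends on the channel count but not on microphone positions. How the trained model consumes a variable number of channels is outside the scope of this sketch.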

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.