Patent · US Active

Array geometry agnostic multi-channel personalized speech enhancement

US12230259B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 17, 2021
Grant date: Feb 18, 2025
Priority date
Expiry date: Apr 7, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2021/02087
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Examples of array geometry agnostic multi-channel personalized speech enhancement (PSE) extract speaker embeddings, which represent acoustic characteristics of one or more target speakers, from target speaker enrollment data. Spatial features (e.g., inter-channel phase difference) are extracted from input audio captured by a microphone array. The input audio includes a mixture of speech data of the target speaker(s) and one or more interfering speaker(s). The input audio, the extracted speaker embeddings, and the extracted spatial features are provided to a trained geometry-agnostic PSE model. Output data is produced, which comprises estimated clean speech data of the target speaker(s) that has a reduction (or elimination) of speech data of the interfering speaker(s), without the trained PSE model requiring geometry information for the microphone array.
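
The abstract names inter-channel phase difference (IPD) as an example of a spatial feature. As an illustration only, a minimal NumPy sketch of one common way to compute IPD from multi-channel audio is given below. The function names, the Hann-windowed STFT, the frame length and hop size, and the choice of channel 0 as the reference are assumptions made for this example; they are not taken from the patent, and the sketch does not cover the speaker embeddings or the PSE model itself.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Single-channel STFT with a Hann window; returns (freq_bins, frames).

    Hypothetical helper for this sketch; parameters are illustrative.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T

def inter_channel_phase_difference(audio, ref=0, frame_len=512, hop=256):
    """Compute IPD of every non-reference channel against channel `ref`.

    audio: array of shape (channels, samples).
    Returns radians in [-pi, pi], shape (channels - 1, freq_bins, frames).
    """
    spec = np.stack([stft(ch, frame_len, hop) for ch in audio])
    others = np.delete(spec, ref, axis=0)
    # The phase of X_c * conj(X_ref) equals phase(X_c) - phase(X_ref), wrapped.
    return np.angle(others * np.conj(spec[ref]))
```

For example, calling inter_channel_phase_difference on an array of shape (3, 2048) with the default parameters yields an array of shape (2, 257, 7): one IPD map per non-reference channel. Note that a feature defined per channel pair like this still depends on how many microphones there are; how the patented model removes that dependence is not described in the abstract.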

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.