Patent · US Active

Speaker separation based on real-time latent speaker state characterization

US12315516B2 · kind B2 · utility

Cited by: 0
References: 3
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 14, 2023
Grant date: May 27, 2025
Priority date
Expiry date: Sep 14, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L21/0272
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
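
The flow described in the abstract (chunk the incoming stream, attach an identity embedding to each chunk, then group chunks into per-speaker segments) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how embeddings are computed or compared, so the sketch assumes one embedding per chunk, cosine similarity against running per-speaker centroids, and a fixed threshold. All names (chunk_stream, segment_stream, toy_embed, Segment, threshold) are hypothetical, and toy_embed is a placeholder standing in for a learned embedding model.

```python
import math
from typing import Callable, Iterable, Iterator, List, NamedTuple, Sequence


class Segment(NamedTuple):
    speaker: int  # speaker index, numbered in order of first appearance
    start: int    # first sample index (inclusive)
    end: int      # last sample index (exclusive)


def chunk_stream(samples: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield fixed-size chunks as samples arrive; the final chunk may be shorter."""
    buf: List[float] = []
    for s in samples:
        buf.append(s)
        if len(buf) == chunk_size:
            yield buf
            buf = []
    if buf:
        yield buf


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def toy_embed(chunk: Sequence[float]) -> List[float]:
    """Placeholder embedding (assumption): positive vs. negative sample mass."""
    return [sum(max(x, 0.0) for x in chunk), sum(max(-x, 0.0) for x in chunk)]


def segment_stream(
    samples: Iterable[float],
    embed: Callable[[Sequence[float]], Sequence[float]],
    chunk_size: int,
    threshold: float = 0.8,  # assumed similarity cutoff, not from the patent
) -> List[Segment]:
    centroids: List[List[float]] = []  # running mean embedding per speaker
    counts: List[int] = []             # chunks assigned to each speaker so far
    segments: List[Segment] = []
    pos = 0
    for chunk in chunk_stream(samples, chunk_size):
        e = list(embed(chunk))
        sims = [cosine(e, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is None or sims[best] < threshold:
            # No known speaker is similar enough: register a new one.
            centroids.append(e)
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Fold this chunk into the matched speaker's running mean.
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1) for c, x in zip(centroids[best], e)]
            counts[best] = n + 1
        end = pos + len(chunk)
        if segments and segments[-1].speaker == best:
            # Same speaker as the previous chunk: extend the open segment.
            segments[-1] = segments[-1]._replace(end=end)
        else:
            segments.append(Segment(best, pos, end))
        pos = end
    return segments
```

For example, calling segment_stream([1.0] * 8 + [-1.0] * 4 + [1.0] * 4, toy_embed, chunk_size=4) on this synthetic signal yields three segments: speaker 0 over samples 0 to 8, speaker 1 over 8 to 12, and speaker 0 again over 12 to 16. Because each chunk is assigned as soon as it is complete, the sketch processes the stream incrementally rather than waiting for the full recording.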

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.