Speaker separation based on real-time latent speaker state characterization
US12315516B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 14, 2023 |
| Grant date | May 27, 2025 |
| Priority date | — |
| Expiry date | Sep 14, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L21/0272
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
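The pipeline the abstract describes (chunk the incoming stream, attach an identity embedding to each chunk, then group chunks into speaker-labelled segments) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the names `chunk_stream`, `toy_embedding`, and `segment_stream`, the fixed chunk size, the cosine-similarity threshold, the use of a single embedding per chunk, and the greedy matching against the first embedding seen for each speaker. A real system would use a learned speaker-embedding model in place of `toy_embedding`.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, List

CHUNK_SIZE = 4        # samples per chunk (hypothetical; real chunks are far longer)
SIM_THRESHOLD = 0.8   # cosine similarity needed to match a known speaker (hypothetical)


@dataclass
class Segment:
    speaker: int  # index of the speaker, in order of first appearance
    start: int    # first sample index, inclusive
    end: int      # last sample index, exclusive


def chunk_stream(samples: Iterable[float], size: int = CHUNK_SIZE) -> Iterator[List[float]]:
    """Yield fixed-size chunks as samples arrive; a trailing partial chunk is dropped."""
    buf: List[float] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == size:
            yield buf
            buf = []


def toy_embedding(chunk: List[float]) -> List[float]:
    """Stand-in for a learned identity embedding: [mean, mean absolute value]."""
    n = len(chunk)
    return [sum(chunk) / n, sum(abs(x) for x in chunk) / n]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def segment_stream(samples: Iterable[float],
                   size: int = CHUNK_SIZE,
                   threshold: float = SIM_THRESHOLD) -> List[Segment]:
    """Assign each chunk to a speaker and merge consecutive same-speaker chunks."""
    references: List[List[float]] = []  # first embedding seen for each speaker
    segments: List[Segment] = []
    for i, chunk in enumerate(chunk_stream(samples, size)):
        emb = toy_embedding(chunk)
        sims = [cosine(emb, ref) for ref in references]
        if sims and max(sims) >= threshold:
            speaker = sims.index(max(sims))   # closest known speaker
        else:
            speaker = len(references)         # no match: register a new speaker
            references.append(emb)
        start, end = i * size, (i + 1) * size
        if segments and segments[-1].speaker == speaker:
            segments[-1].end = end            # extend the current segment
        else:
            segments.append(Segment(speaker, start, end))
    return segments


if __name__ == "__main__":
    # Two synthetic "speakers": positive-valued and negative-valued samples.
    stream = [1.0] * 8 + [-1.0] * 4 + [1.0] * 4
    for seg in segment_stream(stream):
        print(seg)
```

Because `segment_stream` consumes an iterable one chunk at a time, it can be fed a generator that produces samples as they arrive, which mirrors the "as the stream is obtained" wording of the abstract. The greedy threshold match is only one simple way to group embeddings; the abstract does not say which grouping method is used.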
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.