Patent · US Active

Speaker separation based on real-time latent speaker state characterization

US12315516B2 · kind B2 · utility

Cited by	0
References	3
Claims	20
Family size	0

Assignee

Unity Technologies SF · US

Inventors

Valentin Alain Jean Perret · Saint-Remy, FR
Nándor Kedves · Adliswil, CH
Nicolas Lucien Perony · Zürich, CH

Key dates

Filing date	Sep 14, 2023
Grant date	May 27, 2025
Priority date	—
Expiry date	Sep 14, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L21/0272
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
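The flow described in the abstract (chunk the incoming stream, attach an identity embedding to each chunk, then group chunks into per-speaker segments) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not say how embeddings are computed or compared, so `toy_embedding`, the cosine `threshold`, the running-mean centroid update, and all other names below are hypothetical stand-ins chosen for illustration. The sketch also assumes a single embedding per chunk, whereas the abstract allows one or more.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterable, Iterator, List, Tuple

Embedding = Tuple[float, float]


@dataclass
class Segment:
    speaker: int  # index of the speaker cluster (hypothetical labeling)
    start: int    # first sample index, inclusive
    end: int      # last sample index, exclusive


def chunk_stream(samples: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield fixed-size chunks as samples arrive; the final chunk may be shorter."""
    buf: List[float] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == chunk_size:
            yield buf
            buf = []
    if buf:
        yield buf


def toy_embedding(chunk: List[float]) -> Embedding:
    """Placeholder for a learned identity embedding: (mean, mean absolute value)."""
    n = len(chunk)
    return (sum(chunk) / n, sum(abs(x) for x in chunk) / n)


def cosine(a: Embedding, b: Embedding) -> float:
    dot = a[0] * b[0] + a[1] * b[1]
    norm = sqrt(a[0] ** 2 + a[1] ** 2) * sqrt(b[0] ** 2 + b[1] ** 2)
    return dot / norm if norm else 0.0


def segment_stream(samples: Iterable[float], chunk_size: int = 4,
                   threshold: float = 0.5) -> List[Segment]:
    """Assign each chunk to the closest speaker centroid and merge adjacent chunks."""
    centroids: List[Embedding] = []
    counts: List[int] = []
    segments: List[Segment] = []
    position = 0
    for chunk in chunk_stream(samples, chunk_size):
        emb = toy_embedding(chunk)
        scores = [cosine(emb, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is None or scores[best] < threshold:
            # No sufficiently similar speaker seen so far: open a new cluster.
            centroids.append(emb)
            counts.append(1)
            speaker = len(centroids) - 1
        else:
            # Fold the new embedding into the matched centroid (running mean).
            speaker = best
            k, c = counts[best], centroids[best]
            centroids[best] = ((c[0] * k + emb[0]) / (k + 1),
                               (c[1] * k + emb[1]) / (k + 1))
            counts[best] = k + 1
        end = position + len(chunk)
        if segments and segments[-1].speaker == speaker:
            segments[-1].end = end  # extend the current segment
        else:
            segments.append(Segment(speaker, position, end))
        position = end
    return segments
```

With a synthetic stream such as `[1.0] * 8 + [-1.0] * 4 + [1.0] * 4`, the sketch groups the first eight samples under one speaker index, the next four under a second, and the last four under the first again. A real system would replace `toy_embedding` with a trained speaker-embedding model and would likely handle overlapping speech, which this sketch ignores.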

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.