Patent · US Active

Conditioned separation of arbitrary sounds based on machine learning models

US12361964B2 · kind B2 · utility

Cited by	0
References	1
Claims	23
Family size	0

Assignee

Google LLC · US

Inventors

Beat Gfeller · Zürich, CH
Kevin Kilgour · Mountain View, US
Marco Tagliasacchi · Lugano, CH
Aren Jansen · Mountain View, US
Scott Wisdom · Boston, US
Qingqing Huang · Palo Alto, US

Key dates

Filing date	Jun 24, 2022
Grant date	Jul 15, 2025
Priority date	—
Expiry date	Aug 15, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/30
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Example methods include receiving training data comprising a plurality of audio clips and a plurality of textual descriptions of audio. The methods include generating a shared representation comprising a joint embedding. An audio embedding of a given audio clip is within a threshold distance of a text embedding of a textual description of the given audio clip. The methods include generating, based on the joint embedding, a conditioning vector and training, based on the conditioning vector, a neural network to: receive (i) an input audio waveform, and (ii) an input comprising one or more of an input textual description of a target audio source in the input audio waveform, or an audio sample of the target audio source, separate audio corresponding to the target audio source from the input audio waveform, and output the separated audio corresponding to the target audio source in response to the receiving of the input.
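The data flow in the abstract can be illustrated with a minimal sketch in Python with NumPy. It is not the patented implementation. Every name, dimension, and weight below is hypothetical, the weights are random and untrained, Euclidean distance is assumed for the "threshold distance" test, and the scale-and-shift (FiLM-style) conditioning is an assumed stand-in for whatever conditioning mechanism the claims cover. The sketch only shows how one embedding in a joint space, taken from either a text description or an audio sample, could yield a conditioning vector that steers a mask over the input waveform.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
EMB_DIM, COND_DIM, FRAME, HIDDEN = 8, 4, 16, 32


def l2_normalize(x):
    """Scale a vector to unit length so distances are comparable."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def within_threshold(audio_emb, text_emb, threshold):
    """Check the joint-embedding property: paired embeddings lie close together.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    return bool(np.linalg.norm(audio_emb - text_emb) <= threshold)


def conditioning_vector(joint_emb, proj):
    """Map a joint-space embedding (text or audio) to a conditioning vector."""
    return proj @ joint_emb


def separate(waveform, cond, params):
    """Toy conditioned separator: the conditioning vector scales and shifts
    hidden features, which then produce a mask in (0, 1) over the waveform.

    Assumes len(waveform) is a multiple of FRAME.
    """
    frames = waveform.reshape(-1, FRAME)                # (n_frames, FRAME)
    hidden = np.tanh(frames @ params["w_in"])           # (n_frames, HIDDEN)
    gamma = params["w_gamma"] @ cond                    # (HIDDEN,)
    beta = params["w_beta"] @ cond                      # (HIDDEN,)
    hidden = gamma * hidden + beta                      # conditioning step
    mask = 1.0 / (1.0 + np.exp(-(hidden @ params["w_out"])))
    return (mask * frames).reshape(-1)


# Random, untrained parameters (placeholders for a trained network).
params = {
    "w_in": 0.1 * rng.normal(size=(FRAME, HIDDEN)),
    "w_gamma": 0.1 * rng.normal(size=(HIDDEN, COND_DIM)),
    "w_beta": 0.1 * rng.normal(size=(HIDDEN, COND_DIM)),
    "w_out": 0.1 * rng.normal(size=(HIDDEN, FRAME)),
}
proj = rng.normal(size=(COND_DIM, EMB_DIM))

# Stand-ins for encoder outputs: a text embedding and a nearby audio embedding.
text_emb = l2_normalize(rng.normal(size=EMB_DIM))
audio_emb = l2_normalize(text_emb + 0.05 * rng.normal(size=EMB_DIM))

# Either modality can supply the conditioning vector.
cond_from_text = conditioning_vector(text_emb, proj)
cond_from_audio = conditioning_vector(audio_emb, proj)

mixture = rng.normal(size=FRAME * 10)
separated = separate(mixture, cond_from_text, params)
```

Because the paired text and audio embeddings sit close together in the joint space, conditioning on either one gives a similar conditioning vector. That is what would let a single separator accept a textual description or an audio sample of the target source.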

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.