Patent · US Active

Unsupervised learning of semantic audio representations

US11335328B2 · kind B2 · utility

0 Cited by

0 References

20 Claims

0 Family size

Assignee

Google LLC · US

Inventors

Aren Jansen · Mountain View, US
Manoj Plakal · New York, US
Richard Channing Moore, III · Brooklyn, US
Shawn Hershey · Brookline, US
Ratheet Pandya · Mountain View, US
Ryan M. Rifkin · Oakland, US
Jiayang Liu · Mountain View, US
Daniel Ellis · New York, US

Key dates

Filing date	Oct 26, 2018
Grant date	May 17, 2022
Priority date	—
Expiry date	Oct 26, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/0635
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the multidimensional embeddings such that the embeddings can be used to cluster the contents of a corpus of audio recordings, to facilitate a query-by-example lookup from the corpus, to allow a small number of manually-labeled audio recordings to be generalized, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represents a respective heuristic about the semantic structure of audio recordings.
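
To illustrate the triplet loss mentioned in the abstract, here is a minimal sketch of one common hinge-style formulation. The abstract does not state the distance measure, the margin, or the embedding dimension, so the squared Euclidean distance, the default margin of 1.0, the two-dimensional toy vectors, and the function names below are all assumptions for illustration, not the patented method.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # hypothetical value; the abstract gives none
) -> float:
    """Hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy embeddings (illustrative only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # squared distance 1 from the anchor
negative = [3.0, 0.0]   # squared distance 9 from the anchor

# Constraint satisfied: 1 - 9 + 1 is negative, so the loss clamps to zero.
loss_satisfied = triplet_loss(anchor, positive, negative)

# Roles swapped: 9 - 1 + 1 = 9, so the triplet is penalized.
loss_violated = triplet_loss(anchor, negative, positive)
```

In training, a loss of this kind would be averaged over many sampled triplets and minimized with respect to the parameters of the embedding model, so that recordings judged similar by a sampling heuristic end up close together in the embedding space.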

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.