Patent · US Active

Sequence models for audio scene recognition

US10930301B1 · kind B1 · utility

Cited by: 1
References: 3
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 19, 2020
Grant date: Feb 23, 2021
Priority date
Expiry date: Aug 19, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L25/30
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method is provided. Intermediate audio features are generated from an input acoustic sequence. Using a nearest neighbor search, segments of the input acoustic sequence are classified based on the intermediate audio features to generate a final intermediate feature as a classification for the input acoustic sequence. Each segment corresponds to a respective different acoustic window. The generating step includes learning the intermediate audio features from Multi-Frequency Cepstral Component (MFCC) features extracted from the input acoustic sequence. The generating step includes dividing the same scene into the different acoustic windows having varying MFCC features. The generating step includes feeding the MFCC features of each of the different acoustic windows into respective LSTM units such that a hidden state of each respective LSTM unit is passed through an attention layer to identify feature correlations between hidden states at different time steps corresponding to different ones of the different acoustic windows.
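
Illustrative sketch (not taken from the patent text): the abstract describes a pipeline of per-window MFCC features, LSTM units, an attention layer over the hidden states, and a nearest neighbor search. The Python below is one possible minimal reading of that pipeline, under several assumptions. The MFCC windows are random toy vectors rather than features extracted from real audio. The LSTM is a single bias-free cell with untrained random weights, unrolled over the windows. The attention is plain dot-product attention using the last hidden state as the query. The scene labels "street" and "cafe" are hypothetical. All function and class names are invented for this example.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Bias-free LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, n_in, n_hid, rng):
        self.n_hid = n_hid
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.W = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hid)]
                   for _ in range(n_hid)]
            for gate in "ifog"
        }

    def step(self, x, h, c):
        z = list(x) + list(h)  # window features joined with the previous hidden state
        pre = {gate: [dot(row, z) for row in rows] for gate, rows in self.W.items()}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def attention_pool(hidden_states):
    """Softmax-weighted sum of hidden states (assumed query: last hidden state)."""
    query = hidden_states[-1]
    scores = [dot(query, h) for h in hidden_states]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * h[k] for w, h in zip(weights, hidden_states))
              for k in range(len(query))]
    return pooled, weights


def embed(windows, cell):
    """Run the cell over all windows, then pool the hidden states with attention."""
    h = [0.0] * cell.n_hid
    c = [0.0] * cell.n_hid
    states = []
    for x in windows:
        h, c = cell.step(x, h, c)
        states.append(h)
    return attention_pool(states)


def nearest_neighbor(query, labeled):
    """Return the label of the stored embedding closest to the query (Euclidean)."""
    return min(labeled, key=lambda item: math.dist(query, item[1]))[0]


def toy_windows(n_windows, n_coeffs, rng):
    """Stand-in for per-window MFCC vectors; real features would come from audio."""
    return [[rng.uniform(-1, 1) for _ in range(n_coeffs)] for _ in range(n_windows)]


rng = random.Random(0)
cell = LSTMCell(n_in=5, n_hid=4, rng=rng)
street = toy_windows(6, 5, rng)  # hypothetical scene "street"
cafe = toy_windows(6, 5, rng)    # hypothetical scene "cafe"
references = [("street", embed(street, cell)[0]), ("cafe", embed(cafe, cell)[0])]

query_embedding, weights = embed(street, cell)
label = nearest_neighbor(query_embedding, references)
print(label)
```

The query reuses the toy "street" windows, so its embedding coincides with the stored "street" reference. In a trained system the weights would be learned and the query would be unseen audio. The sketch shows only the data flow from windows to a pooled feature to a nearest-neighbor label.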

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.