Patent · US Active

Speech/music discrimination

US9613640B1 · kind B1 · utility

Cited by	22

References	6

Claims	11

Family size	0

Assignee

Audyssey Laboratories, Inc. · US

Inventors

Ramasamy Govindaraju Balamurali · Los Angeles, US
Chandra Rajagopal · Los Angeles, US

Key dates

Filing date	Jan 14, 2016
Grant date	Apr 4, 2017
Priority date	—
Expiry date	Jan 14, 2036

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/21
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A speech/music discrimination method evaluates the standard deviation of the separations between envelope peaks, the loudness ratio, and the smoothed energy difference. The envelope is searched for peaks above a threshold. The standard deviation of the separations between peaks is calculated. A lower standard deviation is indicative of speech; a higher standard deviation is indicative of non-speech. The ratio between minimum and maximum loudness in recent input signal data frames is calculated. If this ratio corresponds to the dynamic range characteristic of speech, it is another indication that the input signal is speech content. Smoothed energies of the frames from the left and right input channels are computed and compared. Similar (e.g., highly correlated) left and right channel smoothed energies are indicative of speech. Dissimilar (e.g., uncorrelated) left and right channel smoothed energies are indicative of non-speech material. The results of the three tests are compared to make a speech/music decision.
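
The three tests named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract gives no numeric values and does not say how the three results are combined, so every function name, every threshold (`peak_threshold`, `max_gap_std`, `ratio_range`, `max_energy_diff`), the exponential smoothing constant `alpha`, and the two-of-three majority vote are assumptions chosen for illustration only.

```python
from statistics import pstdev


def peak_separation_std(envelope, threshold):
    """Std. deviation of the gaps between envelope peaks above `threshold`.

    Returns None when fewer than two gaps are found.
    """
    peaks = [
        i for i in range(1, len(envelope) - 1)
        if envelope[i] > threshold
        and envelope[i] > envelope[i - 1]
        and envelope[i] >= envelope[i + 1]
    ]
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return pstdev(gaps) if len(gaps) >= 2 else None


def loudness_ratio(frame_loudness):
    """Ratio of minimum to maximum loudness over the recent frames."""
    loudest = max(frame_loudness)
    return min(frame_loudness) / loudest if loudest > 0 else 1.0


def smoothed_energy_difference(left_frames, right_frames, alpha=0.9):
    """Mean normalized difference of smoothed left/right frame energies.

    Exponential smoothing with `alpha` is an assumption; the abstract only
    says the energies are smoothed. Frames are assumed to be non-empty.
    """
    smooth_l = smooth_r = 0.0
    total, count = 0.0, 0
    for lf, rf in zip(left_frames, right_frames):
        energy_l = sum(x * x for x in lf) / len(lf)
        energy_r = sum(x * x for x in rf) / len(rf)
        smooth_l = alpha * smooth_l + (1 - alpha) * energy_l
        smooth_r = alpha * smooth_r + (1 - alpha) * energy_r
        total += abs(smooth_l - smooth_r) / (smooth_l + smooth_r + 1e-12)
        count += 1
    return total / count if count else 0.0


def classify(envelope, frame_loudness, left_frames, right_frames,
             peak_threshold=0.5, max_gap_std=1.0,
             ratio_range=(0.0, 0.3), max_energy_diff=0.2):
    """Hypothetical two-of-three vote over the three tests."""
    gap_std = peak_separation_std(envelope, peak_threshold)
    votes = [
        # Low spread of peak separations suggests speech.
        gap_std is not None and gap_std <= max_gap_std,
        # Loudness ratio inside an assumed speech-like range.
        ratio_range[0] <= loudness_ratio(frame_loudness) <= ratio_range[1],
        # Similar left/right smoothed energies suggest speech.
        smoothed_energy_difference(left_frames, right_frames) <= max_energy_diff,
    ]
    return "speech" if sum(votes) >= 2 else "music"
```

With these assumed thresholds, an input with evenly spaced envelope peaks, a wide loudness range, and identical left and right channels is labeled as speech, while irregular peaks, a nearly constant loudness, and a one-sided stereo signal are labeled as music. Any real use would need thresholds tuned on labeled audio.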

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.