Speech detection for noisy conditions
US6480823B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 24, 1998 |
| Grant date | Nov 12, 2002 |
| Priority date | — |
| Expiry date | Mar 24, 2018 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L25/87
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The input signal is transformed into the frequency domain and then subdivided into bands corresponding to different frequency ranges. Adaptive thresholds are applied to the data from each frequency band separately. Thus the short-term band-limited energies are tested for the presence or absence of a speech signal. The adaptive threshold values are independently updated for each of the signal paths, using a histogram data structure to accumulate long-term data representing the mean and variance of energy within the respective frequency band. Endpoint detection is performed by a state machine that transitions from the speech absent state to the speech present state, and vice versa, depending on the results of the threshold comparisons. A partial speech detection system handles cases in which the input signal is truncated.
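The pipeline the abstract describes (frequency-domain band split, per-band adaptive thresholds derived from a histogram, and a two-state endpoint detector) can be illustrated with a minimal sketch. This is not the patented implementation. The threshold rule (histogram mean plus `k` standard deviations, with a floor of `margin_db`), the forgetting factor, the choice to update the histograms only on frames judged to be background, the `n_on`/`n_off` hangover counts, and all names (`band_energies_db`, `BandThreshold`, `detect_speech`, `synthetic_frames`) are assumptions made for illustration. The partial speech detection path is not covered.

```python
import cmath
import math
import random


def band_energies_db(frame, band_edges):
    """Short-term energy (dB) of one frame in each band of DFT bins [lo, hi)."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):  # naive DFT keeps the sketch dependency-free
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        power.append(abs(s) ** 2 / n)
    return [10.0 * math.log10(sum(power[lo:hi]) + 1e-12) for lo, hi in band_edges]


class BandThreshold:
    """Histogram of background energies for one band (assumed design, not the patent's)."""

    def __init__(self, lo_db=-80.0, hi_db=40.0, n_bins=120, k=3.0, margin_db=6.0, forget=0.98):
        self.lo_db, self.width = lo_db, (hi_db - lo_db) / n_bins
        self.k, self.margin_db, self.forget = k, margin_db, forget
        self.counts = [0.0] * n_bins

    def update(self, e_db):
        idx = int((e_db - self.lo_db) / self.width)
        idx = min(max(idx, 0), len(self.counts) - 1)  # clamp out-of-range energies
        self.counts = [c * self.forget for c in self.counts]  # slowly forget old data
        self.counts[idx] += 1.0

    def threshold(self):
        total = sum(self.counts)
        if total == 0.0:
            return math.inf  # no background estimate yet
        centers = [self.lo_db + (i + 0.5) * self.width for i in range(len(self.counts))]
        mean = sum(c * x for c, x in zip(self.counts, centers)) / total
        var = sum(c * (x - mean) ** 2 for c, x in zip(self.counts, centers)) / total
        # Assumed rule: mean + k * std, but never closer than margin_db to the mean.
        return mean + max(self.k * math.sqrt(var), self.margin_db)


def detect_speech(frames, band_edges, n_on=3, n_off=5):
    """Return one bool per frame: True while the state machine is in 'speech present'."""
    bands = [BandThreshold() for _ in band_edges]  # one independent threshold per band
    speech, run, states = False, 0, []
    for frame in frames:
        energies = band_energies_db(frame, band_edges)
        above = any(e > b.threshold() for e, b in zip(energies, bands))
        if not speech:
            if above:
                run += 1
            else:
                run = 0
                for e, b in zip(energies, bands):  # learn background only from quiet frames
                    b.update(e)
            if run >= n_on:  # enough consecutive hits: speech absent -> speech present
                speech, run = True, 0
        else:
            run = 0 if above else run + 1
            if run >= n_off:  # enough consecutive misses: speech present -> speech absent
                speech, run = False, 0
        states.append(speech)
    return states


def synthetic_frames(n=64, seed=0):
    """Hypothetical input: 20 noise frames, 20 tone-plus-noise frames, 20 noise frames."""
    rng = random.Random(seed)
    frames = []
    for t in range(60):
        amp = 1.0 if 20 <= t < 40 else 0.0
        frames.append([amp * math.sin(2 * math.pi * 4 * i / n) + rng.gauss(0.0, 0.01)
                       for i in range(n)])
    return frames
```

Calling `detect_speech(synthetic_frames(), [(1, 9), (9, 17), (17, 33)])` returns a per-frame list of booleans. The first frames seed the histograms, so the sketch assumes the recording starts with background noise only. A practical system would replace the naive DFT with a windowed FFT and tune every constant to its own data.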
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.