Patent · US Active

Speech signal enhancement using visual information

US9293151B2 · kind B2 · utility

Cited by: 46
References: 4
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 17, 2011
Grant date: Mar 22, 2016
Priority date
Expiry date: Dec 19, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2021/02082
  • WIPO field: Audio-visual technology
  • WIPO sector: Electrical engineering

Abstract

Visual information is used to alter or set an operating parameter of an audio signal processor, other than a beamformer. A digital camera captures visual information about a scene that includes a human speaker and/or a listener. The visual information is analyzed to ascertain information about acoustics of a room. A distance between the speaker and a microphone may be estimated, and this distance estimate may be used to adjust an overall gain of the system. Distances among, and locations of, the speaker, the listener, the microphone, a loudspeaker and/or a sound-reflecting surface may be estimated. These estimates may be used to estimate reverberations within the room and adjust aggressiveness of an anti-reverberation filter, based on an estimated ratio of direct to indirect (reverberated) sound energy expected to reach the microphone. In addition, orientation of the speaker or the listener, relative to the microphone or the loudspeaker, can also be estimated, and this estimate may be used to adjust frequency-dependent filter weights to compensate for uneven frequency propagation of acoustic signals from a mouth, or to a human ear, about a human head.
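
The two adjustments the abstract describes, overall gain from the speaker-to-microphone distance and anti-reverberation aggressiveness from the direct-to-indirect energy ratio, can be illustrated with a minimal sketch. Nothing below is taken from the patent's claims: the free-field inverse-distance law, the critical-distance model of reverberation, the linear mapping to an aggressiveness value, and every function name, parameter, and default are assumptions chosen for illustration only.

```python
import math


def gain_db_for_distance(distance_m, reference_m=1.0, max_gain_db=20.0):
    """Gain in dB that offsets the level change between a reference
    distance and the estimated speaker-to-microphone distance.

    Assumes the free-field inverse-distance law (about 6 dB per doubling
    of distance). The clamp of +/- max_gain_db is a hypothetical safety limit.
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    gain = 20.0 * math.log10(distance_m / reference_m)
    return max(-max_gain_db, min(max_gain_db, gain))


def direct_to_reverberant_db(distance_m, critical_distance_m):
    """Estimated direct-to-reverberant energy ratio in dB.

    Assumes a diffuse reverberant field, so direct and reverberant energy
    are equal (0 dB) at the room's critical distance.
    """
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(critical_distance_m / distance_m)


def dereverb_aggressiveness(drr_db, low_db=-10.0, high_db=10.0):
    """Map the ratio to a value in [0, 1]; a lower ratio (more reverberant
    energy) gives a more aggressive setting. The linear mapping and its
    end points are hypothetical.
    """
    t = (high_db - drr_db) / (high_db - low_db)
    return max(0.0, min(1.0, t))


if __name__ == "__main__":
    # Hypothetical camera-derived estimates: speaker 2 m from the
    # microphone, in a room with a 1 m critical distance.
    distance = 2.0
    drr = direct_to_reverberant_db(distance, critical_distance_m=1.0)
    print(round(gain_db_for_distance(distance), 2))
    print(round(drr, 2))
    print(round(dereverb_aggressiveness(drr), 2))
```

In a real system the distance and room geometry would come from the camera analysis the abstract describes, and the critical distance would be derived from the estimated room acoustics rather than supplied as a constant.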

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.