Patent · US Active

System and method for enhancing speech activity detection using facial feature detection

US9318129B2 · kind B2 · utility

Cited by	14

References	9

Claims	21

Family size	0

Assignee

AT&T Intellectual Property I, L.P. · US

Inventors

Brant J. Vasilieff · Glendale, US
Patrick Ehlen · New York, US
Jay H. Lieske, Jr. · Los Angeles, US

Key dates

Filing date	Jul 18, 2011
Grant date	Apr 19, 2016
Priority date	—
Expiry date	Oct 19, 2032

Classification

Technology area (CPC H)	Electricity
CPC primary	H04N1/00403
WIPO field	Audio-visual technology
WIPO sector	Electrical engineering

Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing audio. A system configured to practice the method monitors, via a processor of a computing device, an image feed of a user interacting with the computing device and identifies an audio start event in the image feed based on face detection of the user looking at the computing device or a specific region of the computing device. The image feed can be a video stream. The audio start event can be based on a head size, orientation or distance from the computing device, eye position or direction, device orientation, mouth movement, and/or other user features. Then the system initiates processing of a received audio signal based on the audio start event. The system can also identify an audio end event in the image feed and end processing of the received audio signal based on the end event.
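
The gating idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `FrameObservation`, `VisualSpeechGate`, and `gate_audio`, the per-frame boolean inputs, and the consecutive-frame thresholds are all assumptions made for illustration. The sketch assumes some upstream face detector has already reduced each video frame to two flags, and that each frame is paired with one audio chunk.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FrameObservation:
    """Hypothetical per-frame output of a face detector (not from the patent)."""
    face_present: bool
    looking_at_device: bool


class VisualSpeechGate:
    """Turns a stream of frame observations into audio start/end events.

    Assumption: an event fires after a fixed number of consecutive
    attentive (or inattentive) frames, which debounces detector noise.
    """

    def __init__(self, start_frames: int = 3, end_frames: int = 5) -> None:
        self.start_frames = start_frames
        self.end_frames = end_frames
        self.active = False          # True while audio should be processed
        self._attentive_run = 0      # consecutive frames looking at the device
        self._inattentive_run = 0    # consecutive frames not looking

    def update(self, obs: FrameObservation) -> Optional[str]:
        """Consume one frame; return "start", "end", or None."""
        attentive = obs.face_present and obs.looking_at_device
        if attentive:
            self._attentive_run += 1
            self._inattentive_run = 0
        else:
            self._inattentive_run += 1
            self._attentive_run = 0

        if not self.active and self._attentive_run >= self.start_frames:
            self.active = True
            return "start"
        if self.active and self._inattentive_run >= self.end_frames:
            self.active = False
            return "end"
        return None


def gate_audio(
    observations: Iterable[FrameObservation],
    audio_chunks: Iterable[object],
    gate: Optional[VisualSpeechGate] = None,
) -> List[object]:
    """Return only the audio chunks that fall between a start and an end event."""
    gate = gate or VisualSpeechGate()
    processed = []
    for obs, chunk in zip(observations, audio_chunks):
        gate.update(obs)
        if gate.active:
            processed.append(chunk)
    return processed
```

In this sketch, chunks that arrive before the start event or after the end event are simply dropped. A real system might instead buffer a short window of audio before the start event so that the first syllable is not clipped.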

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.