Patent · US Active

Using combined audio and vision-based cues for voice command-and-control

US9916832B2 · kind B2 · utility

Cited by	4

References	2

Claims	18

Family size	0

Assignee

SENSORY, INCORPORATED · US

Inventor

Todd F. Mozer · Los Altos, US

Key dates

Filing date	Feb 18, 2016
Grant date	Mar 13, 2018
Priority date	—
Expiry date	Feb 21, 2036

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/227
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Techniques for leveraging a combination of audio-based and vision-based cues for voice command-and-control are provided. In one embodiment, an electronic device can identify one or more audio-based cues in a received audio signal that pertain to a possible utterance of a predefined trigger phrase, and identify one or more vision-based cues in a received video signal that pertain to a possible utterance of the predefined trigger phrase. The electronic device can further determine a degree of synchronization or correspondence between the one or more audio-based cues and the one or more vision-based cues. The electronic device can then conclude, based on the one or more audio-based cues, the one or more vision-based cues, and the degree of synchronization or correspondence, whether the predefined trigger phrase was actually spoken.
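The decision flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Cue` type, the timestamp-matching notion of synchronization, the equal-weight average, and the `tolerance_s` and `threshold` values are all assumptions chosen for illustration only. The abstract does not specify how cues are represented or combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """Hypothetical cue record (not defined in the patent abstract)."""
    time_s: float      # when the cue was observed, in seconds
    confidence: float  # detector confidence in [0, 1]


def mean_confidence(cues: List[Cue]) -> float:
    """Average detector confidence; 0.0 when no cues were found."""
    return sum(c.confidence for c in cues) / len(cues) if cues else 0.0


def synchronization(audio: List[Cue], video: List[Cue],
                    tolerance_s: float = 0.1) -> float:
    """Assumed measure: fraction of audio cues that have at least one
    video cue within tolerance_s seconds."""
    if not audio or not video:
        return 0.0
    matched = sum(
        1 for a in audio
        if any(abs(a.time_s - v.time_s) <= tolerance_s for v in video)
    )
    return matched / len(audio)


def trigger_spoken(audio: List[Cue], video: List[Cue],
                   threshold: float = 0.6,
                   tolerance_s: float = 0.1) -> bool:
    """Combine audio cues, video cues, and their synchronization into one
    score (assumed equal weighting) and compare it with a threshold."""
    sync = synchronization(audio, video, tolerance_s)
    score = (mean_confidence(audio) + mean_confidence(video) + sync) / 3.0
    return score >= threshold


# Audio cues with lip-movement cues at nearly the same times.
audio_cues = [Cue(0.10, 0.9), Cue(0.35, 0.8)]
synced_video = [Cue(0.12, 0.7), Cue(0.40, 0.9)]
# Confident lip-movement cues that occur seconds after the audio.
unsynced_video = [Cue(2.0, 0.9), Cue(2.5, 0.9)]

accepted = trigger_spoken(audio_cues, synced_video)
rejected = trigger_spoken(audio_cues, unsynced_video)
```

In this sketch, strong but temporally unrelated video cues are not enough to accept the trigger phrase, because the synchronization term drops to zero. That mirrors the abstract's point that the conclusion depends on the degree of correspondence as well as on the cues themselves.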

Source: USPTO / EPO open patent data (objective bibliographic data and citation counts).