Using combined audio and vision-based cues for voice command-and-control
US9916832B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Feb 18, 2016 |
| Grant date | Mar 13, 2018 |
| Priority date | — |
| Expiry date | Feb 21, 2036 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/227
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques for leveraging a combination of audio-based and vision-based cues for voice command-and-control are provided. In one embodiment, an electronic device can identify one or more audio-based cues in a received audio signal that pertain to a possible utterance of a predefined trigger phrase, and identify one or more vision-based cues in a received video signal that pertain to a possible utterance of the predefined trigger phrase. The electronic device can further determine a degree of synchronization or correspondence between the one or more audio-based cues and the one or more vision-based cues. The electronic device can then conclude, based on the one or more audio-based cues, the one or more vision-based cues, and the degree of synchronization or correspondence, whether the predefined trigger phrase was actually spoken.
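The decision flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Cue` type, the timestamp-matching measure of synchronization, the weights, and the threshold are all hypothetical choices made for illustration, and the abstract does not specify how cues are detected or combined.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Cue:
    """Hypothetical cue record: when it occurred and how confident the detector was."""
    time_s: float      # timestamp in seconds
    confidence: float  # detector confidence in [0, 1]


def mean_confidence(cues: Sequence[Cue]) -> float:
    """Average detector confidence; 0.0 when no cues were found."""
    return sum(c.confidence for c in cues) / len(cues) if cues else 0.0


def sync_degree(audio: Sequence[Cue], vision: Sequence[Cue],
                tolerance_s: float = 0.1) -> float:
    """Assumed synchronization measure: the fraction of audio cues that have
    at least one vision cue within tolerance_s seconds."""
    if not audio or not vision:
        return 0.0
    matched = sum(
        1 for a in audio
        if any(abs(a.time_s - v.time_s) <= tolerance_s for v in vision)
    )
    return matched / len(audio)


def trigger_spoken(audio: Sequence[Cue], vision: Sequence[Cue],
                   threshold: float = 0.6,
                   weights: Tuple[float, float, float] = (0.4, 0.3, 0.3)) -> bool:
    """Combine audio cues, vision cues, and their synchronization into one
    score and compare it to a threshold (weights and threshold are assumptions)."""
    w_audio, w_vision, w_sync = weights
    score = (w_audio * mean_confidence(audio)
             + w_vision * mean_confidence(vision)
             + w_sync * sync_degree(audio, vision))
    return score >= threshold


# Audio cues (e.g. phoneme matches) and vision cues (e.g. lip movement).
audio_cues = [Cue(0.10, 0.9), Cue(0.35, 0.8)]
aligned_vision = [Cue(0.12, 0.7), Cue(0.40, 0.9)]
unaligned_vision = [Cue(2.00, 0.7), Cue(2.50, 0.9)]

accepted = trigger_spoken(audio_cues, aligned_vision)
rejected = trigger_spoken(audio_cues, unaligned_vision)
```

With these sample values the aligned cues are accepted and the unaligned ones are rejected, even though the individual detector confidences are identical in both cases. This shows why the degree of synchronization is treated as a separate input to the final decision.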
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.