Eye gaze driven spatio-temporal action localization
US9514363B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 8, 2014 |
| Grant date | Dec 6, 2016 |
| Priority date | — |
| Expiry date | Sep 5, 2034 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06V40/20
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
The disclosure provides an approach for detecting and localizing action in video. In one embodiment, an action detection application receives training video sequences and associated eye gaze fixation data collected from a sample of human viewers. Using the training video sequences and eye gaze data, the action detection application learns a model which includes a latent regions potential term that measures the compatibility of latent spatio-temporal regions with the model, as well as a context potential term that accounts for contextual information that is not directly produced by the appearance and motion of the actor. The action detection application may train this model in, e.g., the latent structural SVM framework by minimizing a cost function which encodes the cost of an incorrect action label prediction and a mislocalization of the eye gaze. During training and thereafter, inferences using the model may be made using an efficient dynamic programming algorithm.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.