Patent · US Active

Video action detection method based on convolutional neural network

US11379711B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 6
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 16, 2017
Grant date: Jul 5, 2022
Priority date
Expiry date: Feb 15, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V20/41
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A video action detection method based on a convolutional neural network (CNN) is disclosed in the field of computer vision recognition technologies. A temporal-spatial pyramid pooling layer is added to the network structure, which removes the network's constraints on its input, speeds up training and detection, and improves the performance of video action classification and temporal localization. The disclosed convolutional neural network includes a convolutional layer, a common pooling layer, a temporal-spatial pyramid pooling layer, and a fully connected layer. The network has two outputs: a category classification output layer and a temporal localization output layer. The disclosed method does not require down-sampling to obtain video clips of different durations; instead, the whole video is input directly at once, which improves efficiency. Moreover, because the network is trained on video clips of the same frequency, differences within a category are not increased, which reduces the learning burden of the network and yields faster model convergence and better detection.
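The role of the temporal-spatial pyramid pooling layer can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the pyramid levels, and the choice of max pooling are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows why such a layer lets feature maps of any duration and spatial size feed a fully connected layer: each pyramid level divides the time, height, and width axes into a fixed number of bins, so the output length depends only on the channel count and the bin counts.

```python
import numpy as np


def bin_edges(size, bins):
    """Split [0, size) into `bins` contiguous windows that cover the whole axis.

    Floor is used for starts and ceil for ends, so every window is non-empty
    whenever size >= 1, even if size < bins (windows then overlap).
    """
    idx = np.arange(bins)
    starts = np.floor(idx * size / bins).astype(int)
    ends = np.ceil((idx + 1) * size / bins).astype(int)
    return list(zip(starts, ends))


def temporal_spatial_pyramid_pool(features,
                                  levels=((1, 1, 1), (2, 2, 2), (4, 2, 2))):
    """Pool a (C, T, H, W) feature map into a fixed-length vector.

    `levels` lists (time_bins, height_bins, width_bins) per pyramid level.
    These values are hypothetical; the abstract does not give them.
    Output length is C * sum(t * h * w over levels), independent of T, H, W.
    """
    _, t, h, w = features.shape
    pooled = []
    for bt, bh, bw in levels:
        for t0, t1 in bin_edges(t, bt):
            for h0, h1 in bin_edges(h, bh):
                for w0, w1 in bin_edges(w, bw):
                    window = features[:, t0:t1, h0:h1, w0:w1]
                    # Max pooling per channel (an assumption; average
                    # pooling would work the same way structurally).
                    pooled.append(window.max(axis=(1, 2, 3)))
    return np.concatenate(pooled)
```

With the assumed levels and 8 channels, a feature map of shape (8, 16, 7, 7) and one of shape (8, 53, 9, 5) both pool to a vector of length 8 × (1 + 8 + 16) = 200, which is what allows clips of different lengths to share the same fully connected layer without down-sampling.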

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.