Patent · US Active

Video action detection method based on convolutional neural network

US11379711B2 · kind B2 · utility

Cited by	0
References	1
Claims	6
Family size	0

Assignee

Peking University Shenzhen Graduate School · CN

Inventors

Wenmin Wang · Nanhu, CN
Zhihao Li · Qingdao, CN
Ronggang Wang · Tangxia, CN
Ge Li · Tangxia, CN
Shengfu Dong · Nanhu, CN
Zhenyu Wang · Morganville, US
Ying Li · Richardson, US
Hui Zhao · Nanhu, CN
Wen Gao · Tianjin, CN

Key dates

Filing date	Aug 16, 2017
Grant date	Jul 5, 2022
Priority date	—
Expiry date	Feb 15, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G06V20/41
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A video action detection method based on a convolutional neural network (CNN) is disclosed in the field of computer vision recognition technologies. A temporal-spatial pyramid pooling layer is added to the network structure, which removes the limitations that the network places on its input, speeds up training and detection, and improves the performance of video action classification and temporal localization. The disclosed convolutional neural network includes a convolutional layer, a common pooling layer, a temporal-spatial pyramid pooling layer, and a fully connected layer. The outputs of the convolutional neural network include a category classification output layer and a time localization calculation result output layer. The disclosed method does not require down-sampling to obtain video clips of different durations; instead, it takes the whole video as direct input at once, which improves efficiency. Moreover, the network is trained on video clips of the same frequency, which avoids increasing differences within a category and thus reduces the learning burden of the network, giving faster model convergence and better detection.
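
The abstract does not spell out how the pyramid pooling layer removes the input limitation, so the following is only a minimal sketch of the general idea, reduced to the temporal axis for brevity. It pools a variable number of per-frame feature vectors into a vector of fixed length, which is what a fully connected layer needs. The function name `temporal_pyramid_pool`, the bin levels `(1, 2, 4)`, and the use of max pooling are illustrative assumptions, not details taken from the patent.

```python
def temporal_pyramid_pool(frames, levels=(1, 2, 4)):
    """Max-pool T per-frame feature vectors into one fixed-length vector.

    frames: list of T feature vectors (lists of floats), T >= 1.
    levels: number of temporal bins per pyramid level (assumed values).
    The output length is len(frames[0]) * sum(levels), independent of T.
    """
    t = len(frames)
    if t == 0:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    pooled = []
    for bins in levels:
        for b in range(bins):
            # Adaptive bin boundaries: floor for the start, ceil for the end,
            # so every bin is non-empty even when t < bins.
            start = (b * t) // bins
            end = ((b + 1) * t + bins - 1) // bins
            window = frames[start:end]
            # Element-wise maximum over the frames that fall in this bin.
            pooled.extend(max(f[d] for f in window) for d in range(dim))
    return pooled


# Two "videos" of different lengths yield vectors of the same length.
short_video = [[1.0, 2.0], [3.0, 0.0], [0.5, 5.0]]
long_video = [[float(i), float(10 - i)] for i in range(10)]
short_vec = temporal_pyramid_pool(short_video)
long_vec = temporal_pyramid_pool(long_video)
```

Because the number of bins is fixed rather than the bin width, a 3-frame and a 10-frame input both produce a 14-element vector here (2 features times 1 + 2 + 4 bins). A full temporal-spatial version would bin the height and width axes of the feature maps in the same way.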

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.