Patent · US Active

Receptive-field-conforming convolution models for video coding

US11025907B2 · kind B2 · utility

5Cited by
6References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateFeb 28, 2019
Grant dateJun 1, 2021
Priority date
Expiry dateFeb 28, 2039

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/082
  • WIPO fieldAudio-visual technology
  • WIPO sectorElectrical engineering

Abstract

Convolutional neural networks (CNN) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting a stride value equal to a kernel size. The block has a N×N size, and a smallest partition output for the block has a S×S size. Classification layers of each classifier receive feature maps having a feature dimension. An initial classification layer receives the feature maps as an output of a final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α=2, . . . , N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce respective feature dimensions; and outputting by a last layer of the classification layers an output corresponding to a N/(αS)×N/(αS)×1 output map.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.