Patent · US Active

Receptive-field-conforming convolution models for video coding

US11025907B2 · kind B2 · utility

Cited by	5

References	6

Claims	20

Family size	0

Assignee

Google LLC · US

Inventors

Shan Li · Edison, US
Claudionor Coelho · Felixlândia, BR
Aki Kuusela · Palo Alto, US
Dake He · Waterloo, CA

Key dates

Filing date	Feb 28, 2019
Grant date	Jun 1, 2021
Priority date	—
Expiry date	Feb 28, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/082
WIPO field	Audio-visual technology
WIPO sector	Electrical engineering

Abstract

Convolutional neural networks (CNNs) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting a stride value equal to a kernel size. The block has an N×N size, and a smallest partition output for the block has an S×S size. Classification layers of each classifier receive feature maps having a feature dimension. An initial classification layer receives the feature maps as an output of a final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α = 2, …, N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce respective feature dimensions; and outputting, by a last layer of the classification layers, an output corresponding to an N/(αS)×N/(αS)×1 output map.
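
The shape arithmetic in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented network: the function names, the channel counts (16, halved per layer), the random weights, the single feature extraction layer, and the single α×α non-overlapping step inside each classifier are all assumptions made for illustration. Only the stride-equals-kernel-size convolution, the 1×1 kernels that reduce the feature dimension, and the N/(αS)×N/(αS)×1 output shapes come from the abstract.

```python
import numpy as np

def nonoverlap_conv(x, weights):
    """Convolution with stride equal to kernel size (no overlap between windows).
    x: (H, W, C_in); weights: (k, k, C_in, C_out). Returns (H/k, W/k, C_out)."""
    k, _, c_in, _ = weights.shape
    h, w, _ = x.shape
    assert h % k == 0 and w % k == 0, "input side must be a multiple of k"
    # Cut the input into disjoint k x k patches: (H/k, W/k, k, k, C_in).
    patches = x.reshape(h // k, k, w // k, k, c_in).transpose(0, 2, 1, 3, 4)
    # Each patch is seen by exactly one kernel position.
    return np.tensordot(patches, weights, axes=([2, 3, 4], [0, 1, 2]))

def pointwise_conv(x, w):
    """1x1 kernel: mixes channels only, spatial size unchanged.
    x: (H, W, C_in); w: (C_in, C_out)."""
    return x @ w

def output_map_shapes(n, s):
    """Output map shape N/(alpha*S) x N/(alpha*S) x 1 for alpha = 2, 4, ..., N/S."""
    shapes = {}
    alpha = 2
    while alpha <= n // s:
        side = n // (alpha * s)
        shapes[alpha] = (side, side, 1)
        alpha *= 2
    return shapes

def classifier_output(features, alpha, rng):
    """Hypothetical classifier for (alpha*S) x (alpha*S) sub-blocks.
    Assumption: one alpha x alpha non-overlapping convolution sets the spatial
    size, then 1x1 kernels halve the feature dimension down to 1."""
    c = features.shape[-1]
    x = nonoverlap_conv(features, rng.standard_normal((alpha, alpha, c, c)))
    while c > 1:
        c //= 2
        x = pointwise_conv(x, rng.standard_normal((x.shape[-1], c)))
    return x

# Example sizes (assumed): a 64x64 block with an 8x8 smallest partition.
N, S = 64, 8
rng = np.random.default_rng(0)
block = rng.standard_normal((N, N, 1))
# Assumed single feature extraction layer with kernel size = stride = S.
features = nonoverlap_conv(block, rng.standard_normal((S, S, 1, 16)))
for alpha, expected in output_map_shapes(N, S).items():
    assert classifier_output(features, alpha, rng).shape == expected
```

With these example sizes the feature maps are 8×8×16, and the classifiers for α = 2, 4, 8 produce 4×4×1, 2×2×1, and 1×1×1 output maps, one value per 16×16, 32×32, and 64×64 sub-block respectively. Because stride equals kernel size, each output value depends only on the pixels of its own sub-block.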

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.