Patent · US Active

Neural network acceleration and embedding compression systems and methods with activation sparsification

US10832139B2 · kind B2 · utility

Cited by: 17
References: 1
Claims: 29
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 20, 2019
Grant date: Nov 10, 2020
Priority date
Expiry date: Jun 20, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T2207/20132
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems, methods, and computer-readable media for (i) accelerating the inference speed of a deep neural network (DNN), and (ii) compressing the vector representations produced by the DNN from a variety of input data, such as images, audio, video, and text. A method embodiment takes as inputs a neural network architecture and a task-dependent loss function, which measures how well a neural network performs on a training data set, and outputs a deep neural network with sparse neuron activations. The invented procedure augments an existing training objective function of a DNN with regularization terms that encourage sparse activation of neurons, and compresses the DNN by solving the resulting optimization problem with a variety of algorithms. The present disclosure also shows how to utilize the sparsity of activations during DNN inference so that the number of arithmetic operations is reduced proportionately, and how to use the sparse representations produced by the DNNs to build an efficient search engine.
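
The three ideas in the abstract (a sparsity-regularized objective, skipping zero activations at inference, and searching over sparse representations) can be illustrated with a minimal sketch. It is not the patented procedure. Every name here (relu_layer, l1_activation_penalty, augmented_loss, sparse_matvec, build_inverted_index, search, lam) is hypothetical. The L1 penalty and the inverted index are assumed stand-ins for the unspecified "regularization terms" and "efficient search engine".

```python
# Illustrative sketch only; all names and the choice of an L1 penalty and an
# inverted index are assumptions, not details taken from the patent.

def relu_layer(weights, x):
    """Dense layer (weights is a list of rows) followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def l1_activation_penalty(activations):
    """L1 norm of the activations, one common sparsity-inducing regularizer."""
    return sum(abs(a) for a in activations)

def augmented_loss(task_loss, activations, lam):
    """Existing task loss plus lam times the activation sparsity penalty."""
    return task_loss + lam * l1_activation_penalty(activations)

def sparse_matvec(weights, x):
    """Compute weights @ x while skipping zero entries of x.

    Returns (result, number_of_multiplications) so the saving is visible.
    """
    nonzero = [(j, v) for j, v in enumerate(x) if v != 0.0]
    out, mults = [], 0
    for row in weights:
        acc = 0.0
        for j, v in nonzero:
            acc += row[j] * v
            mults += 1
        out.append(acc)
    return out, mults

def build_inverted_index(embeddings):
    """Map each dimension to the (doc_id, value) pairs that are nonzero there."""
    index = {}
    for doc_id, vec in embeddings.items():
        for j, v in enumerate(vec):
            if v != 0.0:
                index.setdefault(j, []).append((doc_id, v))
    return index

def search(index, query):
    """Rank documents by dot product, touching only nonzero query dimensions."""
    scores = {}
    for j, q in enumerate(query):
        if q != 0.0:
            for doc_id, v in index.get(j, []):
                scores[doc_id] = scores.get(doc_id, 0.0) + q * v
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Usage: a hidden layer with two of four neurons inactive.
W1 = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5], [-2.0, -2.0]]
W2 = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
h = relu_layer(W1, [2.0, 1.0])             # [1.0, 0.0, 1.5, 0.0]
loss = augmented_loss(1.0, h, lam=0.5)     # 1.0 + 0.5 * 2.5 = 2.25
y, mults = sparse_matvec(W2, h)            # 4 multiplications instead of 8
docs = {"a": h, "b": [0.0, 2.0, 0.0, 0.0], "c": [0.0, 0.0, 1.0, 3.0]}
ranked = search(build_inverted_index(docs), [1.0, 0.0, 1.0, 0.0])
```

In this toy case half of the hidden activations are zero, so the second layer needs half the multiplications, which matches the proportional reduction the abstract describes. The search step only visits documents that share a nonzero dimension with the query.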

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.