Neural network acceleration and embedding compression systems and methods with activation sparsification
US10832139B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 20, 2019 |
| Grant date | Nov 10, 2020 |
| Priority date | — |
| Expiry date | Jun 20, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06T2207/20132
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methods and computer-readable media for (i) accelerating the inference speed of a deep neural network (DNN), and (ii) compressing the vector representations that the DNN produces from a variety of input data, such as images, audio, video and text. A method embodiment takes as inputs a neural network architecture and a task-dependent loss function, which measures how well a neural network performs on a training data set, and outputs a deep neural network with sparse neuron activations. The disclosed procedure augments an existing training objective function of a DNN with regularization terms that encourage sparse activation of neurons, and compresses the DNN by solving the resulting optimization problem with a variety of algorithms. The present disclosure also shows how to exploit the sparsity of activations during DNN inference so that the number of arithmetic operations is reduced proportionately, and how to use the sparse representations produced by the DNNs to build an efficient search engine.
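The three ideas in the abstract can be illustrated with a minimal sketch in plain Python. It is not the patented procedure. The abstract does not name the regularizer, so the L1 penalty on activations is an assumption, and every function name here (`regularized_loss`, `sparse_matvec`, `build_inverted_index`, `search`) is hypothetical. The sketch shows an objective augmented with a sparsity penalty, a matrix-vector product whose multiply-add count scales with the number of nonzero activations, and an inverted index over sparse embeddings.

```python
from collections import defaultdict


def regularized_loss(task_loss, activations, lam):
    """Task loss plus an L1 penalty on activations (assumed regularizer)."""
    penalty = sum(abs(a) for layer in activations for a in layer)
    return task_loss + lam * penalty


def sparse_matvec(W, x):
    """Compute y = W x while skipping zero entries of x.

    W is a list of rows. Returns (y, multiply_adds_performed).
    """
    y = [0.0] * len(W)
    ops = 0
    for j, xj in enumerate(x):
        if xj == 0.0:
            continue  # a zero activation contributes nothing to y
        for i, row in enumerate(W):
            y[i] += row[j] * xj
            ops += 1
    return y, ops


def build_inverted_index(embeddings):
    """Map each dimension to the (item_id, value) pairs that are nonzero on it."""
    index = defaultdict(list)
    for item_id, vec in embeddings.items():
        for dim, value in enumerate(vec):
            if value != 0.0:
                index[dim].append((item_id, value))
    return index


def search(index, query):
    """Rank items by dot product, visiting only the query's nonzero dimensions."""
    scores = defaultdict(float)
    for dim, q in enumerate(query):
        if q != 0.0:
            for item_id, value in index.get(dim, []):
                scores[item_id] += q * value
    return sorted(scores.items(), key=lambda kv: -kv[1])


if __name__ == "__main__":
    W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    x = [0.0, 2.0, 0.0]  # one nonzero activation out of three
    print(sparse_matvec(W, x))
    print(regularized_loss(1.0, [[0.0, 2.0], [-1.0]], lam=0.5))
    idx = build_inverted_index({"a": [0.0, 1.0, 0.0], "b": [2.0, 0.0, 0.0]})
    print(search(idx, [0.0, 1.0, 0.0]))
```

In this toy layer, a dense product would perform 6 multiply-adds while the sparse version performs 2, one per row for the single nonzero input. The inverted index has the same property: an item that shares no active dimension with the query is never visited.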
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.