Training sparse networks with discrete weight values
US11537870B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 14, 2018 |
| Grant date | Dec 27, 2022 |
| Priority date | — |
| Expiry date | Feb 10, 2041 |
Classification
- Technology area (CPC H)Electricity
- CPC primaryH04L1/24
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Some embodiments provide a method for training a machine-trained (MT) network. The method propagates multiple inputs through the MT network to generate an output for each of the inputs. each of the inputs is associated with an expected output, the MT network uses multiple network parameters to process the inputs, and each network parameter of a set of the network parameters is defined during training as a probability distribution across a discrete set of possible values for the network parameter. The method calculates a value of a loss function for the MT network that includes (i) a first term that measures network error based on the expected outputs compared to the generated outputs and (ii) a second term that penalizes divergence of the probability distribution for each network parameter in the set of network parameters from a predefined probability distribution for the network parameter.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.