Training network to minimize worst case surprise
US11586902B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 14, 2018 |
| Grant date | Feb 21, 2023 |
| Priority date | — |
| Expiry date | Jun 21, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/0464
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Some embodiments provide a method for training a machine-trained (MT) network that processes input data using network parameters. The method maps input instances to output values by propagating the instances through the network. The input instances include instances for each of multiple categories. For a particular instance selected as an anchor instance, the method identifies each instance in a different category as a negative instance. The method calculates, for each negative instance of the anchor, a surprise function that probabilistically measures the surprise of finding an output value, for an instance in the same category as the anchor, that is a greater distance from the output value for the anchor instance than the output value for the negative instance is. The method calculates a loss function that emphasizes the maximum surprise calculated for the anchor. The method trains the network parameters using the calculated loss function value to minimize the maximum surprise.
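The abstract does not give the functional form of the surprise function or of the loss, so the following is only an illustrative sketch under stated assumptions, not the patented method. It assumes Euclidean distance between output values, takes the "surprise" for a negative to be a sigmoid-smoothed fraction of same-category outputs that lie farther from the anchor than that negative does, and uses a log-sum-exp smooth maximum as one way to "emphasize" the largest surprise. The names `surprise`, `worst_case_loss`, `temperature`, and `sharpness` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two output vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def surprise(anchor, positives, negative, temperature=0.1):
    """Assumed surprise score for one negative instance.

    Smoothed fraction of same-category outputs (positives) that are
    farther from the anchor output than the negative output is.
    """
    d_neg = euclidean(anchor, negative)
    scores = [
        sigmoid((euclidean(anchor, p) - d_neg) / temperature)
        for p in positives
    ]
    return sum(scores) / len(scores)


def worst_case_loss(anchor, positives, negatives, sharpness=10.0):
    """Smooth maximum (log-sum-exp) of the per-negative surprise scores.

    Larger `sharpness` makes the loss track the single worst negative
    more closely; this weighting is an assumption of the sketch.
    """
    s = [surprise(anchor, positives, n) for n in negatives]
    m = max(s)
    total = sum(math.exp(sharpness * (x - m)) for x in s)
    return m + math.log(total) / sharpness


# Hypothetical 2-D network outputs for one anchor.
anchor = (0.0, 0.0)
positives = [(0.1, 0.0), (0.0, 0.2)]      # same category as the anchor
negatives = [(0.15, 0.0), (3.0, 0.0)]     # different categories
loss = worst_case_loss(anchor, positives, negatives)
```

In this sketch the negative closest to the anchor dominates the loss, so a gradient step on the network parameters would mainly push that worst-case negative away from the anchor's category. An automatic-differentiation framework would be needed to actually train with it.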
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.