Patent · US Active

Automatically determining whether an activation cluster contains poisonous data

US11487963B2 · kind B2 · utility

Cited by	3

References	4

Claims	20

Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Nathalie Baracaldo Angel · San Jose, US
Bryant Chen · San Jose, US
Biplav Srivastava · Noida, IN
Heiko H. Ludwig · San Francisco, US

Key dates

Filing date	Sep 16, 2019
Grant date	Nov 1, 2022
Priority date	—
Expiry date	May 2, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/09
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and those activations are segmented by the label of the corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes analyzing, for each cluster, a distance of a median of the activations therein to medians of the activations in the labels.
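
The pipeline described in the abstract (segment activations by label, cluster each segment, compare cluster medians with label medians) can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything specific in it is an assumption made for illustration: the function names (`componentwise_median`, `two_means`, `assess_clusters`), the use of two clusters per label, a small hand-written k-means, Euclidean distance, and the decision rule that flags a cluster when its median lies closer to another label's median than to the median of its own label. The abstract states only that median distances are analyzed, not how they are thresholded. Activations are taken as already extracted from the last hidden layer.

```python
import math
from collections import defaultdict
from statistics import median


def componentwise_median(points):
    """Median of each coordinate across a list of equal-length vectors."""
    return [median(coords) for coords in zip(*points)]


def two_means(points, iterations=20):
    """Tiny 2-cluster k-means (assumed clustering step, deterministic init)."""
    first = points[0]
    second = max(points, key=lambda p: math.dist(p, first))  # farthest point
    centers = [first, second]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        for k in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the previous center if a cluster is empty
                centers[k] = [sum(c) / len(members) for c in zip(*members)]
    return assignment


def assess_clusters(activations, labels):
    """Segment by label, cluster each segment, compare medians.

    Hypothetical rule: a cluster is marked suspicious when its median is
    closer to the median of a different label than to its own label's median.
    """
    by_label = defaultdict(list)
    for vec, lab in zip(activations, labels):
        by_label[lab].append(vec)
    label_medians = {lab: componentwise_median(v) for lab, v in by_label.items()}

    report = []
    for lab, vecs in by_label.items():
        assignment = two_means(vecs)
        for k in (0, 1):
            members = [v for v, a in zip(vecs, assignment) if a == k]
            if not members:
                continue
            cluster_median = componentwise_median(members)
            own = math.dist(cluster_median, label_medians[lab])
            others = {
                other: math.dist(cluster_median, med)
                for other, med in label_medians.items()
                if other != lab
            }
            nearest = min(others, key=others.get) if others else None
            report.append({
                "label": lab,
                "cluster": k,
                "size": len(members),
                "own_distance": own,
                "nearest_other_label": nearest,
                "suspicious": nearest is not None and others[nearest] < own,
            })
    return report
```

Calling `assess_clusters(activations, labels)` with one activation vector per training point returns one record per cluster. In practice the activations would come from a forward pass over the training set, and a library clustering routine would likely replace `two_means`.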

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.