Defending machine learning systems from adversarial attacks
US11893111B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 26, 2019 |
| Grant date | Feb 6, 2024 |
| Priority date | — |
| Expiry date | Mar 26, 2041 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F2221/034
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Techniques are disclosed for detecting adversarial attacks. A machine learning (ML) system processes the input into and output of a ML model using an adversarial detection module that does not include a direct external interface. The adversarial detection module includes a detection model that generates a score indicative of whether the input is adversarial using, e.g., a neural fingerprinting technique or a comparison of features extracted by a surrogate ML model to an expected feature distribution for the output of the ML model. In turn, the adversarial score is compared to a predefined threshold for raising an adversarial flag. Appropriate remedial measures, such as notifying a user, may be taken when the adversarial score satisfies the threshold and raises the adversarial flag.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.