Detection of privacy attacks on machine learning models
US12328331B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 4, 2025 |
| Grant date | Jun 10, 2025 |
| Priority date | — |
| Expiry date | Feb 4, 2045 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L63/1441
- WIPO field: Digital communication
- WIPO sector: Electrical engineering
Abstract
A plurality of queries are input into an artificial intelligence (AI) model. The AI model is made up of a plurality of layers including an input layer, an output layer, and at least one intermediate layer between the input layer and the output layer. Each intermediate layer, during inference, can output a plurality of activations. Thereafter, for each query, activations are intercepted from at least one of the intermediate layers. It is then determined whether a distribution of the intercepted activations across the queries indicates that the queries seek to cause the AI model to behave in an undesired manner by conducting a distance-based similarity analysis between the intercepted activations and reference activations. Data characterizing such determination is then provided to a consuming application or process. Related apparatus, systems, techniques and articles are also described.
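The detection flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The two-layer ReLU network, the function names, the nearest-neighbour Euclidean distance, and the fixed threshold are all assumptions made for illustration. The sketch also assumes the reference activations come from benign traffic, so a large distance is treated as suspicious; the abstract does not say which kind of reference is used.

```python
import numpy as np


def forward_with_intercept(x, weights):
    """Run a tiny two-layer ReLU network (hypothetical stand-in for the AI model).

    Returns the model output and the intermediate-layer activations.
    """
    hidden = np.maximum(0.0, x @ weights[0])  # intercepted intermediate activations
    output = hidden @ weights[1]
    return output, hidden


def mean_nearest_distance(activations, reference):
    """Mean Euclidean distance from each activation vector to its nearest reference vector."""
    diffs = activations[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # shape: (n_queries, n_reference)
    return float(dists.min(axis=1).mean())


def queries_look_suspicious(activations, reference, threshold):
    """Flag a batch of queries whose activations sit far from the reference set.

    Assumption: the reference set is benign, so "far" means suspicious.
    """
    return mean_nearest_distance(activations, reference) > threshold


# Illustrative usage with synthetic data (all values are made up).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]

benign_queries = rng.normal(size=(200, 4))
_, reference_activations = forward_with_intercept(benign_queries, weights)

incoming_queries = rng.normal(loc=6.0, size=(20, 4))  # shifted batch of queries
_, incoming_activations = forward_with_intercept(incoming_queries, weights)

score = mean_nearest_distance(incoming_activations, reference_activations)
verdict = queries_look_suspicious(incoming_activations, reference_activations, threshold=1.0)

# Data characterizing the determination, handed to a consuming process.
print({"score": score, "suspicious": verdict})
```

In practice the threshold would need to be calibrated on held-out benign traffic, and other distance measures (cosine, Mahalanobis) could replace the Euclidean distance used here.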
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.