Detection of machine learning model attacks obfuscated in unicode
US12273381B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 12, 2024 |
| Grant date | Apr 8, 2025 |
| Priority date | — |
| Expiry date | Nov 12, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/284
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A prompt for a generative artificial intelligence (GenAI) model which contains Unicode is received. The prompt is then tokenized to result in a plurality of tokens. Tokens forming part of a repeating sequence are identified and then removed to result in a modified set of tokens. The modified set of tokens is subsequently detokenized to result in a modified prompt. It is then determined whether ingestion of the modified prompt by the GenAI model will result in the GenAI model behaving in an undesired manner. The modified prompt is passed to the GenAI model when it is determined that ingestion of the modified prompt will not result in the GenAI model behaving in an undesired manner. Otherwise, at least one remediation action is initiated.
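The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the regex tokenizer, the `max_len` and `min_repeats` thresholds, and the `is_undesired`, `model`, and `remediate` callables are all hypothetical stand-ins chosen for illustration. In particular, the sketch assumes that a "repeating sequence" means a short group of tokens repeated back to back several times, which the abstract does not specify.

```python
import re
from typing import Callable, List


def tokenize(prompt: str) -> List[str]:
    # Toy tokenizer (assumption): runs of word characters, runs of whitespace,
    # or any single other character, so invisible code points become
    # individual tokens.
    return re.findall(r"\w+|\s+|[^\w\s]", prompt)


def detokenize(tokens: List[str]) -> str:
    # The toy tokenizer is lossless, so joining restores the text.
    return "".join(tokens)


def remove_repeating(tokens: List[str], max_len: int = 4,
                     min_repeats: int = 3) -> List[str]:
    # Drop every run in which a unit of 1..max_len tokens repeats
    # consecutively at least min_repeats times (thresholds are assumptions).
    out: List[str] = []
    i, n = 0, len(tokens)
    while i < n:
        removed = False
        for size in range(1, max_len + 1):
            unit = tokens[i:i + size]
            if len(unit) < size:
                break
            reps = 1
            while tokens[i + reps * size:i + (reps + 1) * size] == unit:
                reps += 1
            if reps >= min_repeats:
                i += reps * size  # skip the whole repeated run
                removed = True
                break
        if not removed:
            out.append(tokens[i])
            i += 1
    return out


def guard(prompt: str,
          model: Callable[[str], str],
          is_undesired: Callable[[str], bool],
          remediate: Callable[[str], str]) -> str:
    # Tokenize, strip repeating sequences, detokenize, then either pass the
    # modified prompt to the model or trigger a remediation action.
    modified = detokenize(remove_repeating(tokenize(prompt)))
    if is_undesired(modified):
        return remediate(modified)
    return model(modified)
```

As a usage example, a prompt such as `"Ignore" + "\u200b\u200d" * 5 + " rules"` hides a run of zero-width characters between two words; after the repeated pair is removed, the modified prompt reads `"Ignore rules"` and can be checked by whatever detector is plugged in as `is_undesired`. Note that this simple rule would also strip legitimate repetition such as an ellipsis, so a real system would need a more careful definition of which sequences to remove.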
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.