Canonicalization of unicode prompt injections
US12278836B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Nov 12, 2024 |
| Grant date | Apr 15, 2025 |
| Priority date | — |
| Expiry date | Nov 12, 2044 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L63/1466
- WIPO field: Digital communication
- WIPO sector: Electrical engineering
Abstract
A prompt for a generative artificial intelligence (GenAI) model is received which includes Unicode. Unicode fonts in the prompt are identified and translated into a plaintext representation. Further, Unicode characters in the prompt that each have an associated Unicode tag are identified. It is determined, based on the associated Unicode tags, whether at least a portion of the Unicode characters are valid. When at least a portion of the Unicode characters are determined to be valid, those characters are converted into a plaintext representation, and the prompt with the translated fonts and the converted characters is passed into the GenAI model. When at least a portion of the Unicode characters are not determined to be valid, the Unicode characters are removed from the prompt, and the prompt with the translated Unicode fonts, after the removal, is input into the GenAI model.
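The flow described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the patented implementation: the function name `canonicalize_prompt` is hypothetical, NFKC normalization stands in for the "font translation" step, and the validity rule (a run of tag-block characters counts as valid only if every code point lies in U+E0020 to U+E007E, the range that mirrors printable ASCII) is an assumption, since the abstract does not say how validity is decided.

```python
import re
import unicodedata

# Runs of code points from the Unicode Tags block (U+E0000..U+E007F).
TAG_RUN = re.compile("[\U000E0000-\U000E007F]+")

TAG_OFFSET = 0xE0000                      # tag code point minus this gives ASCII
VALID_FIRST, VALID_LAST = 0xE0020, 0xE007E  # ASSUMED validity range


def _convert_or_remove(match: re.Match) -> str:
    """Convert a valid tag run to ASCII; drop an invalid run entirely."""
    run = match.group(0)
    # ASSUMPTION: "valid" means every character mirrors a printable ASCII char.
    if all(VALID_FIRST <= ord(ch) <= VALID_LAST for ch in run):
        return "".join(chr(ord(ch) - TAG_OFFSET) for ch in run)
    return ""


def canonicalize_prompt(prompt: str) -> str:
    """Hypothetical helper: return a plaintext form of the prompt."""
    # Step 1: fold styled "font" characters (mathematical bold, fullwidth,
    # and similar compatibility forms) to their plain equivalents.
    plain = unicodedata.normalize("NFKC", prompt)
    # Step 2: convert valid tag-character runs, remove invalid ones.
    return TAG_RUN.sub(_convert_or_remove, plain)


if __name__ == "__main__":
    hidden = "".join(chr(TAG_OFFSET + ord(c)) for c in "hi")
    styled = "\U0001D407\U0001D41E\U0001D432"  # mathematical bold "Hey"
    print(canonicalize_prompt(styled + hidden))
    print(canonicalize_prompt("ok" + "\U000E0001\U000E0041"))
```

In this sketch the hidden tag text is made visible rather than silently dropped when it passes the assumed validity check, so a downstream filter sees the same plaintext the model would receive. A run containing any other tag-block code point, such as U+E0001, is removed as a whole.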
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.