Personally identifiable information scrubber with language models
US12387007B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Dec 16, 2024 |
| Grant date | Aug 12, 2025 |
| Priority date | — |
| Expiry date | Dec 16, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/284
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Sanitizing data can be a cumbersome task, particularly when the volume of data is large, the content is sensitive, and/or the type of sanitization requires contextual determinations. Sanitizing large amounts of data is tedious and often requires highly trained personnel with clearances and/or other qualifications. In the systems and methods of the present disclosure, language models (LMs) are used to address these and other technical issues with tools that may allow data to be sanitized easily, with high versatility, context awareness, and/or low demand for computational resources. In particular, some of the disclosed systems and methods use a first language model and a second language model, the second being less resource-intensive than the first, to generate sanitized output data with improved efficiency and accuracy. This dual-model approach ensures that sensitive information is handled appropriately while optimizing computer resource usage.
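The abstract does not say how work is divided between the two models, so the following is only an illustrative sketch of one possible arrangement, not the patented method. It assumes the less resource-intensive model screens each record and the more capable model is invoked only on records that were flagged. Both "models" here are hypothetical regex stand-ins (`small_model_flags`, `large_model_scrub`) so the example runs without any model backend; the function names, placeholder tokens, and routing rule are all assumptions.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Hypothetical patterns used by the stand-in "models" below.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def small_model_flags(text: str) -> bool:
    """Stand-in for the less resource-intensive model (assumption):
    a cheap yes/no screen for whether a record may contain PII."""
    return bool(EMAIL.search(text) or PHONE.search(text))


def large_model_scrub(text: str) -> str:
    """Stand-in for the more capable model (assumption):
    rewrites the record with PII replaced by placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def sanitize(
    records: Iterable[str],
    screen: Callable[[str], bool] = small_model_flags,
    scrub: Callable[[str], str] = large_model_scrub,
) -> Tuple[List[str], int]:
    """Route each record through the cheap screen first and call the
    expensive scrubber only when the screen flags the record.
    Returns the sanitized records and the number of expensive calls."""
    sanitized: List[str] = []
    expensive_calls = 0
    for record in records:
        if screen(record):
            sanitized.append(scrub(record))
            expensive_calls += 1
        else:
            sanitized.append(record)  # passed through untouched
    return sanitized, expensive_calls


if __name__ == "__main__":
    sample = [
        "Contact jane@example.com today",
        "The meeting is at noon",
        "Call 555-123-4567",
    ]
    cleaned, calls = sanitize(sample)
    for line in cleaned:
        print(line)
    print(f"expensive calls: {calls} of {len(sample)}")
```

In a real system the two callables would wrap actual model inference; passing them in as parameters keeps the routing logic independent of any particular model API. Other divisions of labor (for example, the larger model generating rules that the smaller model applies) would fit the abstract equally well.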
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.