Training language models and preserving privacy
US12412038B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 23, 2023 |
| Grant date | Sep 9, 2025 |
| Priority date | — |
| Expiry date | Nov 9, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/274
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
In implementations of systems for training language models and preserving privacy, a computing device implements a privacy system to predict a next word after a last word in a sequence of words by processing input data using a machine learning model trained on training data to predict next words after last words in sequences of words. The training data describes a corpus of text associated with clients and including sensitive samples and non-sensitive samples. The machine learning model is trained by sampling a client of the clients and using a subset of the sensitive samples associated with the client and a subset of the non-sensitive samples associated with the client to update parameters of the machine learning model. The privacy system generates an indication of the next word after the last word in the sequence of words for display in a user interface.
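The training loop the abstract describes (sample a client, draw a subset of that client's sensitive samples and a subset of its non-sensitive samples, update the model parameters) can be illustrated with a minimal sketch. This is not the patented implementation: the bigram-count model, the function names `train` and `predict_next`, the `clients` dictionary layout, and the subset sizes are all assumptions chosen for illustration. The sketch also applies no privacy mechanism such as gradient clipping or added noise, because the abstract does not specify one.

```python
import random
from collections import defaultdict


def train(clients, rounds=20, k_sensitive=1, k_non_sensitive=1, seed=0):
    """Toy client-level training loop (hypothetical, for illustration only).

    clients maps a client id to {"sensitive": [...], "non_sensitive": [...]},
    where each list holds sentences from that client's text.
    The "parameters" here are plain bigram counts, an assumed stand-in for
    the weights of a real language model.
    """
    rng = random.Random(seed)
    params = defaultdict(lambda: defaultdict(float))
    client_ids = sorted(clients)

    for _ in range(rounds):
        # Step 1: sample one client.
        data = clients[rng.choice(client_ids)]

        # Step 2: draw a subset of that client's sensitive samples and a
        # subset of its non-sensitive samples.
        sensitive = rng.sample(
            data["sensitive"], min(k_sensitive, len(data["sensitive"])))
        non_sensitive = rng.sample(
            data["non_sensitive"],
            min(k_non_sensitive, len(data["non_sensitive"])))

        # Step 3: update the parameters using both subsets.
        for sentence in sensitive + non_sensitive:
            words = sentence.split()
            for prev, nxt in zip(words, words[1:]):
                params[prev][nxt] += 1.0

    return params


def predict_next(params, sequence):
    """Return the most likely word after the last word of `sequence`."""
    last_word = sequence.split()[-1]
    followers = params.get(last_word)
    if not followers:
        return None  # the last word was never seen during training
    # Sorting first makes ties resolve alphabetically and deterministically.
    return max(sorted(followers), key=followers.get)


clients = {
    "client_a": {
        "sensitive": ["my pin sat down"],
        "non_sensitive": ["the cat sat down"],
    },
    "client_b": {
        "sensitive": ["she sat down"],
        "non_sensitive": ["we sat down"],
    },
}
model = train(clients)
print(predict_next(model, "the dog sat"))
```

In a real system the update in step 3 would be a gradient step on a neural language model, and the privacy guarantee would come from whatever mechanism the patent claims describe; the sketch only shows how sampling is organized per client rather than per individual sample.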
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.