Cognitive data pseudonymization
US11574186B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 31, 2019 |
| Grant date | Feb 7, 2023 |
| Priority date | — |
| Expiry date | Aug 24, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/02
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Computer systems, methods, and program products automate pseudonymization of personal identifying information (PII), using machine learning, metadata, and crowdsourced patterns to identify and replace PII. Machine learning models are trained, using metadata, to classify known column names or key names for processing. Each column or key name is classified as unprocessed, anonymized, or pseudonymized, so that a pseudonymizer can handle the data without revealing PII or scrubbing it into a useless format. A library of crowdsourced patterns is used to match PII against the data values under each column or key name, and PII is mapped to replacement methods. Feedback from user annotations retrains the algorithms to improve classification accuracy. Deep learning algorithms automate the identification of PII by generating regular expressions that concisely articulate how pseudonymizers search for PII patterns within a data set. PII replacement is mapped consistently across entire data packages, and the crowdsourced pattern library is updated with the generated regular expressions.
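Illustrative sketch (not taken from the patent): the pattern-matching and consistent-replacement steps described in the abstract might look roughly like the Python below. Everything named here is an assumption made for illustration: the two-entry `PATTERN_LIBRARY`, the 80% match threshold, the function names, and the use of a keyed HMAC-SHA256 digest as the replacement method. The abstract does not specify any of these, and the machine learning classification and regular expression generation steps are not modeled.

```python
import hashlib
import hmac
import re

# Hypothetical stand-in for the crowdsourced pattern library.
# The pattern names and regular expressions are illustrative only.
PATTERN_LIBRARY = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def detect_pii_type(values, threshold=0.8):
    """Return the first pattern name that fully matches at least
    `threshold` of the non-empty values, or None if no pattern does."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for name, pattern in PATTERN_LIBRARY.items():
        hits = sum(1 for v in non_empty if pattern.fullmatch(v))
        if hits / len(non_empty) >= threshold:
            return name
    return None


def pseudonym(value, pii_type, secret):
    """Derive a deterministic pseudonym: the same input value always yields
    the same token. HMAC-SHA256 is an assumed replacement method."""
    message = f"{pii_type}:{value}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{pii_type}_{digest[:12]}"


def pseudonymize_package(tables, secret):
    """Pseudonymize every column whose values match a library pattern.

    `tables` maps a table name to a list of row dicts with string values.
    Columns that match no pattern are left unprocessed.
    """
    result = {}
    for table_name, rows in tables.items():
        columns = list(rows[0].keys()) if rows else []
        column_types = {
            c: detect_pii_type([r.get(c, "") for r in rows]) for c in columns
        }
        result[table_name] = [
            {
                c: pseudonym(v, column_types[c], secret)
                if column_types.get(c) and v
                else v
                for c, v in row.items()
            }
            for row in rows
        ]
    return result


# Usage: the same email appears in two tables of one data package.
package = {
    "users": [
        {"name_id": "1", "contact": "ann@example.com"},
        {"name_id": "2", "contact": "bob@example.com"},
    ],
    "orders": [{"order": "A17", "buyer": "ann@example.com"}],
}
out = pseudonymize_package(package, secret=b"demo-key")
```

Because the token depends only on the secret, the detected PII type, and the original value, `ann@example.com` receives the same replacement in both tables, which is one simple way to keep replacements consistent across a data package. Columns that match no pattern, such as `name_id` and `order`, pass through unchanged.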
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.