Patent · US Active

Automated sensitive data classification in computerized databases

US11941135B2 · kind B2 · utility

Cited by: 0
References: 5
Claims: 12
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 23, 2019
Grant date: Mar 26, 2024
Priority date
Expiry date: Sep 28, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F21/6245
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Automated classification of sensitive data in a database, which includes: retrieving a catalog of a database; sampling record values from at least some of the columns; generating a map of probable associations between different columns of tables of the database; applying a machine learning classifier to the sampled record values, to classify the columns of the sampled records into multiple data classes, some being sensitive data classes; classifying columns of non-sampled record values according to the classification of the sampled record values, based on the map; searching all objects of the database for existence of record values of the classified columns, to output value and field name pairs; scoring the pairs according to a measure of their repetitiveness in the output; increasing the score of the pairs whose field names are similar; and, based on the scores, indicating which fields of the database are likely to include sensitive data.
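
The last three steps of the abstract (scoring pairs by repetitiveness, raising the score of pairs with similar field names, and indicating likely sensitive fields) can be illustrated with a minimal Python sketch. The abstract does not specify the repetitiveness measure, the name-similarity test, or the size of the increase, so everything here is an assumption made for illustration: a plain occurrence count stands in for repetitiveness, `difflib.SequenceMatcher` for name similarity, and the function names, the 0.6 similarity cutoff, the 1.5 boost, the 1.5 threshold, and the sample data are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def score_pairs(pairs, name_similarity=0.6, boost=1.5):
    """Score (value, field_name) pairs by how often they repeat in the output.

    Assumption: repetitiveness is a simple occurrence count. Pairs whose field
    name resembles another distinct field name get their score multiplied by
    `boost`.
    """
    counts = Counter(pairs)
    names = {name for _, name in counts}

    def has_similar_name(name):
        # True if any other field name is at least `name_similarity` similar.
        return any(
            other != name
            and SequenceMatcher(None, name.lower(), other.lower()).ratio()
            >= name_similarity
            for other in names
        )

    boosted = {name for name in names if has_similar_name(name)}
    return {
        pair: count * (boost if pair[1] in boosted else 1.0)
        for pair, count in counts.items()
    }


def likely_sensitive_fields(scores, threshold=1.5):
    """Return field names having at least one pair scored at or above `threshold`."""
    return sorted({name for (_, name), score in scores.items() if score >= threshold})


# Hypothetical search output: (record value, field name) pairs.
pairs = [
    ("123-45-6789", "ssn"),
    ("123-45-6789", "ssn"),
    ("123-45-6789", "ssn_no"),
    ("blue", "color"),
]
scores = score_pairs(pairs)
fields = likely_sensitive_fields(scores)
print(fields)  # ['ssn', 'ssn_no']
```

In this sample, "ssn" and "ssn_no" are similar enough to boost each other, while "color" matches neither and keeps its raw count, so only the first two are flagged. A real system would presumably calibrate the similarity test and threshold against labeled data rather than use fixed constants.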

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.