Patent · US Active

Synthetic deidentified test data

US11392487B2 · kind B2 · utility

Cited by: 1
References: 2
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 16, 2020
Grant date: Jul 19, 2022
Priority date
Expiry date: Nov 16, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F11/3688
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Embodiments include a method for one or more processors to receive an organic dataset and a domain knowledge base. The one or more processors identify private data entities present within the organic dataset. The one or more processors determine statistical properties of the private data entities identified within the organic dataset. The one or more processors create a plurality of test data templates by removing the private data entities from the organic dataset. The one or more processors select from the domain knowledge base, synthetic data entities that match a data type of the removed private data entities, respectively, and align with the statistical properties of the private data entities, and the one or more processors generate synthetic test data by inserting, respectively, the synthetic data entities of the matching data type for the removed private data entities in the test data templates.
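
The steps named in the abstract (identify private entities, measure their statistical properties, strip them out to form templates, then fill the templates with type-matched synthetic entities) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the regex detectors in PATTERNS, the choice of mean character length as the only "statistical property", the dictionary layout of the domain knowledge base, and all function names.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical detectors: the abstract does not say how private entities are found.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}
PLACEHOLDER = re.compile(r"\{(" + "|".join(PATTERNS) + r")\}")


def identify_entities(records):
    """Return {data_type: [entity, ...]} for every private entity found."""
    found = defaultdict(list)
    for text in records:
        for dtype, pattern in PATTERNS.items():
            found[dtype].extend(pattern.findall(text))
    return dict(found)


def entity_statistics(entities):
    """Toy statistic (an assumption): mean character length per data type."""
    return {d: mean(len(v) for v in vals) for d, vals in entities.items()}


def make_templates(records):
    """Replace each private entity with a typed placeholder such as {EMAIL}."""
    templates = []
    for text in records:
        for dtype, pattern in PATTERNS.items():
            text = pattern.sub("{" + dtype + "}", text)
        templates.append(text)
    return templates


def select_synthetic(knowledge_base, dtype, stats):
    """Rank same-type candidates by closeness to the observed mean length."""
    target = stats[dtype]
    return sorted(knowledge_base[dtype], key=lambda s: abs(len(s) - target))


def generate_test_data(records, knowledge_base):
    """Run the whole pipeline and return the synthetic records."""
    entities = identify_entities(records)
    stats = entity_statistics(entities)
    ranked = {d: select_synthetic(knowledge_base, d, stats) for d in entities}
    used = defaultdict(int)

    def fill(match):
        dtype = match.group(1)
        choices = ranked[dtype]
        value = choices[used[dtype] % len(choices)]  # cycle through candidates
        used[dtype] += 1
        return value

    return [PLACEHOLDER.sub(fill, t) for t in make_templates(records)]


if __name__ == "__main__":
    organic = [
        "Contact alice.smith@corp.example or 555-123-4567.",
        "Bob: bob@mail.example",
    ]
    kb = {
        "EMAIL": ["user1@test.invalid", "qa@test.invalid"],
        "PHONE": ["555-000-0001"],
    }
    for line in generate_test_data(organic, kb):
        print(line)
```

A real system would presumably use a richer entity recognizer and richer statistics (for example value distributions or co-occurrence), but the control flow above mirrors the order of operations the abstract describes.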

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.