Patent · US Active

Tracking missing data using provenance traces and data simulation

US10740209B2 · kind B2 · utility

Cited by: 0
References: 9
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 20, 2018
Grant date: Aug 11, 2020
Priority date
Expiry date: Sep 29, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N20/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and computer program products for tracking missing data using provenance traces and data simulation are provided herein. A computer-implemented method includes generating, for each of multiple stages in a data curation sequence, a machine learning model of the data curation sequence, wherein the model is based on historical input records within the data curation sequence, historical output records within the data curation sequence, and provenance data within the data curation sequence; creating a simulated output record based on a detected anomaly corresponding to the data curation sequence; predicting the content of absent input records that precede the simulated output record in the data curation sequence and provenance data corresponding to the simulated output record; and outputting, to a user, in response to a query pertaining to the detected anomaly, the predicted input records and information relating the predicted input records to the detected anomaly.
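
The flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the record schema, the stage names ("ingest", "report"), the Trace and StageModel classes, and the use of a 1-nearest-neighbour lookup as the per-stage "machine learning model" are all assumptions made for illustration. The sketch fits one toy model per stage from historical input/output traces, takes a simulated output record standing in for a detected anomaly, and walks backwards through the stages to predict the absent input records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """One historical provenance trace for a single stage (hypothetical schema)."""
    stage: str
    input_record: dict
    output_record: dict


class StageModel:
    """Toy stand-in for a per-stage model: 1-nearest-neighbour over past outputs."""

    def __init__(self, traces):
        self.traces = list(traces)

    @staticmethod
    def _distance(a, b):
        # Number of fields whose values differ; a missing field counts as a difference.
        return sum(a.get(k) != b.get(k) for k in set(a) | set(b))

    def predict_input(self, output_record):
        # Return a copy of the input from the trace whose output is closest.
        best = min(self.traces,
                   key=lambda t: self._distance(t.output_record, output_record))
        return dict(best.input_record)


def build_models(history):
    """Group historical traces by stage and fit one toy model per stage."""
    by_stage = {}
    for trace in history:
        by_stage.setdefault(trace.stage, []).append(trace)
    return {stage: StageModel(traces) for stage, traces in by_stage.items()}


def trace_missing(models, stage_order, simulated_output):
    """Walk backwards from a simulated output, predicting absent inputs per stage."""
    predictions = []
    record = simulated_output
    for stage in reversed(stage_order):
        record = models[stage].predict_input(record)
        predictions.append((stage, record))
    return predictions


# Hypothetical two-stage curation sequence: ingest -> report.
history = [
    Trace("ingest", {"source": "feed_a", "region": "EU"},
          {"region": "EU", "currency": "EUR"}),
    Trace("ingest", {"source": "feed_b", "region": "US"},
          {"region": "US", "currency": "USD"}),
    Trace("report", {"region": "EU", "currency": "EUR"}, {"report": "eu_sales"}),
    Trace("report", {"region": "US", "currency": "USD"}, {"report": "us_sales"}),
]
models = build_models(history)

# Detected anomaly (assumed): the "eu_sales" report never appeared, so the
# output record it should have produced is simulated here.
simulated_output = {"report": "eu_sales"}
predictions = trace_missing(models, ["ingest", "report"], simulated_output)

for stage, predicted_input in predictions:
    print(f"stage={stage} predicted missing input={predicted_input}")
```

In this sketch the stage name attached to each prediction plays the role of the provenance information that relates a predicted input record back to the anomaly. A learned model with richer provenance features would replace the nearest-neighbour lookup in practice.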

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.