Patent · US Active

Automated selection and ordering of data quality rules during data ingestion

US12353375B2 · kind B2 · utility

Cited by: 0
References: 9
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 9, 2023
Grant date: Jul 8, 2025
Priority date
Expiry date: Nov 9, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/215
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Selecting and ordering the execution of data quality rules includes generating a snapshot of a table-formatted dataset. The snapshot comprises a reduced number of rows of the dataset such that each column variation of the dataset is included in the snapshot. A predetermined collection of data quality (DQ) rules is executed on the snapshot. One or more performance statistics are determined for each of the DQ rules. The performance statistics indicate the likelihood that a DQ rule detects a data quality deficiency. Based on the performance statistics, a subset of the DQ rules is generated. Each DQ rule of the subset is selected based on the likelihood that it detects a quality deficiency. An ordered subset of the selected DQ rules is then generated. The ordering specifies the sequence in which each selected DQ rule is executed.
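
The pipeline described in the abstract (snapshot, rule execution, statistics, selection, ordering) can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the abstract does not specify: a "column variation" is taken to be a coarse value category per column (the hypothetical `shape` helper), the performance statistic is the fraction of snapshot rows that fail a rule, and selected rules are ordered by descending failure rate. All function names, rule names, and sample rows are invented for illustration.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Rule = Callable[[Row], bool]  # returns True when the row passes the rule


def shape(value: Any) -> str:
    """Coarse value category; an assumed stand-in for a 'column variation'."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "empty" if value.strip() == "" else "text"
    return type(value).__name__


def make_snapshot(rows: List[Row]) -> List[Row]:
    """Keep a row only if it contributes a (column, shape) pair not seen yet."""
    seen = set()
    snapshot = []
    for row in rows:
        pairs = {(col, shape(val)) for col, val in row.items()}
        if not pairs <= seen:
            seen |= pairs
            snapshot.append(row)
    return snapshot


def failure_rates(snapshot: List[Row], rules: Dict[str, Rule]) -> Dict[str, float]:
    """Fraction of snapshot rows each rule flags (assumed performance statistic)."""
    n = len(snapshot)
    if n == 0:
        return {name: 0.0 for name in rules}
    return {
        name: sum(1 for row in snapshot if not rule(row)) / n
        for name, rule in rules.items()
    }


def select_and_order(stats: Dict[str, float], threshold: float = 0.0) -> List[str]:
    """Keep rules whose failure rate exceeds the threshold, highest rate first."""
    chosen = [name for name, rate in stats.items() if rate > threshold]
    return sorted(chosen, key=lambda name: (-stats[name], name))


# Hypothetical sample data and rules, for illustration only.
rows = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "b@example.com", "age": 41},
    {"id": 3, "email": None, "age": 29},
    {"id": 4, "email": "", "age": -5},
    {"id": 5, "email": "c@example.com", "age": None},
]
rules = {
    "id_present": lambda r: r["id"] is not None,
    "email_present": lambda r: bool(r["email"]),
    "age_non_negative": lambda r: r["age"] is None or r["age"] >= 0,
}

snapshot = make_snapshot(rows)
stats = failure_rates(snapshot, rules)
ordered_rules = select_and_order(stats)
print(ordered_rules)
```

In this sketch the second row is dropped from the snapshot because it adds no new column variation, the rule that never fails on the snapshot is excluded from the subset, and the remaining rules run in order of how often they flagged a row. A real system would likely weigh other statistics, such as rule execution cost, which the abstract leaves open.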

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.