Automated selection and ordering of data quality rules during data ingestion
US12353375B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 9, 2023 |
| Grant date | Jul 8, 2025 |
| Priority date | — |
| Expiry date | Nov 9, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/215
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Selecting and ordering the execution of data quality rules includes generating a snapshot of a table-formatted dataset. The snapshot comprises a reduced number of rows of the dataset such that each column variation of the dataset is included in the snapshot. A predetermined collection of data quality (DQ) rules is executed on the snapshot. One or more performance statistics is determined for each of the DQ rules. The performance statistics indicate a likelihood that a DQ rule determines a data quality deficiency. Based on the performance statistics, a subset of the DQ rules is generated. Each DQ rule of the subset is selected based on the likelihood that the DQ rule selected detects a quality deficiency. An ordered subset of selected DQ rules is generated by ordering the application of each of the subset of DQ rules selected. The ordering specifies a sequence for executing each selected DQ rule.
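The pipeline in the abstract (snapshot, run rules, score, select, order) can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices are assumptions made for illustration only: a "column variation" is taken to mean a distinct combination of per-column value types, the performance statistic is the fraction of snapshot rows a rule fails, and selected rules are ordered by descending failure rate. The names `column_signature`, `make_snapshot`, `failure_rates`, and `select_and_order` are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]  # returns True when the row passes the check


def column_signature(row: Row) -> tuple:
    # Assumption: a "column variation" is the per-column value type pattern.
    return tuple((col, type(row[col]).__name__) for col in sorted(row))


def make_snapshot(rows: List[Row]) -> List[Row]:
    # Keep the first row seen for each distinct signature, so the snapshot
    # has fewer rows but still covers every observed variation.
    seen: Dict[tuple, Row] = {}
    for row in rows:
        seen.setdefault(column_signature(row), row)
    return list(seen.values())


def failure_rates(snapshot: List[Row], rules: Dict[str, Rule]) -> Dict[str, float]:
    # Assumed statistic: share of snapshot rows on which each rule fails.
    if not snapshot:
        return {name: 0.0 for name in rules}
    return {
        name: sum(not rule(row) for row in snapshot) / len(snapshot)
        for name, rule in rules.items()
    }


def select_and_order(stats: Dict[str, float], threshold: float) -> List[str]:
    # Select rules likely to detect a deficiency, then order them so the
    # most likely detectors run first (ties broken by rule name).
    selected = [(name, rate) for name, rate in stats.items() if rate >= threshold]
    selected.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in selected]


rows = [
    {"id": 1, "email": "a@x.com", "age": 30},
    {"id": 2, "email": None, "age": 41},
    {"id": 3, "email": "c@x.com", "age": "n/a"},
    {"id": 4, "email": None, "age": None},
    {"id": 5, "email": "e@x.com", "age": 19},  # same variation as row 1
]
rules = {
    "id_positive": lambda r: isinstance(r["id"], int) and r["id"] > 0,
    "email_not_null": lambda r: r["email"] is not None,
    "age_present": lambda r: r["age"] is not None,
}

snapshot = make_snapshot(rows)              # 4 rows cover all variations
stats = failure_rates(snapshot, rules)
print(select_and_order(stats, threshold=0.1))
# ['email_not_null', 'age_present']
```

In this sketch the rule `id_positive` never fails on the snapshot, so it is dropped from the subset; the remaining rules would then be applied to the full dataset in the printed order.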
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.