Data pipeline validation
US12314157B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 10, 2023 |
| Grant date | May 27, 2025 |
| Priority date | — |
| Expiry date | Nov 23, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F11/3684
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A data pipeline validation system and method configured to partially automate testing of data pipelines in a distributed computing environment. The system includes a data pipeline analytic device equipped with various modules, such as a query generation module, data frame comparison module, and metadata management module. The query generation module employs natural language processing techniques to analyze configuration entries and dynamically generate SQL queries tailored to specific test cases. The data frame comparison module compares the results of different test cases using distributed collections, enabling parallel processing and efficient result comparison. The metadata management module captures and stores relevant metadata for traceability and auditing purposes. The system facilitates comprehensive validation of data pipelines, enabling organizations to ensure the accuracy, reliability, and integrity of data.
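The two core steps named in the abstract, generating a query from a configuration entry and comparing the results of two test cases, can be illustrated with a minimal Python sketch. It is not the patented implementation. The configuration keys (`table`, `filter`) and the function names `build_count_query` and `compare_rows` are hypothetical. A fixed string template stands in for the natural language processing step, and in-memory multisets stand in for the distributed collections the abstract describes.

```python
from collections import Counter


def build_count_query(entry):
    """Build a row-count SQL query from a hypothetical configuration entry.

    Assumes the entry comes from a trusted configuration file: the table
    name and filter are interpolated directly, with no escaping.
    """
    sql = f"SELECT COUNT(*) FROM {entry['table']}"
    if entry.get("filter"):
        sql += f" WHERE {entry['filter']}"
    return sql


def compare_rows(source_rows, target_rows):
    """Compare two result sets as multisets, so duplicate rows are counted."""
    src = Counter(map(tuple, source_rows))
    tgt = Counter(map(tuple, target_rows))
    return {
        # Rows (with multiplicity) present in the source but not the target.
        "missing_in_target": sorted((src - tgt).elements()),
        # Rows (with multiplicity) present in the target but not the source.
        "unexpected_in_target": sorted((tgt - src).elements()),
    }


query = build_count_query({"table": "orders", "filter": "status = 'shipped'"})
diff = compare_rows(
    [(1, "a"), (2, "b"), (2, "b")],
    [(1, "a"), (2, "b"), (3, "c")],
)
print(query)
print(diff)
```

Comparing as multisets rather than sets means a duplicated or dropped row is reported even when the same row value still appears elsewhere in the target. In a distributed setting, a similar multiset difference is available through operations such as PySpark's `DataFrame.exceptAll`, though the record does not state which engine the patent uses.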
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.