Data correctness optimization
US11789914B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 20, 2020 |
| Grant date | Oct 17, 2023 |
| Priority date | — |
| Expiry date | May 20, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and system for providing a ground truth dataset and using it to improve data correctness. Datasets comprising data elements are received from different sources, each data element including an identifier and at least one attribute value associated with it. Data correctness values are determined for the attribute values, each associated with a probability that an attribute value is correct. Based on the determined data correctness values, a data element with a single data correctness value is added to the ground truth dataset for each attribute value of each identifier with which that attribute value is associated, so that the data correctness values in the ground truth dataset define probability distributions of data correctness for the attribute values. Data correctness values for the attribute values of data elements in a new dataset can then be determined by comparing the overlapping data in the ground truth dataset and the new dataset.
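The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how correctness values are computed, so the sketch assumes, purely for illustration, that a value's correctness is the fraction of sources that report it. The function names `build_ground_truth` and `score_new_dataset`, the `(identifier, attribute, value)` tuple layout, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict


def build_ground_truth(datasets):
    """Build a ground truth mapping from several source datasets.

    Each dataset is a list of (identifier, attribute, value) tuples.
    ASSUMPTION: correctness = share of sources reporting the value.
    """
    votes = defaultdict(Counter)
    for dataset in datasets:
        # set() ensures a source counts at most once per reported value.
        for identifier, attribute, value in set(dataset):
            votes[(identifier, attribute)][value] += 1

    ground_truth = {}
    for key, counter in votes.items():
        total = sum(counter.values())
        # One correctness value per attribute value; together they form
        # a probability distribution for this identifier and attribute.
        ground_truth[key] = {v: n / total for v, n in counter.items()}
    return ground_truth


def score_new_dataset(ground_truth, new_dataset):
    """Mean ground-truth correctness over the overlapping entries only.

    Returns None when the new dataset shares no (identifier, attribute)
    pair with the ground truth.
    """
    scores = [
        ground_truth[(identifier, attribute)].get(value, 0.0)
        for identifier, attribute, value in new_dataset
        if (identifier, attribute) in ground_truth
    ]
    return sum(scores) / len(scores) if scores else None


# Hypothetical sample data from three sources.
source_a = [("p1", "city", "Berlin"), ("p2", "city", "Paris")]
source_b = [("p1", "city", "Berlin"), ("p2", "city", "Lyon")]
source_c = [("p1", "city", "Bern"), ("p2", "city", "Paris")]

ground_truth = build_ground_truth([source_a, source_b, source_c])

# "p3" does not overlap with the ground truth, so it is ignored.
new_dataset = [
    ("p1", "city", "Berlin"),
    ("p2", "city", "Lyon"),
    ("p3", "city", "Rome"),
]
new_score = score_new_dataset(ground_truth, new_dataset)
```

With this sample data, "Berlin" for `p1` receives a correctness of 2/3 and "Bern" 1/3, and the new dataset is scored only on its two overlapping entries. A real system would likely weight sources by reliability rather than count them equally; equal weighting is used here only to keep the sketch short.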
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.