Scalable automatic data repair
US9619494B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 25, 2011 |
| Grant date | Apr 11, 2017 |
| Priority date | — |
| Expiry date | Nov 1, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/215
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computer-implemented method for generating a set of updates for a database comprising multiple records that include erroneous, missing and inconsistent values. The method comprises: using a set of partitioning functions to subdivide the records of the database into multiple subsets of records; allocating respective ones of the records to at least one subset according to predetermined criteria for mapping records to subsets; applying multiple machine learning models to each of the subsets to determine respective candidate replacement values representing a tuple repair for a record, including a probability of candidate and current values for the record; and computing probabilities to select, from among the candidate replacement values, replacement values for the record that maximise the probability for values of the record for an updated database.
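The flow described in the abstract (partition, model each subset, score candidates, select the most probable value) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes a simple relative-frequency estimator as a stand-in for the machine learning models, and it assumes that probabilities from different partitionings are combined by averaging. All names (`partition`, `candidate_probs`, `suggest_updates`) and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def partition(records, key_fn):
    """Group record indices into subsets keyed by a partitioning function."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[key_fn(rec)].append(idx)
    return blocks


def candidate_probs(records, block, attr):
    """Stand-in 'model' (assumption): relative frequency of each
    non-missing value of `attr` among the records in one subset."""
    counts = Counter(
        records[i][attr] for i in block if records[i][attr] is not None
    )
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


def suggest_updates(records, key_fns, attr):
    """Return {record index: replacement} where some candidate value is
    more probable than the record's current value for `attr`."""
    scores = [defaultdict(float) for _ in records]
    for key_fn in key_fns:
        for block in partition(records, key_fn).values():
            probs = candidate_probs(records, block, attr)
            for i in block:
                for value, p in probs.items():
                    # Assumed combination rule: average over partitionings.
                    scores[i][value] += p / len(key_fns)
    updates = {}
    for i, rec in enumerate(records):
        if not scores[i]:
            continue
        best = max(scores[i], key=scores[i].get)
        current_score = scores[i].get(rec[attr], 0.0)
        if best != rec[attr] and scores[i][best] > current_score:
            updates[i] = best
    return updates


# Hypothetical sample data: one misspelled city and one missing city.
records = [
    {"zip": "47906", "state": "IN", "city": "West Lafayette"},
    {"zip": "47906", "state": "IN", "city": "West Lafayette"},
    {"zip": "47906", "state": "IN", "city": "West Lafayete"},
    {"zip": "47906", "state": "IN", "city": None},
    {"zip": "10001", "state": "NY", "city": "New York"},
]
key_fns = [lambda r: r["zip"], lambda r: r["state"]]
updates = suggest_updates(records, key_fns, "city")
print(updates)
```

Because each subset is modelled independently, the per-subset work could in principle be run in parallel, which is one plausible reading of "scalable" in the title. The sketch itself runs sequentially.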
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.