Data-analysis-based, noisy labeled and unlabeled datapoint detection and rectification for machine-learning
US11853908B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 13, 2020 |
| Grant date | Dec 26, 2023 |
| Priority date | — |
| Expiry date | Oct 27, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N7/01
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Noisy labeled and unlabeled datapoint detection and rectification in a training dataset for machine-learning is facilitated by a processor(s) obtaining a training dataset for use in training a machine-learning model. The processor(s) applies ensemble machine-learning and a generative model to the training dataset to detect noisy labeled datapoints in the training dataset, and create a clean dataset with preliminary labels added for any unlabeled datapoints in the training dataset. Data-driven active learning and the clean dataset are used by the processor(s) to facilitate generating an active-learned dataset with true labels added for one or more selected datapoints of a datapoint pool including the detected noisy labeled datapoints and the unlabeled datapoints of the training dataset. The machine-learning model is trained by the processor(s) using, at least in part, the clean dataset and the active-learned dataset.
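The flow described in the abstract can be illustrated with a minimal Python sketch on toy 1-D data. It is not the patented implementation. The abstract does not say which models or selection rule are used, so everything concrete here is an assumption: a bootstrap ensemble of nearest-centroid classifiers stands in for the ensemble machine-learning, class-conditional Gaussians stand in for the generative model, and smallest posterior margin stands in for the data-driven active-learning selection. All function names, the `oracle` callback (a stand-in for a human annotator), and the data are hypothetical.

```python
import math
import random
from collections import Counter

def centroids(points):
    """Per-class mean of (x, label) pairs: a tiny nearest-centroid model."""
    sums, counts = Counter(), Counter()
    for x, y in points:
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in counts}

def predict(model, x):
    """Label of the nearest class centroid."""
    return min(model, key=lambda y: abs(x - model[y]))

def fit_gaussians(points):
    """Assumed generative model: one 1-D Gaussian (mean, var, prior) per label."""
    by_label = {}
    for x, y in points:
        by_label.setdefault(y, []).append(x)
    params = {}
    for y, xs in by_label.items():
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs) + 1e-6
        params[y] = (mu, var, len(xs) / len(points))
    return params

def posterior(params, x):
    """P(label | x) under the fitted Gaussians, via Bayes' rule."""
    logp = {y: math.log(p) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for y, (m, v, p) in params.items()}
    top = max(logp.values())
    w = {y: math.exp(l - top) for y, l in logp.items()}
    z = sum(w.values())
    return {y: w[y] / z for y in w}

def detect_and_clean(labeled, unlabeled, n_models=15, seed=0):
    """Flag labels that both the ensemble vote and the generative model reject;
    give unlabeled points a preliminary label from the generative model."""
    rng = random.Random(seed)
    ensemble = [centroids([rng.choice(labeled) for _ in labeled])
                for _ in range(n_models)]
    gen = fit_gaussians(labeled)

    def vote(x):
        return Counter(predict(m, x) for m in ensemble).most_common(1)[0][0]

    def gen_label(x):
        post = posterior(gen, x)
        return max(post, key=post.get)

    noisy = [(x, y) for x, y in labeled if vote(x) != y and gen_label(x) != y]
    clean = [(x, y) for x, y in labeled if (x, y) not in noisy]
    clean += [(x, gen_label(x)) for x in unlabeled]
    return clean, noisy, gen

def active_learn(pool, gen, oracle, budget):
    """Ask the oracle for true labels of the `budget` least certain pool points."""
    def margin(x):
        p = sorted(posterior(gen, x).values(), reverse=True)
        return p[0] - p[1] if len(p) > 1 else 1.0
    return [(x, oracle(x)) for x in sorted(pool, key=margin)[:budget]]

# Toy data: class 0 lies near 0..3, class 1 near 8..10; (0.5, 1) is mislabeled.
labeled = [(0.0, 0), (1.0, 0), (2.0, 0), (0.5, 1), (8.0, 1), (9.0, 1), (10.0, 1)]
unlabeled = [1.5, 9.5, 3.0]

def oracle(x):
    """Hypothetical annotator that knows the true decision boundary."""
    return 0 if x < 5 else 1

clean, noisy, gen = detect_and_clean(labeled, unlabeled)
pool = [x for x, _ in noisy] + unlabeled          # detected noisy + unlabeled points
active = active_learn(pool, gen, oracle, budget=2)

# True labels override preliminary ones (x values are unique in this toy set).
merged = dict(clean)
merged.update(active)
final = centroids(list(merged.items()))           # the final trained model
print(noisy, active, predict(final, 3.0))
```

In this toy run the preliminary label for the unlabeled point at 3.0 is wrong, because the mislabeled point inflates the class 1 variance. The active-learning step selects that point as the least certain one, and the oracle's label replaces the preliminary one before the final model is trained on the clean and active-learned data together.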
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.