Patent · US Active

Data-analysis-based, noisy labeled and unlabeled datapoint detection and rectification for machine-learning

US11853908B2 · kind B2 · utility

0Cited by
0References
18Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMay 13, 2020
Grant dateDec 26, 2023
Priority date
Expiry dateOct 27, 2042

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N7/01
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Noisy labeled and unlabeled datapoint detection and rectification in a training dataset for machine-learning is facilitated by a processor(s) obtaining a training dataset for use in training a machine-learning model. The processor(s) applies ensemble machine-learning and a generative model to the training dataset to detect noisy labeled datapoints in the training dataset, and create a clean dataset with preliminary labels added for any unlabeled datapoints in the training dataset. Data-driven active learning and the clean dataset are used by the processor(s) to facilitate generating an active-learned dataset with true labels added for one or more selected datapoints of a datapoint pool including the detected noisy labeled datapoints and the unlabeled datapoints of the training dataset. The machine-learning model is trained by the processor(s) using, at least in part, the clean dataset and the active-learned dataset.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.