AUC-maximized high-accuracy classifier for imbalanced datasets
US10970650B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | May 18, 2020 |
| Grant date | Apr 6, 2021 |
| Priority date | — |
| Expiry date | May 18, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/20
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An AUC-maximized, high-accuracy classification method and system for imbalanced datasets integrates an under-sampling-and-ensemble strategy, a true-outliers-removing strategy, and a fake-outliers-concealing strategy, with the aim of effectively and robustly improving both the AUC and the accuracy metrics in imbalanced classification. Applying under-sampling to construct multiple sub-datasets and combining the classification results of multiple classifiers greatly reduces the risk of misclassification and leads to highly accurate and robust results in imbalanced classification tasks. Moreover, the invention detects and identifies deeply hidden outliers within each sub-dataset, which consists of a sub-majority dataset and the entire minority dataset. In this way, more hidden outliers can be located and thus exert less influence on the decision boundary, which contributes to both high AUC and high accuracy. Furthermore, the invention conceals fake outliers when building the decision boundary, which can achieve a higher classification accuracy for the majority class without changing that of the minority class.
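The under-sampling-and-ensemble idea in the abstract can be illustrated with a minimal sketch. This is not the patented method: the nearest-centroid base classifier, the score averaging, the synthetic data, and every function name below are assumptions chosen for illustration, and the true-outlier removal and fake-outlier concealment steps are omitted because the abstract does not specify them.

```python
import random
from statistics import mean


def undersample(majority, minority, rng):
    """Build one sub-dataset: a random sub-majority set plus the entire minority set."""
    sub_majority = rng.sample(majority, len(minority))
    return sub_majority, minority


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(column) for column in zip(*points)]


def fit_centroid_classifier(sub_majority, minority):
    """Hypothetical base learner: scores a point by its relative distance to the two centroids."""
    c_maj, c_min = centroid(sub_majority), centroid(minority)

    def score(x):
        d_maj = sum((a - b) ** 2 for a, b in zip(x, c_maj))
        d_min = sum((a - b) ** 2 for a, b in zip(x, c_min))
        return d_maj - d_min  # positive means closer to the minority centroid

    return score


def fit_ensemble(majority, minority, n_members=5, seed=0):
    """Train one base classifier per under-sampled sub-dataset."""
    rng = random.Random(seed)
    return [
        fit_centroid_classifier(*undersample(majority, minority, rng))
        for _ in range(n_members)
    ]


def ensemble_score(members, x):
    """Combine the members by averaging their scores (an assumed combination rule)."""
    return mean([member(x) for member in members])


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic imbalanced data (assumed): 200 majority points, 20 minority points.
data_rng = random.Random(42)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
minority = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(20)]

members = fit_ensemble(majority, minority)
points = majority + minority
labels = [0] * len(majority) + [1] * len(minority)
scores = [ensemble_score(members, x) for x in points]
predictions = [1 if s > 0 else 0 for s in scores]
accuracy = mean([int(p == y) for p, y in zip(predictions, labels)])
```

Each ensemble member sees a balanced sub-dataset, so no single classifier is dominated by the majority class, while averaging across members reduces the variance introduced by discarding majority samples. Here `auc(scores, labels)` and `accuracy` give the two metrics the abstract aims to improve together.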
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.