Feature selection for deviation analysis
US12079196B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 8, 2021 |
| Grant date | Sep 3, 2024 |
| Priority date | — |
| Expiry date | Apr 8, 2042 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N20/00
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
The present disclosure provides for accurate and efficient identification of candidate features for an input dataset comprising one or more continuous features and one or more categorical features is obtained. A number of categorical feature categories based on the one or more categorical features is determined. Record counts for each of the categorical feature categories are determined. Skew statistics for each category are determined based on the record counts for each of the categorical feature categories. Cardinality skew factors for each of the one or more categorical features are then determined based on the record counts and the skew statistics. A number of the one or more categorical features having the highest cardinality skew factors are selected from among the cardinality skew factors. Then, a top contributor deviation analysis is performed using the selected number of the categorical features having the highest cardinality skew factors.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.