Intelligent generation of code for imputation of missing data in a machine learning dataset
US12014157B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Aug 30, 2022 |
| Grant date | Jun 18, 2024 |
| Priority date | — |
| Expiry date | Dec 16, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/20
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods and apparatuses are described for intelligent imputation of missing data in a machine learning (ML) dataset comprised of a plurality of features. Each feature includes a plurality of values, where at least a portion of the values for one or more features are missing. A server analyzes the ML dataset to generate characteristics of the missing values in the ML dataset. The server selects an imputation algorithm for filling in the missing values based upon the identified characteristics. The server determines a computing environment in which the imputation algorithm is executed based upon one or more of a size of the ML dataset or the selected algorithm. The server generates code that comprises instructions for executing the imputation algorithm on the ML dataset in the computing environment. The server integrates the code into an ML platform that executes the code to assign replacement values to the missing values.
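The flow in the abstract (profile the missing values, select an imputation algorithm, choose a computing environment, generate code, integrate it) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name here (`missing_fraction`, `select_algorithm`, `select_environment`, `generate_code`, `run_pipeline`), the thresholds, and the choice of mean, median, and constant fills are assumptions made for illustration. The call to `exec` only stands in for integration into an ML platform.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]

# Hypothetical fill expressions, one per illustrative algorithm.
FILL_EXPRESSIONS = {
    "mean": "sum(observed) / len(observed)",
    "median": "statistics.median(observed)",
    "constant": "0.0",
}


def missing_fraction(values: Column) -> float:
    """Characteristic of the missing data: share of values that are None."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def select_algorithm(fraction: float) -> Optional[str]:
    """Pick an algorithm from the characteristic (thresholds are assumed)."""
    if fraction == 0.0:
        return None          # nothing to impute
    if fraction <= 0.2:
        return "mean"
    if fraction <= 0.5:
        return "median"
    return "constant"        # too little observed data to estimate from


def select_environment(n_rows: int, algorithm: str,
                       row_limit: int = 1_000_000) -> str:
    """Choose where to run, based on dataset size and algorithm (assumed rule)."""
    if n_rows > row_limit and algorithm != "constant":
        return "distributed"
    return "local"


def generate_code(algorithm: str, environment: str) -> str:
    """Emit Python source for an impute() function using the chosen algorithm."""
    fill = FILL_EXPRESSIONS[algorithm]
    return (
        f"# target environment: {environment}\n"
        "import statistics\n"
        "\n"
        "def impute(values):\n"
        "    observed = [v for v in values if v is not None]\n"
        f"    fill = {fill}\n"
        "    return [fill if v is None else v for v in values]\n"
    )


def run_pipeline(dataset: Dict[str, Column]) -> Dict[str, List[float]]:
    """Run the steps per feature; exec() stands in for platform integration."""
    result = {}
    for name, values in dataset.items():
        algorithm = select_algorithm(missing_fraction(values))
        if algorithm is None:
            result[name] = list(values)
            continue
        environment = select_environment(len(values), algorithm)
        namespace: dict = {}
        exec(generate_code(algorithm, environment), namespace)
        result[name] = namespace["impute"](values)
    return result


if __name__ == "__main__":
    data = {
        "age": [20.0, None, 40.0, 60.0, 80.0, 100.0],
        "score": [1.0, None, 3.0],
    }
    print(run_pipeline(data))
```

In this sketch the selection rule depends only on the fraction of missing values per feature. A fuller analysis could also consider the feature type or the pattern of missingness, which the abstract groups under "characteristics of the missing values".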
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.