Patent · US Active

Intelligent generation of code for imputation of missing data in a machine learning dataset

US12014157B2 · kind B2 · utility

Cited by: 0
References: 7
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 30, 2022
Grant date: Jun 18, 2024
Priority date
Expiry date: Dec 16, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N20/20
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods and apparatuses are described for intelligent imputation of missing data in a machine learning (ML) dataset comprised of a plurality of features. Each feature includes a plurality of values, where at least a portion of the values for one or more features are missing. A server analyzes the ML dataset to generate characteristics of the missing values in the ML dataset. The server selects an imputation algorithm for filling in the missing values based upon the identified characteristics. The server determines a computing environment in which the imputation algorithm is executed based upon one or more of a size of the ML dataset or the selected algorithm. The server generates code that comprises instructions for executing the imputation algorithm on the ML dataset in the computing environment. The server integrates the code into an ML platform that executes the code to assign replacement values to the missing values.
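The steps in the abstract (analyze the missing values, select an algorithm, choose a computing environment, generate code) can be illustrated with a minimal Python sketch. It is not the patented implementation: every function name, the dict-of-lists dataset layout, the use of None for a missing value, the mean/mode selection rules, and the row threshold are assumptions made for illustration only.

```python
from statistics import mean, mode

# Hypothetical threshold: datasets above this many rows go to a cluster.
LOCAL_ROW_LIMIT = 100_000


def profile_missing(dataset):
    """Return per-feature characteristics of the missing values (None = missing)."""
    profile = {}
    for name, values in dataset.items():
        present = [v for v in values if v is not None]
        profile[name] = {
            "missing_fraction": 1 - len(present) / len(values),
            "numeric": all(isinstance(v, (int, float)) for v in present),
        }
    return profile


def select_algorithm(characteristics):
    """Pick an imputation strategy from the characteristics (illustrative rules)."""
    # Nothing is missing, or nothing is present to impute from.
    if characteristics["missing_fraction"] in (0, 1):
        return "none"
    return "mean" if characteristics["numeric"] else "mode"


def choose_environment(n_rows):
    """Pick where the generated code should run, based on dataset size (assumed rule)."""
    return "local" if n_rows <= LOCAL_ROW_LIMIT else "cluster"


def generate_code(feature, algorithm):
    """Emit Python source that imputes one feature of a dict-of-lists dataset."""
    if algorithm == "none":
        return f"# {feature}: nothing to impute\n"
    return (
        f"present = [v for v in dataset[{feature!r}] if v is not None]\n"
        f"fill = {algorithm}(present)\n"
        f"dataset[{feature!r}] = [fill if v is None else v for v in dataset[{feature!r}]]\n"
    )


# Usage: profile a toy dataset, generate the code, then run it in place.
dataset = {"age": [30, None, 50], "city": ["Oslo", "Oslo", None]}
profile = profile_missing(dataset)
code = "".join(generate_code(f, select_algorithm(c)) for f, c in profile.items())
environment = choose_environment(len(dataset["age"]))

# exec stands in for the "ML platform" that executes the generated code.
exec(code, {"dataset": dataset, "mean": mean, "mode": mode})
print(environment, dataset)
```

Here the generated code is a plain string that the caller runs with exec, which stands in for integration into an ML platform. A real system would presumably choose among far more imputation strategies and emit code targeted at the selected environment.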

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.