Data sampling for model exploration utilizing a plurality of machine learning models
US11704566B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Jun 20, 2019 |
| Grant date | Jul 18, 2023 |
| Priority date | — |
| Expiry date | Feb 4, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/20
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The disclosed embodiments provide a system for processing data. During operation, the system obtains a training dataset containing a first set of records associated with a first set of identifier (ID) values and an evaluation dataset containing a second set of records associated with a second set of ID values. Next, the system selects a random subset of ID values from the second set of ID values. The system then generates a sampled evaluation dataset comprising a first subset of records associated with the random subset of ID values in the second set of records. The system also generates a sampled training dataset comprising a second subset of records associated with the random subset of ID values in the first set of records. Finally, the system outputs the sampled training dataset and the sampled evaluation dataset for use in training and evaluating a machine learning model.
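The sampling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `sample_by_id`, the `id_key`, `fraction`, and `seed` parameters, and the representation of records as dictionaries are all assumptions made for illustration. The sketch draws a random subset of ID values from the evaluation dataset, then keeps only the records in each dataset that carry one of those IDs.

```python
import random


def sample_by_id(training, evaluation, id_key="id", fraction=0.5, seed=0):
    """Hypothetical helper: sample both datasets by a shared random ID subset.

    `training` and `evaluation` are lists of dict records; `id_key` names the
    field holding the ID value. All names here are illustrative assumptions.
    """
    # Collect the distinct ID values present in the evaluation dataset.
    # Sorting makes the draw reproducible for a given seed (IDs must be comparable).
    eval_ids = sorted({record[id_key] for record in evaluation})

    # Decide how many IDs to draw; keep at least one when any exist.
    k = min(len(eval_ids), max(1, round(len(eval_ids) * fraction)))

    # Select the random subset of ID values from the evaluation IDs.
    rng = random.Random(seed)
    chosen_ids = set(rng.sample(eval_ids, k))

    # Sampled evaluation dataset: evaluation records with a chosen ID.
    sampled_eval = [r for r in evaluation if r[id_key] in chosen_ids]
    # Sampled training dataset: training records with the same chosen IDs.
    sampled_train = [r for r in training if r[id_key] in chosen_ids]

    return sampled_train, sampled_eval, chosen_ids


if __name__ == "__main__":
    train = [{"id": i % 10, "x": i} for i in range(50)]
    evaluation = [{"id": i % 6, "y": i} for i in range(18)]
    s_train, s_eval, ids = sample_by_id(train, evaluation, fraction=0.5, seed=1)
    print(sorted(ids), len(s_train), len(s_eval))
```

Because both sampled datasets are filtered by the same ID subset, every ID that appears in the sampled evaluation data is one whose training records, if any exist, are also retained. Training records whose IDs never occur in the evaluation dataset are dropped in this sketch.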
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.