Patent · US Active

Distributed random forest training with a predictor trained to balance tasks

US11625640B2 · kind B2 · utility

Cited by	0
References	5
Claims	20
Family size	0

Assignee

Cisco Technology, Inc. · US

Inventors

Radek Starosta · Praha, CZ
Jan Brabec · Praha, CZ
Lukas Machlica · Praha, CZ

Key dates

Filing date	Oct 5, 2018
Grant date	Apr 11, 2023
Priority date	—
Expiry date	Jul 4, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/10
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

In one embodiment, a device distributes sets of training records from a training dataset for a random forest-based classifier among a plurality of workers of a computing cluster. Each worker determines whether it can perform a node split operation locally on the random forest by comparing a number of training records at the worker to a predefined threshold. The device determines, for each of the split operations, a data size and entropy measure of the training records to be used for the split operation. The device applies a machine learning-based predictor to the determined data size and entropy measure of the training records to be used for the split operation, to predict its completion time. The device coordinates the workers of the computing cluster to perform the node split operations in parallel such that the node split operations in a given batch are grouped based on their predicted completion times.
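The scheduling idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the linear stand-in for the machine learning-based predictor, its weights, and the fixed batch size are all assumptions made for illustration, and Shannon entropy of the class labels is assumed as the entropy measure.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the class labels at a tree node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the observed classes.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def predict_time(data_size, entropy, weights=(0.1, 0.002, 0.001)):
    """Hypothetical stand-in for the trained predictor.

    A real system would fit a model on measured split times; the linear
    form and the weights here are placeholders, not from the patent.
    """
    bias, w_size, w_mix = weights
    return bias + w_size * data_size + w_mix * data_size * entropy


def batch_by_predicted_time(splits, batch_size):
    """Group split operations so each batch holds similar predicted times."""
    ranked = sorted(splits, key=lambda s: s["predicted_time"])
    return [ranked[i:i + batch_size] for i in range(0, len(ranked), batch_size)]


# Toy pending node splits: node id -> class labels of its training records.
node_labels = {
    "n1": ["a"] * 50 + ["b"] * 50,
    "n2": ["a"] * 1000,
    "n3": ["a"] * 40 + ["b"] * 60,
    "n4": ["a"] * 900 + ["b"] * 100,
}

splits = []
for node_id, labels in node_labels.items():
    h = label_entropy(labels)
    splits.append({
        "node": node_id,
        "size": len(labels),
        "entropy": h,
        "predicted_time": predict_time(len(labels), h),
    })

batches = batch_by_predicted_time(splits, batch_size=2)
for i, batch in enumerate(batches):
    print(i, [s["node"] for s in batch])
```

Sorting by predicted time before chunking keeps slow and fast splits out of the same batch, so workers running a batch in parallel tend to finish at about the same time instead of idling while one long split completes.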

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.