Patent · US Expired

Method and system for generating a decision-tree classifier in parallel in a multi-processor system

US5870735A · kind A · utility

Cited by: 88
References: 0
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 1, 1996
Grant date: Feb 9, 1999
Priority date
Expiry date: May 1, 2016

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99944
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system are disclosed for generating a decision-tree classifier in parallel in a multi-processor system, from a training set of records. The method comprises the steps of: partitioning the records among the processors, each processor generating an attribute list for each attribute, and the processors cooperatively generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, each processor determines its best split test and, along with other processors, selects the best overall split for the records at that node. Preferably, the gini-index and class histograms are used in determining the best splits. Also, each processor builds a hash table using the attribute list of the split attribute and shares it with other processors. The hash tables are used for splitting the remaining attribute lists. The created tree is then pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.
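As a rough illustration of the steps the abstract names (per-attribute lists, a gini-index split chosen from class histograms, and a hash table built from the split attribute to partition the remaining lists), the following single-process Python sketch may help. It is not the patented implementation. The function names, the record layout with a "label" key, the pre-sorted attribute lists, the numeric-only attributes, and the midpoint threshold are all assumptions made for the example. The multi-processor exchange of best splits and hash tables, and the MDL pruning step, are not modeled.

```python
from collections import Counter

def gini(hist, n):
    """Gini index of a class histogram that covers n records."""
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in hist.values())

def build_attribute_lists(records, attributes):
    """One list per attribute of (value, class_label, record_id) entries.

    Sorting by value is an assumption of this sketch; it lets a single
    scan evaluate every candidate threshold.
    """
    return {
        a: sorted((rec[a], rec["label"], rid) for rid, rec in enumerate(records))
        for a in attributes
    }

def best_split(attr_list):
    """Scan one sorted attribute list; return (weighted_gini, threshold)."""
    n = len(attr_list)
    below = Counter()                                    # classes left of the split
    above = Counter(label for _, label, _ in attr_list)  # classes right of the split
    best = (float("inf"), None)
    for i in range(n - 1):
        value, label, _ = attr_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attr_list[i + 1][0]
        if value == next_value:
            continue  # cannot split between two equal values
        n_below, n_above = i + 1, n - (i + 1)
        score = (n_below * gini(below, n_below) + n_above * gini(above, n_above)) / n
        if score < best[0]:
            best = (score, (value + next_value) / 2)
    return best

def split_lists(attr_lists, split_attr, threshold):
    """Build a record_id -> goes_left hash table from the split attribute,
    then use it to partition every attribute list, preserving order."""
    goes_left = {rid: value < threshold for value, _, rid in attr_lists[split_attr]}
    left = {a: [e for e in lst if goes_left[e[2]]] for a, lst in attr_lists.items()}
    right = {a: [e for e in lst if not goes_left[e[2]]] for a, lst in attr_lists.items()}
    return left, right

# Hypothetical toy training set, invented for this example.
records = [
    {"age": 23, "salary": 15, "label": "high"},
    {"age": 30, "salary": 40, "label": "high"},
    {"age": 40, "salary": 65, "label": "low"},
    {"age": 55, "salary": 60, "label": "low"},
]
lists = build_attribute_lists(records, ["age", "salary"])
score, threshold = best_split(lists["age"])
left, right = split_lists(lists, "age", threshold)
```

In a multi-processor setting, each processor would presumably run the scan over its own portion of the attribute lists and the candidates would then be compared to pick the overall best split; how that exchange is carried out is left open here because the abstract does not specify it.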

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.