Method and system for generating a decision-tree classifier in parallel in a multi-processor system
US6138115A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Feb 5, 1999 |
| Grant date | Oct 24, 2000 |
| Priority date | — |
| Expiry date | Feb 5, 2019 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99944
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and system are disclosed for generating a decision-tree classifier in parallel in a multi-processor system, from a training set of records. The method comprises the steps of: partitioning the records among the processors, each processor generating an attribute list for each attribute, and the processors cooperatively generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, each processor determines its best split test and, along with other processors, selects the best overall split for the records at that node. Preferably, the gini-index and class histograms are used in determining the best splits. Also, each processor builds a hash table using the attribute list of the split attribute and shares it with other processors. The hash tables are used for splitting the remaining attribute lists. The created tree is then pruned based on the MDL principle, which encodes the tree and split tests in an MDL-based code, and determines whether to prune and how to prune each node based on the code length of the node.
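The split evaluation and hash-table partitioning described in the abstract can be illustrated with a minimal single-process Python sketch. It is not the patented implementation: the function names, the `(value, class_label, record_id)` entry layout, the midpoint threshold, and the sample records are all assumptions made for illustration. Inter-processor coordination and MDL-based pruning are omitted.

```python
from collections import Counter


def gini(counts, total):
    """Gini index of a class histogram: 1 - sum over classes of p_c^2."""
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_numeric_split(attr_list):
    """Scan an attribute list sorted by value.

    Each entry is (value, class_label, record_id), a layout assumed here.
    Returns (weighted_gini, threshold) for the best test `value <= threshold`,
    or None if no position separates two distinct values.
    """
    below = Counter()                                    # classes already scanned
    above = Counter(label for _, label, _ in attr_list)  # classes still ahead
    n = len(attr_list)
    best = None
    for i in range(n - 1):
        value, label, _ = attr_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attr_list[i + 1][0]
        if value == next_value:
            continue  # cannot split between equal values
        n_below = i + 1
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        threshold = (value + next_value) / 2  # midpoint is an assumed convention
        if best is None or score < best[0]:
            best = (score, threshold)
    return best


def split_with_hash_table(split_list, threshold, other_lists):
    """Partition the non-split attribute lists by record id.

    The hash table maps record_id -> True (left child) or False (right child)
    and is built only from the split attribute's list.
    """
    side = {rid: value <= threshold for value, _, rid in split_list}
    left = [[e for e in lst if side[e[2]]] for lst in other_lists]
    right = [[e for e in lst if not side[e[2]]] for lst in other_lists]
    return left, right


# Hypothetical training records: an age list (sorted) and a car-type list.
ages = [(17, "high", 1), (20, "high", 5), (23, "high", 0),
        (32, "low", 4), (43, "high", 2), (68, "low", 3)]
cars = [("family", "high", 0), ("sports", "high", 1), ("sports", "high", 2),
        ("family", "low", 3), ("truck", "low", 4), ("family", "high", 5)]

score, threshold = best_numeric_split(ages)
left_lists, right_lists = split_with_hash_table(ages, threshold, [cars])
```

In a multi-processor setting, each processor would run a scan like `best_numeric_split` on its own portion of the attribute lists, and the candidates would then be compared to pick the lowest-scoring split overall. How the class histograms and hash tables are exchanged between processors is not shown here.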
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.