Binning predictors using per-predictor trees and MDL pruning
US8280915B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 1, 2006 |
| Grant date | Oct 2, 2012 |
| Priority date | — |
| Expiry date | Feb 28, 2027 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F18/24323
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Binning the predictor values used to generate a data mining model reduces memory footprint and computation during the computationally dominant decision tree build phase, while also reducing the information loss of the model and the introduction of false information artifacts. The method bins data in a database for data mining modeling in a database system, where the data are stored in a database table in the database system, the data mining modeling has selected at least one predictor and one target for the data, and the data include a plurality of values of the predictor and a plurality of values of the target. The method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor the leaves of the tree that remain after pruning, each leaf representing a portion of the values of the predictor.
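The three steps named in the abstract (construct a binary tree for one predictor, prune it, take the surviving leaves as bins) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the entropy-based split choice, the simplified description-length formula used for pruning, and the names `build_tree`, `prune`, `leaves`, and `bin_predictor`. The sketch handles a single numeric predictor held in memory rather than in a database table.

```python
import math
from collections import Counter

def entropy_bits(labels):
    """Shannon entropy of the target labels, in bits per row."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(pairs):
    """Grow a binary tree over one predictor.
    pairs: non-empty list of (predictor_value, target), sorted by value."""
    labels = [t for _, t in pairs]
    node = {"lo": pairs[0][0], "hi": pairs[-1][0], "labels": labels}
    best = None  # (weighted child entropy, split index)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal predictor values
        left, right = labels[:i], labels[i:]
        cost = len(left) * entropy_bits(left) + len(right) * entropy_bits(right)
        if best is None or cost < best[0]:
            best = (cost, i)
    # Split only impure nodes that have at least one valid threshold.
    if best is not None and entropy_bits(labels) > 0:
        i = best[1]
        node["threshold"] = (pairs[i - 1][0] + pairs[i][0]) / 2
        node["left"] = build_tree(pairs[:i])
        node["right"] = build_tree(pairs[i:])
    return node

def prune(node):
    """Bottom-up pruning with a simplified, assumed MDL cost.
    Leaf cost: 1 bit for the node type plus n * entropy bits for the targets.
    Split cost: 1 bit, plus log2(n - 1) bits for the threshold, plus children.
    Returns the description length of the (possibly pruned) subtree."""
    n = len(node["labels"])
    leaf_cost = 1 + n * entropy_bits(node["labels"])
    if "threshold" not in node:
        return leaf_cost
    split_cost = (1 + math.log2(max(n - 1, 1))
                  + prune(node["left"]) + prune(node["right"]))
    if split_cost >= leaf_cost:
        # The split does not pay for itself: collapse this node to a leaf.
        for key in ("threshold", "left", "right"):
            del node[key]
        return leaf_cost
    return split_cost

def leaves(node):
    """Collect (lowest value, highest value) for each remaining leaf."""
    if "threshold" not in node:
        return [(node["lo"], node["hi"])]
    return leaves(node["left"]) + leaves(node["right"])

def bin_predictor(values, targets):
    """Return the bins of one predictor as a list of (lo, hi) value ranges."""
    pairs = sorted(zip(values, targets), key=lambda p: p[0])
    tree = build_tree(pairs)
    prune(tree)
    return leaves(tree)

# A predictor that cleanly separates the target keeps one split (two bins).
print(bin_predictor(list(range(1, 11)), ["a"] * 5 + ["b"] * 5))
# → [(1, 5), (6, 10)]
```

In this sketch a predictor whose values carry no information about the target (for example values 1 to 4 with alternating targets) is pruned back to a single bin, which is the behavior the abstract describes as avoiding false information artifacts.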
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.