Patent · US Expired

Methods and apparatus for selecting a data classification model using meta-learning

US6842751B1 · kind B1 · utility

27 Cited by

9 References

24 Claims

0 Family size

Assignee

International Business Machines Corporation · US

Inventors

Ricardo Vilalta · Stamford, US
Irina Rish · Rye Brook, US

Key dates

Filing date	Jul 31, 2000
Grant date	Jan 11, 2005
Priority date	—
Expiry date	Dec 20, 2021

Classification

Technology area (CPC Y)	Emerging Cross-Sectional Technologies
CPC primary	Y10S707/99945
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A data classification method and apparatus are disclosed for labeling unknown objects. The disclosed data classification system employs a model selection technique that characterizes domains and identifies the degree of match between the domain meta-features and the learning bias of the algorithm under analysis. An improved concept variation meta-feature or an average weighted distance meta-feature, or both, are used in addition to conventional meta-features to fully discriminate learning performance. The “concept variation” meta-feature measures the amount of concept variation, that is, the degree to which a concept lacks structure. The present invention extends conventional notions of concept variation to allow for numeric and categorical features, and estimates the variation of the whole example population through a training sample. The “average weighted distance” meta-feature of the present invention measures the density of the distribution in the training set. While the concept variation meta-feature is high for a training set composed of only two examples having different class labels, the average weighted distance can distinguish between examples that are too far apart or too close to…
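The two meta-features named in the abstract can be illustrated with a minimal Python sketch. This is not the patent's claimed formulation: the abstract gives no formulas, so the mixed numeric/categorical distance, the exponential weighting `2 ** (-alpha * d)`, the parameter `alpha`, and all function names below are assumptions chosen for illustration. Numeric features are assumed to be pre-scaled to [0, 1].

```python
import math


def distance(a, b):
    """Hypothetical mixed-type distance, normalised to [0, 1].

    Numeric features (assumed pre-scaled to [0, 1]) contribute a squared
    difference; categorical features contribute 0 if equal, 1 otherwise.
    """
    total = 0.0
    for u, v in zip(a, b):
        if isinstance(u, (int, float)) and isinstance(v, (int, float)):
            total += (u - v) ** 2
        else:
            total += 0.0 if u == v else 1.0
    return math.sqrt(total / len(a))


def weight(d, alpha=2.0):
    """Assumed weighting: nearby examples count more than distant ones."""
    return 2.0 ** (-alpha * d)


def concept_variation(X, y, alpha=2.0):
    """Average, over examples, of the weighted fraction of other
    training examples that carry a different class label."""
    n = len(X)
    if n < 2:
        raise ValueError("need at least two examples")
    total = 0.0
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if i == j:
                continue
            w = weight(distance(X[i], X[j]), alpha)
            num += w * (1.0 if y[i] != y[j] else 0.0)
            den += w
        total += num / den
    return total / n


def average_weighted_distance(X, alpha=2.0):
    """Average, over examples, of the weighted mean distance to the
    other training examples; small values indicate a dense sample."""
    n = len(X)
    if n < 2:
        raise ValueError("need at least two examples")
    total = 0.0
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if i == j:
                continue
            d = distance(X[i], X[j])
            w = weight(d, alpha)
            num += w * d
            den += w
        total += num / den
    return total / n


# Two-example training sets with different class labels, as in the abstract.
labels = [0, 1]
far_pair = [(0.0, "a"), (1.0, "b")]
close_pair = [(0.0, "a"), (0.1, "a")]
```

Under these assumptions, both two-example sets have the maximum concept variation of 1.0, because each example's only neighbor has the opposite label. The average weighted distance separates them: it equals the pair's distance, which is 1.0 for `far_pair` and about 0.07 for `close_pair`.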

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.