Patent · US Expired

Method and apparatus for classification of high dimensional data

US6563952B1 · kind B1 · utility

Cited by: 12
References: 5
Claims: 15
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 18, 1999
Grant date: May 13, 2003
Priority date
Expiry date: Oct 18, 2019

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F18/24147
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present invention is an apparatus and method for classifying high-dimensional sparse datasets. A raw data training set is flattened by converting it from categorical representation to a boolean representation. The flattened data is then used to build a class model on which new data not in the training set may be classified. In one embodiment, the class model takes the form of a decision tree, and large itemsets and cluster information are used as attributes for classification. In another embodiment, the class model is based on the nearest neighbors of the data to be classified. An advantage of the invention is that, by flattening the data, classification accuracy is increased by eliminating artificial ordering induced on the attributes. Another advantage is that the use of large itemsets and clustering increases classification accuracy.
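
The flattening step and the nearest-neighbor embodiment described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names (`flatten`, `jaccard`, `classify_knn`), the toy records, the use of Jaccard similarity, and the majority vote over k neighbors are all assumptions chosen for illustration, since the abstract does not specify a similarity measure or voting rule. Each (attribute, value) pair is treated as one boolean column, and a record is stored sparsely as the set of columns that are true for it.

```python
from collections import Counter


def flatten(records):
    """Convert categorical records into a boolean (set-of-items) representation.

    Each distinct (attribute, value) pair acts as one boolean column, so no
    ordering is implied between the values of an attribute. A record is kept
    sparsely as the set of columns that are true for it.
    """
    return [frozenset(rec.items()) for rec in records]


def jaccard(a, b):
    """Jaccard similarity of two item sets (an assumed measure, not from the patent)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_knn(train_sets, labels, query_set, k=3):
    """Label a flattened record by majority vote among its k most similar neighbors."""
    ranked = sorted(
        range(len(train_sets)),
        key=lambda i: jaccard(train_sets[i], query_set),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training data: two categorical attributes, two classes.
train = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "square"},
    {"color": "blue", "shape": "square"},
    {"color": "blue", "shape": "round"},
]
labels = ["A", "A", "B", "B"]

train_sets = flatten(train)
query_set = flatten([{"color": "red", "shape": "round"}])[0]
print(classify_knn(train_sets, labels, query_set, k=3))  # prints "A"
```

Storing only the true columns keeps the representation compact when the data is high-dimensional and sparse, which is the setting the abstract targets.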

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.