Patent · US Expired

Method and system for clustering data in parallel in a distributed-memory multiprocessor system

US6269376A · kind A · utility

Cited by: 46
References: 9
Claims: 42
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 26, 1998
Grant date: Jul 31, 2001
Priority date
Expiry date: Oct 26, 2018

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, apparatus, article of manufacture, and a memory structure for clustering data points in parallel using a distributed-memory multi-processor system are disclosed. The disclosed system has particularly advantageous application to a rapid and flexible k-means computation for data mining. The method comprises the steps of dividing a set of data points into a plurality of data blocks, initializing a set of k global centroid values in each of the data blocks to k initial global centroid values, performing a plurality of asynchronous processes on the data blocks, each asynchronous process assigning each data point in each data block to the closest global centroid value within each data block, computing a set of k block accumulation values from the data points assigned to the k global centroid values, and recomputing the k global centroid values from the k block accumulation values.
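
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: worker threads stand in for the processors of a distributed-memory system, where the block accumulations would instead be exchanged by message passing. All function names, the per-centroid (sum, count) form of the block accumulation values, and the fixed iteration count are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(points, num_blocks):
    """Divide the data points into roughly equal contiguous blocks."""
    size = max(1, -(-len(points) // num_blocks))  # ceiling division
    return [points[i:i + size] for i in range(0, len(points), size)]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def block_accumulate(block, centroids):
    """Assign each point in one block to its closest centroid and
    return the block's per-centroid coordinate sums and point counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in block:
        j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def parallel_kmeans(points, initial_centroids, num_blocks=2, iterations=10):
    """Block-parallel k-means: every block sees the same global centroids,
    and the centroids are recomputed from the combined accumulations."""
    blocks = split_into_blocks(points, num_blocks)
    centroids = [list(c) for c in initial_centroids]
    k, dim = len(centroids), len(centroids[0])
    with ThreadPoolExecutor(max_workers=num_blocks) as pool:
        for _ in range(iterations):
            # One task per block (assumption: threads replace processors).
            results = list(
                pool.map(block_accumulate, blocks, [centroids] * len(blocks))
            )
            # Combine the block accumulations into global sums and counts.
            total_sums = [[0.0] * dim for _ in range(k)]
            total_counts = [0] * k
            for sums, counts in results:
                for j in range(k):
                    total_counts[j] += counts[j]
                    for d in range(dim):
                        total_sums[j][d] += sums[j][d]
            # Recompute each global centroid; keep the old value if a
            # centroid received no points in this iteration.
            centroids = [
                [s / total_counts[j] for s in total_sums[j]]
                if total_counts[j] else centroids[j]
                for j in range(k)
            ]
    return centroids
```

Because each block returns only k sums and k counts, the data exchanged per iteration depends on k and the dimensionality of the points, not on the number of points in the block.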

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.