Patent · US Active

System and method for analyzing result of clustering massive data

US10402427B2 · kind B2 · utility

Cited by: 0
References: 4
Claims: 5
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 31, 2012
Grant date: Sep 3, 2019
Priority date
Expiry date: Oct 31, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F18/231
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Disclosed are a system and a method for analyzing the result of clustering massive data. Hadoop, an open-source map/reduce framework, is used to calculate the silhouette coefficient, a significance verification index for evaluating the result of clustering massive data. The clustered data is divided into blocks, input splits are generated to cover all of the blocks, and the generated input splits are assigned to multiple computers. Each computer holds in memory only the data of the blocks included in its assigned input split and calculates a silhouette coefficient for each record. Each computer then provides only the calculated silhouette coefficients to an index coefficient calculation apparatus, which calculates a silhouette coefficient for a cluster. The result of clustering the massive data can therefore be analyzed rapidly and objectively.
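
The flow described in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: it uses the standard silhouette definition s = (b - a) / max(a, b) with Euclidean distance, and the names make_blocks, map_block and reduce_clusters are hypothetical. As a simplification, each map call here reads the full data set, whereas the abstract describes computers that hold only the blocks of their own input split. Hadoop itself is not used.

```python
from collections import defaultdict
from math import dist


def silhouette_for_record(i, points, labels):
    """Standard silhouette of record i: (b - a) / max(a, b)."""
    sums = defaultdict(float)   # total distance from record i to each cluster
    counts = defaultdict(int)   # number of other records in each cluster
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j != i:
            sums[lab] += dist(points[i], p)
            counts[lab] += 1
    own = labels[i]
    if counts[own] == 0:
        return 0.0              # singleton cluster: 0 by convention
    a = sums[own] / counts[own]                       # mean intra-cluster distance
    others = [sums[c] / counts[c] for c in counts if c != own]
    if not others:
        return 0.0              # only one cluster: silhouette undefined, use 0
    b = min(others)                                   # nearest other cluster
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


def make_blocks(n_records, block_size):
    """Divide record indices into fixed-size blocks (hypothetical helper)."""
    return [range(s, min(s + block_size, n_records))
            for s in range(0, n_records, block_size)]


def map_block(block, points, labels):
    """'Map' step: emit (cluster label, silhouette) for each record in a block.
    Assumption: the full data set is readable here, unlike in the abstract."""
    return [(labels[i], silhouette_for_record(i, points, labels)) for i in block]


def reduce_clusters(pairs):
    """'Reduce' step: average the per-record coefficients for each cluster."""
    totals, counts = defaultdict(float), defaultdict(int)
    for lab, s in pairs:
        totals[lab] += s
        counts[lab] += 1
    return {lab: totals[lab] / counts[lab] for lab in totals}


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (10, 0), (10, 1)]
    labels = [0, 0, 1, 1]
    emitted = []
    for block in make_blocks(len(points), block_size=2):
        emitted.extend(map_block(block, points, labels))  # one "computer" per block
    print(reduce_clusters(emitted))
```

Only the (label, coefficient) pairs leave the map step, which mirrors the abstract's point that each computer forwards just the calculated coefficients for per-cluster aggregation.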

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.