Patent · US Active

Cross-guided data clustering based on alignment between data domains

US8589396B2 · kind B2 · utility

Cited by: 7
References: 2
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 6, 2010
Grant date: Nov 19, 2013
Priority date
Expiry date: May 9, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F18/2323
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and associated method for cross-guided data clustering by aligning target clusters in a target domain to source clusters in a source domain. The cross-guided clustering process takes the target domain and the source domain as inputs. The word attributes shared by both the target domain and the source domain form a pivot vocabulary, and all other words in both domains form a non-pivot vocabulary. The non-pivot vocabulary is projected onto the pivot vocabulary to improve measurement of similarity between data items. Source centroids representing clusters in the source domain are created and projected onto the pivot vocabulary. Target centroids representing clusters in the target domain are initially created by a conventional clustering method and then repeatedly aligned to converge with the source centroids by use of a cross-domain similarity graph that measures the respective similarity of each target centroid to each source centroid.
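
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not specify the projection, the similarity measure, or the alignment update, so the sketch assumes a co-occurrence-based projection, cosine similarity, seeding from the first k target items as a stand-in for conventional clustering, and a blending weight `alpha` that pulls each target centroid toward its most similar source centroid. All function names and the sample documents are hypothetical.

```python
from collections import Counter
from math import sqrt

def pivot_vocabulary(source_docs, target_docs):
    """Words that occur in both domains (the pivot vocabulary)."""
    src = {w for d in source_docs for w in d}
    tgt = {w for d in target_docs for w in d}
    return sorted(src & tgt)

def cooccurrence(docs, pivot):
    """For each non-pivot word, count the pivot words sharing a document with it."""
    pivot_set, table = set(pivot), {}
    for doc in docs:
        words = set(doc)
        for w in words - pivot_set:
            table.setdefault(w, Counter()).update(words & pivot_set)
    return table

def project(doc, pivot, table):
    """Assumed projection: spread each non-pivot word over its co-occurring pivots."""
    pivot_set, vec = set(pivot), Counter()
    for w in doc:
        if w in pivot_set:
            vec[w] += 1.0
        elif w in table:
            total = sum(table[w].values())
            for p, c in table[w].items():
                vec[p] += c / total
    return [vec[p] for p in pivot]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def assign(vectors, centroids):
    return [max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors]

def cross_guided(source_docs, source_labels, target_docs, k, alpha=0.5, max_iters=20):
    pivot = pivot_vocabulary(source_docs, target_docs)
    src_table = cooccurrence(source_docs, pivot)
    tgt_table = cooccurrence(target_docs, pivot)
    src_vecs = [project(d, pivot, src_table) for d in source_docs]
    tgt_vecs = [project(d, pivot, tgt_table) for d in target_docs]

    # Source centroids: one per known source cluster, in pivot space.
    src_ids = sorted(set(source_labels))
    src_centroids = [mean([v for v, l in zip(src_vecs, source_labels) if l == s])
                     for s in src_ids]

    # Assumption: the first k target items seed the initial target centroids.
    centroids = [list(v) for v in tgt_vecs[:k]]
    labels, alignment = None, []
    for _ in range(max_iters):
        new_labels = assign(tgt_vecs, centroids)
        if new_labels == labels:  # assignments stable: stop
            break
        labels, alignment = new_labels, []
        for c in range(k):
            members = [v for v, l in zip(tgt_vecs, labels) if l == c]
            centre = mean(members) if members else centroids[c]
            # One row of the cross-domain similarity graph: target c vs. every source.
            sims = [cosine(centre, s) for s in src_centroids]
            best = max(range(len(sims)), key=sims.__getitem__)
            if sims[best] > 0:  # pull toward the best-matching source centroid
                alignment.append(src_ids[best])
                centre = [(1 - alpha) * x + alpha * y
                          for x, y in zip(centre, src_centroids[best])]
            else:
                alignment.append(None)
            centroids[c] = centre
    return labels, alignment

source_docs = [["goal", "match", "team", "striker"], ["team", "match", "coach", "league"],
               ["stock", "market", "price", "broker"], ["market", "price", "bond", "yield"]]
source_labels = [0, 0, 1, 1]
target_docs = [["match", "team", "pitch", "referee"], ["price", "market", "dividend"],
               ["team", "referee", "pitch"], ["market", "dividend", "price", "equity"]]
labels, alignment = cross_guided(source_docs, source_labels, target_docs, k=2)
```

In this toy run, `labels` holds the target cluster index of each target document and `alignment` holds, for each target cluster, the source cluster label it was matched to. A target centroid with no positive similarity to any source centroid is left unaligned, which is a design choice of the sketch rather than something stated in the abstract.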

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.