Cross-guided data clustering based on alignment between data domains
US8589396B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 6, 2010 |
| Grant date | Nov 19, 2013 |
| Priority date | — |
| Expiry date | May 9, 2032 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F18/2323
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and associated method for cross-guided data clustering that aligns target clusters in a target domain to source clusters in a source domain. The cross-guided clustering process takes the target domain and the source domain as inputs. The word attributes shared by both the target domain and the source domain form a pivot vocabulary, and all other words in both domains form a non-pivot vocabulary. The non-pivot vocabulary is projected onto the pivot vocabulary to improve the measurement of similarity between data items. Source centroids representing clusters in the source domain are created and projected onto the pivot vocabulary. Target centroids representing clusters in the target domain are initially created by a conventional clustering method and then iteratively aligned to converge with the source centroids, using a cross-domain similarity graph that measures the similarity of each target centroid to each source centroid.
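The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure. It assumes bag-of-words count matrices over one shared column order, a co-occurrence-weighted projection of non-pivot words, cosine similarity as the edge weight of the cross-domain graph, and a greedy best-match pull toward source centroids. The function names and the `alpha` and `iters` parameters are hypothetical and do not come from the patent text.

```python
import numpy as np

def pivot_mask(X_src, X_tgt):
    """Pivot words occur in both domains; every other column is non-pivot."""
    return (X_src.sum(axis=0) > 0) & (X_tgt.sum(axis=0) > 0)

def project_to_pivot(X, pivot):
    """Fold non-pivot counts into pivot columns (assumed co-occurrence weighting)."""
    Xp = X[:, pivot].astype(float)
    Xn = X[:, ~pivot].astype(float)
    cooc = Xn.T @ Xp                       # non-pivot x pivot co-occurrence
    sums = cooc.sum(axis=1, keepdims=True)
    weights = np.divide(cooc, sums, out=np.zeros_like(cooc), where=sums > 0)
    return Xp + Xn @ weights

def cosine(A, B):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A = A / np.maximum(np.linalg.norm(A, axis=1, keepdims=True), 1e-12)
    B = B / np.maximum(np.linalg.norm(B, axis=1, keepdims=True), 1e-12)
    return A @ B.T

def cross_guided_cluster(X_src, y_src, X_tgt, k, alpha=0.5, iters=20, seed=0):
    pivot = pivot_mask(X_src, X_tgt)
    S = project_to_pivot(X_src, pivot)
    T = project_to_pivot(X_tgt, pivot)
    # One source centroid per source cluster label, in pivot space.
    src = np.vstack([S[y_src == c].mean(axis=0) for c in np.unique(y_src)])
    # Conventional start: k-means-style seeding from random target items.
    rng = np.random.default_rng(seed)
    centroids = T[rng.choice(len(T), size=k, replace=False)]
    for _ in range(iters):
        labels = cosine(T, centroids).argmax(axis=1)
        new = centroids.copy()
        for j in range(k):
            if np.any(labels == j):
                new[j] = T[labels == j].mean(axis=0)
        # Cross-domain similarity graph: target centroid x source centroid.
        sim = cosine(new, src)
        best = sim.argmax(axis=1)
        # Pull each target centroid toward its best source match,
        # more strongly when the two are already similar.
        w = alpha * sim[np.arange(k), best][:, None]
        new = (1 - w) * new + w * src[best]
        if np.allclose(new, centroids):
            break
        centroids = new
    labels = cosine(T, centroids).argmax(axis=1)
    return labels, centroids, pivot
```

In this sketch the pull toward source centroids is what lets a poor random start recover: a target centroid that begins as a mix of two topics is drawn toward whichever source cluster it most resembles, which separates the target items on the next reassignment. A one-to-one matching between target and source centroids would be a natural alternative to the greedy `argmax` used here.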
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.