Patent · US Active

System and method for clustering data in input and output spaces

US9116974B2 · kind B2 · utility

Cited by: 5
References: 7
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 15, 2013
Grant date: Aug 25, 2015
Priority date
Expiry date: Oct 9, 2033

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/35
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method of clustering a plurality of documents having input and output space data is disclosed that uses both input and output space criteria. The method can include aggregating documents into clusters based on input and/or output space similarity measures, and then refining the clusters based on further input and/or output space similarity measures. Aggregating the documents into clusters can include forming a hierarchical tree based on the input and/or output space similarity measures where the hierarchical tree has a root node, branching into intermediate nodes, and branching into leaf nodes covering individual documents, where the hierarchical tree includes a leaf node for each document of the plurality of documents. The method can then include forming a forest of sub-trees of the hierarchical tree based on cluster criteria. Textual and numeric similarity measures can be used depending on the type and distribution of data in the input and output spaces.
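The two-stage idea in the abstract (build a hierarchical tree over the documents, then split it into a forest of sub-trees using a cluster criterion) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not name specific measures, so the Jaccard similarity on token sets (input space), the average-linkage merge rule, the numeric output spread criterion, and every name below (Node, build_tree, cut_forest, max_spread) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Sequence, Set


@dataclass
class Node:
    docs: List[int]                  # indices of the documents this node covers
    left: Optional["Node"] = None    # children are None for leaf nodes
    right: Optional["Node"] = None


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed textual similarity between two token sets (input space)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def build_tree(tokens: Sequence[Set[str]]) -> Node:
    """Agglomerative merge: one leaf per document, merged up to a single root."""
    nodes = [Node([i]) for i in range(len(tokens))]

    def sim(x: Node, y: Node) -> float:
        # Average-linkage similarity between the two groups of documents.
        pairs = [(i, j) for i in x.docs for j in y.docs]
        return sum(jaccard(tokens[i], tokens[j]) for i, j in pairs) / len(pairs)

    while len(nodes) > 1:
        x, y = max(combinations(nodes, 2), key=lambda p: sim(*p))
        nodes = [n for n in nodes if n is not x and n is not y]
        nodes.append(Node(x.docs + y.docs, x, y))
    return nodes[0]


def output_spread(node: Node, outputs: Sequence[float]) -> float:
    """Assumed numeric criterion: range of output-space values under a node."""
    vals = [outputs[i] for i in node.docs]
    return max(vals) - min(vals)


def cut_forest(node: Node, outputs: Sequence[float], max_spread: float) -> List[Node]:
    """Descend from the root; keep a sub-tree once its output spread is small enough."""
    if node.left is None or output_spread(node, outputs) <= max_spread:
        return [node]
    return (cut_forest(node.left, outputs, max_spread)
            + cut_forest(node.right, outputs, max_spread))


# Toy data (hypothetical): token sets as input space, one number as output space.
tokens = [
    {"reset", "password", "login"},
    {"password", "reset", "email"},
    {"invoice", "billing", "refund"},
    {"billing", "refund", "charge"},
]
outputs = [1.0, 1.2, 5.0, 9.0]

root = build_tree(tokens)
forest = cut_forest(root, outputs, max_spread=1.0)
print([sorted(tree.docs) for tree in forest])
```

In this toy run, the two billing documents are textually similar and so share a parent in the tree, but their output values differ widely, so the cut separates them into their own sub-trees. That is the kind of refinement an output-space criterion can add on top of input-space aggregation.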

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.