Semi-supervised hybrid clustering/classification system
US11023710B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 20, 2019 |
| Grant date | Jun 1, 2021 |
| Priority date | — |
| Expiry date | Jun 23, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V40/172
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
System and method for classifying data objects occurring in an unstructured dataset, comprising: extracting feature vectors from the unstructured dataset, each feature vector representing an occurrence of a data object in the unstructured dataset; classifying the feature vectors into feature vector sets that each correspond to a respective object class from a plurality of object classes; for each feature vector set: performing multiple iterations of a clustering operation, each iteration including clustering feature vectors from the feature vector set into clusters of similar feature vectors and identifying outlier feature vectors, wherein for at least one iteration after a first iteration of the clustering operation, outlier feature vectors identified in a previous iteration are excluded from the clustering operation; and outputting a key cluster for the feature vector set from a final iteration of the multiple iterations, the key cluster including a greater number of similar feature vectors than any of the other clusters of the final iteration; and assembling a dataset that includes at least the feature vectors from the key clusters of the feature vector sets.
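The per-class loop described in the abstract (cluster, flag outliers, re-cluster without them, keep the largest cluster) can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not name a clustering algorithm or an outlier criterion, so the single-linkage grouping, the `radii` schedule, the `min_size` threshold, and every function name below are assumptions chosen only to make the flow concrete. The class labels are taken as given, standing in for the classification step.

```python
from math import dist


def cluster_once(vectors, radius, min_size):
    """Single-linkage grouping: vectors within `radius` share a cluster.

    Returns (clusters, outliers). Groups smaller than `min_size` are
    treated as outliers. Both thresholds are illustrative assumptions.
    """
    unvisited = set(range(len(vectors)))
    clusters, outliers = [], []
    while unvisited:
        seed = unvisited.pop()
        group, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if dist(vectors[i], vectors[j]) <= radius]
            for j in near:
                unvisited.remove(j)
            group.extend(near)
            frontier.extend(near)
        members = [vectors[i] for i in sorted(group)]
        if len(members) >= min_size:
            clusters.append(members)
        else:
            outliers.extend(members)
    return clusters, outliers


def key_cluster(vectors, radii=(3.0, 1.5), min_size=2):
    """Run one clustering pass per radius and return the largest final cluster."""
    remaining = list(vectors)
    clusters = []
    for radius in radii:
        clusters, _outliers = cluster_once(remaining, radius, min_size)
        # Outliers found in this pass are excluded from the next pass.
        remaining = [v for cluster in clusters for v in cluster]
    return max(clusters, key=len) if clusters else []


def assemble_dataset(sets_by_class, **kwargs):
    """Collect the key cluster of each per-class feature vector set."""
    return {label: key_cluster(vectors, **kwargs)
            for label, vectors in sets_by_class.items()}
```

Called with a mapping from a hypothetical class label to a list of feature-vector tuples, `assemble_dataset` returns one key cluster per class. The shrinking `radii` schedule is a design choice of this sketch: with a fixed radius, removing outliers would not change the remaining groups, so a tighter second pass is what makes the repeated iteration meaningful here.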
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.