Measuring confidence of file clustering and clustering based file classification
US8214365B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 28, 2011 |
| Grant date | Jul 3, 2012 |
| Priority date | — |
| Expiry date | Mar 31, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/268
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A uniformity of a cluster of samples is determined, and a corresponding raw confidence value is calculated. A confidence interval weight is calculated using a confidence interval to determine reliability of the uniformity. A trace length weight is calculated, as a function of traces of the samples. An n-gram weight is calculated, as a function of numbers of n-grams generated by the samples. A compactness weight is calculated, as a function of the similarity of the samples. A cluster weight is calculated as a function of the four above-described weights. A cluster confidence measurement is calculated as a function of the cluster weight and the raw confidence value. When a new sample is assigned to the cluster, an assignment confidence measurement is calculated, as a function of the cluster's confidence measurement and the sample's trace length, n-grams and similarity.
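The pipeline in the abstract can be illustrated with a minimal Python sketch. The abstract names the quantities but gives no formulas, so every formula here is an assumption made for illustration: uniformity is taken as the share of the dominant label, the confidence interval weight as one minus the width of a Wilson score interval, the trace length and n-gram weights as saturating averages, and the "function of" combinations as geometric means and products. The `Sample` fields, the `saturate` helper, and the `trace_scale` / `ngram_scale` parameters are hypothetical names, not terms from the patent.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class Sample:
    label: str         # known class of the sample (assumed, e.g. "malicious")
    trace_length: int  # length of the sample's trace
    ngram_count: int   # number of n-grams generated from the trace
    similarity: float  # similarity to the rest of the cluster, in [0, 1]


def saturate(value, scale):
    """Map a non-negative count into [0, 1); `scale` is an assumed tuning knob."""
    return value / (value + scale)


def geometric_mean(values):
    """Combine factors so that one weak factor pulls the result down."""
    product = 1.0
    for v in values:
        product *= v
    return product ** (1.0 / len(values))


def raw_confidence(samples):
    """Uniformity: share of the samples that carry the dominant label."""
    counts = Counter(s.label for s in samples)
    return counts.most_common(1)[0][1] / len(samples)


def confidence_interval_weight(p, n, z=1.96):
    """One minus the width of the Wilson score interval around p (assumed choice)."""
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)
    return max(0.0, 1.0 - 2.0 * half_width)


def cluster_confidence(samples, trace_scale=100, ngram_scale=100):
    """Cluster weight (from four weights) multiplied by the raw confidence."""
    n = len(samples)
    raw = raw_confidence(samples)
    weights = [
        confidence_interval_weight(raw, n),
        sum(saturate(s.trace_length, trace_scale) for s in samples) / n,
        sum(saturate(s.ngram_count, ngram_scale) for s in samples) / n,
        sum(s.similarity for s in samples) / n,  # compactness weight
    ]
    return geometric_mean(weights) * raw


def assignment_confidence(cluster_conf, sample, trace_scale=100, ngram_scale=100):
    """Scale the cluster's confidence by the new sample's own three factors."""
    factors = [
        saturate(sample.trace_length, trace_scale),
        saturate(sample.ngram_count, ngram_scale),
        sample.similarity,
    ]
    return cluster_conf * geometric_mean(factors)
```

Under these assumptions, a large, uniformly labeled cluster of samples with long traces scores higher than a small, mixed cluster with short traces, and a new sample's assignment confidence never exceeds the confidence of the cluster it joins.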
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.