Real-time identification of data candidates for classification based compression
US9239842B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 5, 2015 |
| Grant date | Jan 19, 2016 |
| Priority date | — |
| Expiry date | May 5, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/285
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A processor device in a computing environment identifies data candidates for data processing in real time. Data candidates are sampled to decide whether to perform classification-based compression on them. A heuristic is computed on a randomly selected data sample from each data candidate to determine whether the data candidate may benefit from classification-based compression. The heuristic sums a ratio between the actual number of characters and the expected number of characters, then divides the result by the number of data classes that are not empty. Non-classifiable data are included in the number of data classes during the division, and a data class counts as not empty when characters belonging to that class were observed in the input. Classification-based compression is performed on the data candidate if the resulting ratio exceeds a threshold.
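The heuristic in the abstract can be illustrated with a minimal Python sketch. This is one possible reading, not the patented method: the abstract does not enumerate the data classes, define how the expected count is obtained, or give a threshold. The class table (`CLASSES`), the uniform-byte model used for the expected count, the contiguous random window used as the sample, the function names `class_ratio_score` and `is_candidate`, and the default `threshold=2.0` are all assumptions made for illustration.

```python
import random
import string

# Hypothetical class table: the abstract does not list the data classes.
CLASSES = {
    "digit": set(string.digits.encode()),
    "lower": set(string.ascii_lowercase.encode()),
    "upper": set(string.ascii_uppercase.encode()),
    "space": set(string.whitespace.encode()),
    "punct": set(string.punctuation.encode()),
}
# Non-classifiable data: every byte value not covered by a class above.
CLASSES["other"] = set(range(256)) - set().union(*CLASSES.values())


def class_ratio_score(sample: bytes) -> float:
    """Mean of actual/expected character counts over the non-empty classes.

    Assumption: the expected count for a class is what a uniform
    distribution over all 256 byte values would produce.
    """
    if not sample:
        return 0.0
    total = 0.0
    non_empty = 0
    for members in CLASSES.values():
        actual = sum(1 for b in sample if b in members)
        if actual == 0:
            continue  # empty classes are left out of the divisor
        expected = len(sample) * len(members) / 256
        total += actual / expected
        non_empty += 1  # "other" counts as a class when it is observed
    return total / non_empty


def is_candidate(data: bytes, sample_size: int = 1024,
                 threshold: float = 2.0, rng: random.Random = None) -> bool:
    """Score a randomly positioned window of the data against a threshold."""
    rng = rng or random.Random()
    if len(data) <= sample_size:
        sample = data
    else:
        start = rng.randrange(len(data) - sample_size + 1)
        sample = data[start:start + sample_size]
    return class_ratio_score(sample) > threshold
```

Under these assumptions, input concentrated in a few classes (for example, long runs of digits) scores well above 1, while input spread evenly over all byte values scores exactly 1 and is not selected. Sampling a small window keeps the decision cheap enough to make inline, which matches the real-time goal stated in the abstract.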
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.