Generation and use of trained file classifiers for malware detection
US10062038B1 · kind B1 · utility
Assignee
Inventor
Key dates
| Event | Date |
| --- | --- |
| Filing date | May 31, 2017 |
| Grant date | Aug 28, 2018 |
| Priority date | — |
| Expiry date | May 31, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F2221/033
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method includes accessing information identifying multiple files and identifying classification data for the multiple files, where the classification data indicates, for a particular file of the multiple files, whether the particular file includes malware. The method also includes generating a sequence of entropy indicators for each of the multiple files, each entropy indicator of the sequence of entropy indicators for the particular file corresponding to a chunk of the particular file. The method further includes generating n-gram vectors for the multiple files, where the n-gram vector for the particular file indicates occurrences of groups of entropy indicators in the sequence of entropy indicators for the particular file. The method also includes generating and storing a file classifier using the n-gram vectors and the classification data as supervised training data.
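The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the chunk size, the number of entropy bins, the n-gram length, and every function name below are assumptions chosen for illustration, and a simple nearest-centroid rule stands in for the file classifier, whose model type the abstract does not specify.

```python
import math
from collections import Counter
from itertools import product

CHUNK_SIZE = 256  # assumed chunk size in bytes (not given in the abstract)
NUM_BINS = 4      # assumed number of distinct entropy indicators
N = 2             # assumed n-gram length


def chunk_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte chunk, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(chunk).values())


def entropy_indicators(data: bytes) -> list:
    """One indicator per chunk: the chunk's entropy quantized into a bin index."""
    indicators = []
    for start in range(0, len(data), CHUNK_SIZE):
        h = chunk_entropy(data[start:start + CHUNK_SIZE])
        indicators.append(min(int(h / 8.0 * NUM_BINS), NUM_BINS - 1))
    return indicators


def ngram_vector(indicators: list) -> list:
    """Count occurrences of every possible n-gram of entropy indicators."""
    counts = Counter(tuple(indicators[i:i + N])
                     for i in range(len(indicators) - N + 1))
    return [counts[g] for g in product(range(NUM_BINS), repeat=N)]


def train_centroids(vectors, labels):
    """Stand-in classifier: average the n-gram vectors for each label."""
    sums, sizes = {}, Counter(labels)
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {label: [s / sizes[label] for s in acc]
            for label, acc in sums.items()}


def classify(centroids, vec):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(centroids[lab], vec)))


# Synthetic toy data only; the labels are illustrative, not real malware samples.
low_entropy_file = bytes(1024)              # all zero bytes
high_entropy_file = bytes(range(256)) * 4   # every byte value equally often
training_vectors = [ngram_vector(entropy_indicators(f))
                    for f in (low_entropy_file, high_entropy_file)]
model = train_centroids(training_vectors, ["benign", "malware"])
```

A new file would be passed through the same `entropy_indicators` and `ngram_vector` steps before calling `classify(model, vector)`. Because `model` here is a plain dictionary of lists, it could be stored with any standard serializer; a production system would more likely use a trained model from a machine learning library.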
Source: USPTO / EPO open patent data.