Machine learning-based DNS request string representation with hash replacement
US11784964B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 10, 2021 |
| Grant date | Oct 10, 2023 |
| Priority date | — |
| Expiry date | Mar 10, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/30
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
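The two steps that bracket model training in the abstract, hash replacement and the weighted summation of token embeddings, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the `<HASH>` placeholder string, splitting on dots as the tokenizer, the length-plus-entropy heuristic standing in for hash detection, the function names, and the per-token weights. The embeddings are passed in as a plain dictionary; in the described technique they would come from an intermediate layer of the trained vectorizing model.

```python
import math
from collections import Counter
from typing import Dict, List

HASH_PLACEHOLDER = "<HASH>"  # hypothetical placeholder token, not from the patent


def shannon_entropy(token: str) -> float:
    """Bits per character of the token's empirical character distribution."""
    counts = Counter(token)
    n = len(token)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_hash(token: str, min_len: int = 16, min_entropy: float = 3.0) -> bool:
    """Assumed heuristic: a hash is a long, relatively random string.

    The length check runs first, so empty tokens never reach the entropy call.
    """
    return len(token) >= min_len and shannon_entropy(token) >= min_entropy


def preprocess(request: str) -> List[str]:
    """Tokenize a DNS request string on dots and replace hash-like tokens."""
    tokens = request.lower().split(".")
    return [HASH_PLACEHOLDER if looks_like_hash(t) else t for t in tokens]


def embed_request(
    tokens: List[str],
    embeddings: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted sum of token embeddings.

    Tokens without an embedding are skipped; tokens without an explicit
    weight default to 1.0. Both choices are assumptions of this sketch.
    """
    dim = len(next(iter(embeddings.values())))
    out = [0.0] * dim
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:
            continue
        w = weights.get(tok, 1.0)
        for i, x in enumerate(vec):
            out[i] += w * x
    return out


if __name__ == "__main__":
    # Toy two-dimensional embeddings, made up for the demonstration.
    toy_embeddings = {
        HASH_PLACEHOLDER: [1.0, 0.0],
        "cdn": [0.0, 1.0],
        "example": [1.0, 1.0],
    }
    toy_weights = {HASH_PLACEHOLDER: 0.5, "cdn": 2.0}
    tokens = preprocess("a3f9c2e1b7d4486f9a0c5e2d1b8f7a6c.cdn.example.com")
    print(tokens)
    print(embed_request(tokens, toy_embeddings, toy_weights))
```

Because every hash-like token collapses to the same placeholder, two requests that differ only in their hash label produce identical token sequences and therefore identical request vectors, which is one way to read the abstract's statement that the embeddings reflect the semantics of hashes as a group.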
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.