System and method using a large language model (LLM) and/or regular expressions for feature extractions from unstructured or semi-structured data to generate ontological graph
US12231456B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 28, 2023 |
| Grant date | Feb 18, 2025 |
| Priority date | — |
| Expiry date | Jul 28, 2043 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L63/1491
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method are provided for generating a cybersecurity behavioral graph from log files and/or other telemetry data, which can be unstructured or semi-structured. The log files are applied to a machine learning (ML) model (e.g., a large language model (LLM)) that extracts entities, and the relationships between those entities, from the log files. The entities and relationships can be constrained using a cybersecurity ontology or schema to ensure that the results are meaningful in a cybersecurity context. A graph is then generated by mapping the extracted entities to nodes in the graph and the relationships to edges connecting the nodes. To extract the entities and relationships from the log files more efficiently, an LLM is used to generate regular expressions for the format of the log files. Once generated, the regular expressions can rapidly parse the log files to extract the entities and relationships.
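The pipeline the abstract describes (regular expression, then ontology constraint, then graph) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and does not come from the patent: the log line format, the field names, the toy `ONTOLOGY` table, and the helper names `extract_triples` and `build_graph`. The pattern is handwritten and merely stands in for one that an LLM might generate for a given log format.

```python
import re

# Hypothetical log format. In the approach the abstract describes, an LLM
# would propose a pattern like this; here it is written by hand.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) host=(?P<host>\S+) user=(?P<user>\S+) "
    r"action=(?P<action>\S+) target=(?P<target>\S+)$"
)

# Toy ontology (an assumption): relation -> (source entity type, target entity type).
# Relations outside this table are discarded as not meaningful for the graph.
ONTOLOGY = {
    "executed": ("user", "process"),
    "connected_to": ("user", "ip_address"),
    "logged_into": ("user", "host"),
}

SAMPLE_LOGS = [
    "2023-07-28T10:00:00Z host=ws-01 user=alice action=executed target=powershell.exe",
    "2023-07-28T10:00:05Z host=ws-01 user=alice action=connected_to target=10.0.0.5",
    "2023-07-28T10:00:09Z host=ws-01 user=bob action=rebooted target=ws-01",
    "malformed line",
]


def extract_triples(line):
    """Return a list of (source_node, relation, target_node) triples for one line.

    Nodes are (entity_type, name) tuples. Lines that do not match the pattern,
    or whose relation is not in the ontology, yield an empty list.
    """
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return []
    fields = match.groupdict()
    relation = fields["action"]
    if relation not in ONTOLOGY:
        return []
    source_type, target_type = ONTOLOGY[relation]
    source = (source_type, fields["user"])
    target = (target_type, fields["target"])
    return [(source, relation, target)]


def build_graph(lines):
    """Map extracted entities to a node set and relationships to an edge set."""
    nodes, edges = set(), set()
    for line in lines:
        for source, relation, target in extract_triples(line):
            nodes.update((source, target))
            edges.add((source, relation, target))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_graph(SAMPLE_LOGS)
    for source, relation, target in sorted(edges):
        print(f"{source} -[{relation}]-> {target}")
```

In this sketch the third sample line is dropped because `rebooted` is not in the toy ontology, and the fourth is dropped because it does not match the pattern. The design point it illustrates is that the costly model call is needed only once per log format, to produce the pattern, after which parsing is ordinary regular expression matching.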
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.