Patent · US Active

System and method using a large language model (LLM) and/or regular expressions for feature extractions from unstructured or semi-structured data to generate ontological graph

US12231456B2 · kind B2 · utility

1Cited by
2References
17Claims
0Family size

Assignee

Inventors

Key dates

Filing dateJul 28, 2023
Grant dateFeb 18, 2025
Priority date
Expiry dateJul 28, 2043

Classification

  • Technology area (CPC H)Electricity
  • CPC primaryH04L63/1491
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A system and method are provided for generating a cybersecurity behavioral graph from a log files and/or other telemetry data, which can be unstructured or semi-structured data. The log files are applied to a machine learning (ML) model (e.g., a large language model (LLM)) that generates/extract from the log files entities and relationships between said entities. The entities and relationships can be constrained using a cybersecurity ontology or schema to ensure that the results are meaningful to a cybersecurity context. A graph is then generated by mapping the extracted entities to nodes in the graph and the relationships to edges connecting nodes. To more efficiently extract the entities and relationships from the data file, an LLM is used to generate regular expressions for the format of the log files. Once generated, the regular expressions can rapidly parse the log files to extract the entities and relationships.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.