Computer system programmed to identify common subsequences in logs
US10664481B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 29, 2015 |
| Grant date | May 26, 2020 |
| Priority date | — |
| Expiry date | Jan 31, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/025
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A data processing method includes receiving a stream of digital data comprising a plurality of objects and, in response to receiving an object, tokenizing the object to create a tokenized object and storing the tokenized object in a token database. The method further includes comparing the tokenized object to a plurality of other tokenized objects stored in the token database, computing a pattern associated with the tokenized object, and storing the pattern in a pattern database. The size of the pattern database is managed by identifying a subset of patterns that are eligible for deletion based on the age of each pattern, ranking each pattern of the subset based on a quality metric and a popularity metric, identifying a second pattern from the subset based on the ranking, and deleting the second pattern from the pattern database to produce an updated database.
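The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name and design choice in it is an assumption made for illustration: whitespace tokenization, a longest-common-subsequence comparison (suggested by the title), a hypothetical `Pattern` record, a hypothetical `min_age` eligibility threshold, and a ranking score of `quality * popularity`. The abstract does not specify any of these details.

```python
from __future__ import annotations

from dataclasses import dataclass


def tokenize(line: str) -> tuple[str, ...]:
    """Split a log line on whitespace (assumed tokenizer)."""
    return tuple(line.split())


def common_subsequence(a: tuple[str, ...], b: tuple[str, ...]) -> tuple[str, ...]:
    """Return one longest common subsequence of two token sequences."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the LCS of a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table forward to recover the shared tokens.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return tuple(out)


@dataclass
class Pattern:
    """Hypothetical pattern record; field names are illustrative only."""
    tokens: tuple[str, ...]
    created: float    # timestamp at which the pattern was stored
    quality: float    # assumed quality metric, higher is better
    popularity: int   # assumed popularity metric, e.g. number of matches


def prune_one(patterns: list[Pattern], now: float, min_age: float) -> list[Pattern]:
    """Delete the lowest-ranked pattern among those old enough to be eligible."""
    eligible = [p for p in patterns if now - p.created >= min_age]
    if not eligible:
        return list(patterns)
    # Assumed ranking: product of the two metrics; the lowest score is deleted.
    victim = min(eligible, key=lambda p: p.quality * p.popularity)
    return [p for p in patterns if p is not victim]
```

For example, comparing the tokenized lines `user alice logged in from 10.0.0.1` and `user bob logged in from 10.0.0.2` yields the shared pattern `("user", "logged", "in", "from")`. In `prune_one`, recently created patterns are never candidates for deletion, which mirrors the abstract's age-based eligibility step before ranking.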
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.