Patent · US Active

Apparatus and method for identifying similarity via dynamic decimation of token sequence n-grams

US9111095B2 · kind B2 · utility

Cited by: 8
References: 5
Claims: 19
Family size: 0

Assignee

Inventor

Key dates

Filing date: Apr 9, 2014
Grant date: Aug 18, 2015
Priority date
Expiry date: Apr 19, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/284
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An apparatus for identifying related code variants or text samples includes processing circuitry configured to execute instructions for receiving query binary code, processing the query binary code to generate one or more query code fingerprints comprising compressed representations of respective functional components of the query binary code, generating token sequence n-grams of the fingerprints, hashing the n-grams, partitioning samples by length to compare selected samples based on length, and identifying similarity via dynamic decimation of token sequence n-grams.
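The pipeline named in the abstract (token sequence n-grams, hashing, partitioning by length, decimation) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not specify the decimation rule, so the sketch assumes a common "keep hashes divisible by a modulus" scheme in which the modulus grows with sample length. All function names, the `keep_bits` parameter, the power-of-two length buckets, and the Jaccard score are hypothetical choices made for illustration.

```python
import hashlib


def ngrams(tokens, n=3):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hash_ngram(gram):
    """Map an n-gram to a stable 64-bit integer."""
    data = "\x1f".join(gram).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def length_bucket(tokens):
    """Assumed partitioning rule: bucket samples by power-of-two length."""
    return len(tokens).bit_length()


def sketch(tokens, n=3, keep_bits=6):
    """Hash the n-grams, then decimate them.

    Assumed decimation rule: keep a hash only if it is divisible by a
    modulus that doubles with each length bucket, so longer samples
    retain a smaller fraction of their n-grams. Samples in the same
    bucket share a modulus, which keeps their sketches comparable.
    """
    bucket = length_bucket(tokens)
    modulus = 1 << max(0, bucket - keep_bits)
    hashes = {hash_ngram(g) for g in ngrams(tokens, n)}
    return bucket, {h for h in hashes if h % modulus == 0}


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets score 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_similar(query_tokens, corpus, threshold=0.5):
    """Score the query against corpus samples in the same length bucket."""
    q_bucket, q_sketch = sketch(query_tokens)
    results = []
    for name, tokens in corpus.items():
        bucket, sample_sketch = sketch(tokens)
        if bucket != q_bucket:
            continue  # skip samples of a different length class
        score = jaccard(q_sketch, sample_sketch)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda r: -r[1])


query = "push mov sub call add pop ret nop".split()
corpus = {
    "same": list(query),
    "variant": "push mov sub call add pop ret jmp".split(),
    "long": ["op%d" % i for i in range(40)],
}
matches = find_similar(query, corpus)
```

In this usage example, `same` and `variant` fall in the query's length bucket and are scored, while `long` is skipped before any comparison. Comparing only within a bucket is a simplification; a fuller design might also compare adjacent buckets after re-decimating the finer sketch to the coarser modulus.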

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.