Patent · US Active

Apparatus and method for identifying similarity via dynamic decimation of token sequence N-grams

US9910985B2 · kind B2 · utility

Cited by: 3
References: 7
Claims: 21
Family size: 0

Assignee

Inventor

Key dates

Filing date: Jun 30, 2015
Grant date: Mar 6, 2018
Priority date
Expiry date: Aug 6, 2035

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/284
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An apparatus for identifying related code variants or text samples includes processing circuitry configured to execute instructions for receiving query binary code, processing the query binary code to generate one or more query code fingerprints comprising compressed representations of respective functional components of the query binary code, generating token sequence n-grams of the fingerprints, hashing the n-grams, partitioning samples by length to compare selected samples based on length, and identifying similarity via dynamic decimation of token sequence n-grams.
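The pipeline named in the abstract (token sequence n-grams, hashing, length partitioning, decimation) can be illustrated with a small Python sketch. This is not the patented method: the abstract does not say how decimation is made dynamic, so the rule below (keep only hashes divisible by a factor that grows with sample length) is an assumption, as are the helper names `decimation_factor` and `length_bucket`, the `target` parameter, the power-of-two bucketing, the choice of BLAKE2b, and the use of Jaccard overlap as the similarity score.

```python
import hashlib

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def hash_ngram(gram):
    """Stable 64-bit hash of one n-gram (BLAKE2b is an arbitrary choice)."""
    data = "\x1f".join(gram).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def decimation_factor(num_tokens, target=64):
    """Hypothetical 'dynamic' rule: longer samples are decimated more heavily,
    so each sample retains roughly `target` hashes."""
    return max(1, num_tokens // target)

def length_bucket(num_tokens):
    """Hypothetical length partition: samples whose token counts have the same
    bit length (same power-of-two range) share a bucket."""
    return num_tokens.bit_length()

def fingerprint(tokens, n, factor):
    """Hash every n-gram and keep only hashes divisible by `factor`."""
    return {h for h in map(hash_ngram, ngrams(tokens, n)) if h % factor == 0}

def similarity(a, b, n=3):
    """Jaccard overlap of the decimated n-gram hash sets of two samples."""
    # Use one factor for both samples so the retained hashes are comparable.
    factor = decimation_factor(max(len(a), len(b)))
    fa, fb = fingerprint(a, n, factor), fingerprint(b, n, factor)
    if not fa and not fb:
        return 0.0  # nothing retained on either side; treated as no evidence
    return len(fa & fb) / len(fa | fb)
```

In a fuller system, `length_bucket` would be used to restrict comparisons to samples in the same or neighbouring buckets before `similarity` is called; that selection step is omitted here for brevity.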

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.