Patent · US Active

Generating vector representations of code capturing semantic similarity

US11238306B2 · kind B2 · utility

Cited by: 0
References: 4
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 27, 2018
Grant date: Feb 1, 2022
Priority date
Expiry date: Dec 3, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N20/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, system and computer program product for obtaining vector representations of code snippets capturing semantic similarity. A first and second training set of code snippets are collected, where the first training set of code snippets implements the same function representing semantic similarity and the second training set of code snippets implements a different function representing semantic dissimilarity. A vector representation of a first and second code snippet from either the first or second training set of code snippets is generated using a machine learning model. A loss value is generated utilizing a loss function that is proportional or inverse to the distance between the first and second vectors in response to receiving the first and second code snippets from the first or second training set of code snippets, respectively. The machine learning model is trained to capture the semantic similarity in the code snippets by minimizing the loss value.
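
The loss described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names `euclidean_distance` and `pair_loss`, the use of Euclidean distance, the specific inverse form `1 / (d + eps)`, and the toy vectors are all assumptions made for illustration; the abstract only states that the loss is proportional to the distance for snippets implementing the same function and inverse to it for snippets implementing different functions.

```python
import math


def euclidean_distance(u, v):
    """Distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_loss(u, v, same_function, eps=1e-6):
    """Hypothetical pairwise loss over two snippet vectors.

    same_function=True  -> loss grows with distance (pulls vectors together).
    same_function=False -> loss shrinks with distance (pushes vectors apart).
    eps is an assumed constant that avoids division by zero.
    """
    d = euclidean_distance(u, v)
    if same_function:
        return d
    return 1.0 / (d + eps)


# Toy vectors standing in for the model's output on two code snippets.
near = ([0.0, 1.0], [0.1, 1.0])
far = ([0.0, 1.0], [3.0, 5.0])

if __name__ == "__main__":
    print("similar pair, near:   ", pair_loss(*near, same_function=True))
    print("similar pair, far:    ", pair_loss(*far, same_function=True))
    print("dissimilar pair, near:", pair_loss(*near, same_function=False))
    print("dissimilar pair, far: ", pair_loss(*far, same_function=False))
```

Under these assumptions, minimizing the loss moves vectors of same-function snippets closer together and vectors of different-function snippets farther apart, which is the training objective the abstract describes. In a real system the vectors would come from a trainable model and the loss would be minimized by gradient descent.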

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.