Patent · US Active

Representing source code in vector space to detect errors

US11334467B2 · kind B2 · utility

0Cited by
5References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMay 3, 2019
Grant dateMay 17, 2022
Priority date
Expiry dateFeb 4, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/0895
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A computer-implemented method, system and computer program product for representing source code in vector space. The source code is parsed into an abstract syntax tree, which is then traversed to produce a sequence of tokens. Token embeddings may then be constructed for a subset of the sequence of tokens, which are inputted into an encoder artificial neural network (“encoder”) for encoding the token embeddings. A decoder artificial neural network (“decoder”) is initialized with a final internal cell state of the encoder. The decoder is run the same number of steps as the encoding performed by the encoder. After running the decoder and completing the training of the decoder to learn the inputted token embeddings, the final internal cell state of the encoder is used as the code representation vector which may be used to detect errors in the source code.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.