Patent · US Active

Pre-training molecule embedding GNNs using contrastive learning based on scaffolding

US12327616B2 · kind B2 · utility

Cited by: 0
References: 8
Claims: 8
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 30, 2022
Grant date: Jun 10, 2025
Priority date
Expiry date: Feb 24, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G16C20/70
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are provided for generating a training dataset for training a molecule embedding module using contrastive learning, wherein the definition of similarity is based on molecular scaffold similarity. For example, systems access a molecular dataset and separate the molecular dataset into positive samples and negative samples. Systems then generate a training dataset comprising the positive samples and negative samples. Systems and methods are also provided for using the trained molecule embedding module to generate molecule embeddings and for building an end-to-end machine learning model configured to perform molecular embedding analysis and molecular property prediction, the model comprising the trained molecule embedding module and a property prediction module.
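
The dataset-generation step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not state how positive and negative samples are selected, so the sketch assumes that two molecules form a positive pair when they share an identical scaffold and a negative pair otherwise. The function name `build_contrastive_pairs`, the `negatives_per_positive` parameter, and the hand-written scaffold strings are all hypothetical; in practice the scaffolds might be computed with a cheminformatics toolkit (for example, Bemis-Murcko scaffolds from RDKit).

```python
import random
from collections import defaultdict
from itertools import combinations

# Hypothetical toy dataset: (molecule SMILES, scaffold SMILES).
# The scaffold strings are written by hand for illustration only.
MOLECULES = [
    ("Cc1ccccc1", "c1ccccc1"),   # toluene, benzene scaffold
    ("Oc1ccccc1", "c1ccccc1"),   # phenol, benzene scaffold
    ("Cc1ccccn1", "c1ccncc1"),   # 2-methylpyridine, pyridine scaffold
    ("Oc1cccnc1", "c1ccncc1"),   # 3-hydroxypyridine, pyridine scaffold
]


def build_contrastive_pairs(molecules, negatives_per_positive=1, seed=0):
    """Return (anchor, other, label) triples; label 1 = positive, 0 = negative.

    Assumption: "similar" means "same scaffold string". The patent abstract
    only says similarity is based on molecular scaffold similarity.
    """
    rng = random.Random(seed)

    # Group molecules by their scaffold.
    by_scaffold = defaultdict(list)
    for smiles, scaffold in molecules:
        by_scaffold[scaffold].append(smiles)

    pairs = []
    for scaffold, members in by_scaffold.items():
        # Candidate negatives: every molecule with a different scaffold.
        others = [
            smiles
            for other_scaffold, group in by_scaffold.items()
            if other_scaffold != scaffold
            for smiles in group
        ]
        for anchor, positive in combinations(members, 2):
            pairs.append((anchor, positive, 1))
            # Sample negatives for the same anchor, if any exist.
            for _ in range(negatives_per_positive):
                if others:
                    pairs.append((anchor, rng.choice(others), 0))
    return pairs


training_pairs = build_contrastive_pairs(MOLECULES)
```

Each triple in `training_pairs` could then be fed to a contrastive loss that pulls the embeddings of positive pairs together and pushes those of negative pairs apart. The choice of loss and of the GNN encoder is left open here, since the abstract does not specify either.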

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.