Pre-training molecule embedding GNNs using contrastive learning based on scaffolding
US12327616B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 30, 2022 |
| Grant date | Jun 10, 2025 |
| Priority date | — |
| Expiry date | Feb 24, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G16C20/70
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods are provided for generating a training dataset for training a molecule embedding module using contrastive learning, wherein the definition of similarity is based on molecular scaffold similarity. For example, systems access a molecular dataset and separate the molecular dataset into positive samples and negative samples. Systems then generate a training dataset comprising the positive samples and negative samples. Systems and methods are also provided for using the trained molecule embedding module to generate molecule embeddings and for building an end-to-end machine learning model configured to perform molecular embedding analysis and molecular property prediction, the model comprising the trained molecule embedding module and a property prediction module.
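The dataset-generation step described in the abstract can be illustrated with a minimal sketch. It is not the patented method. It assumes each molecule already carries a precomputed scaffold key (for example a Murcko scaffold string from a cheminformatics toolkit), and it treats two molecules as similar only when their scaffold keys are identical, which is a simplification of "scaffold similarity". The function name `build_contrastive_pairs`, the `negatives_per_anchor` parameter, and the toy scaffold labels are all hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_contrastive_pairs(molecules, negatives_per_anchor=1, seed=0):
    """Return (mol_a, mol_b, label) triples; label 1 = positive, 0 = negative.

    `molecules` is a list of (molecule_id, scaffold_key) tuples. The scaffold
    key is assumed to be precomputed; exact key equality stands in for
    scaffold similarity (an assumption of this sketch).
    """
    rng = random.Random(seed)

    # Group molecule ids by their scaffold key.
    by_scaffold = defaultdict(list)
    for mol_id, scaffold in molecules:
        by_scaffold[scaffold].append(mol_id)

    pairs = []
    for scaffold, members in by_scaffold.items():
        # Positive samples: every pair of molecules sharing this scaffold.
        for a, b in combinations(members, 2):
            pairs.append((a, b, 1))

        # Negative samples: pair each member with molecules drawn at random
        # from groups that have a different scaffold.
        others = [m for s, ms in by_scaffold.items() if s != scaffold for m in ms]
        if not others:
            continue
        k = min(negatives_per_anchor, len(others))
        for a in members:
            for b in rng.sample(others, k):
                pairs.append((a, b, 0))
    return pairs


# Toy input with illustrative scaffold labels (not real toolkit output).
toy = [
    ("aspirin", "benzene"),
    ("toluene", "benzene"),
    ("nicotinamide", "pyridine"),
    ("isoniazid", "pyridine"),
]
pairs = build_contrastive_pairs(toy, negatives_per_anchor=1, seed=0)
for a, b, label in pairs:
    print(a, b, label)
```

The resulting triples could then feed a contrastive loss when pre-training an embedding model; how the patent weights, samples, or batches these pairs is not stated in the abstract and is left out here.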
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.