3D-Transformer: Molecular Representation with Transformer in 3D Space

Last update: Dec 19, 2022

Overview

Introduction

This is the repository for 3D-Transformer.

Dataset

Quantum Chemistry

QM7 Dataset
Download (Official Website): http://quantum-machine.org/datasets/
Description (DeepChem): https://deepchem.readthedocs.io/en/latest/api_reference/moleculenet.html#qm7-datasets

QM8 Dataset
Download (DeepChem): https://github.com/deepchem/deepchem/blob/master/deepchem/molnet/load_function/qm8_datasets.py
Description (DeepChem): https://deepchem.readthedocs.io/en/latest/api_reference/moleculenet.html?highlight=qm7#qm8-datasets

QM9 Dataset
Download (Atom3D): https://www.atom3d.ai/smp.html
Download (Official Website): https://ndownloader.figshare.com/files/3195389
Download (MPNN Supplement): https://drive.google.com/file/d/0Bzn36Iqm8hZscHFJcVh5aC1mZFU/view?resourcekey=0-86oyPL3e3l2ZTiRpwtPDBg
Download (Schnet): https://schnetpack.readthedocs.io/en/stable/tutorials/tutorial_02_qm9.html#Loading-the-data

GEOM-QM9 Dataset
Download (Official Website): https://doi.org/10.7910/DVN/JNGTDF
Usage tutorial: https://github.com/learningmatter-mit/geom/blob/master/tutorials/01_loading_data.ipynb

Material Science

COREMOF
Download (Google Drive): https://drive.google.com/drive/folders/1DMmjL-JNgUWQDU-52_DT_cX-XWNEEi-W?usp=sharing
Reproduction of PointNet++: python coremof/reproduce/main_pn_coremof.py
Reproduction of MPNN: python coremof/reproduce/main_mpnn_coremof.py
Reproduction of SchNet:
(1) load COREMOF: python coremof/reproduce/main_sch_coremof.py
(2) run SchNet: spk_run.py train schnet custom ../../coremof.db ./coremof --split 900 100 --property LCD --features 16 --batch_size 20 --cuda
(Note: the official SchNet script could not be reproduced successfully because of memory limitations.)
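
The SchNet training command above can also be assembled programmatically, which makes it easier to vary the split or batch size between runs. The following is a minimal sketch, assuming spk_run.py is on the PATH and accepts exactly the flags shown in the command above; the helper name build_schnet_command is hypothetical and not part of this repository.

```python
import shlex

def build_schnet_command(db_path="../../coremof.db", model_dir="./coremof",
                         n_train=900, n_val=100, prop="LCD",
                         features=16, batch_size=20, cuda=True):
    """Assemble the spk_run.py argument list (hypothetical helper).

    Defaults mirror the command given in this README.
    """
    cmd = ["spk_run.py", "train", "schnet", "custom", db_path, model_dir,
           "--split", str(n_train), str(n_val),
           "--property", prop,
           "--features", str(features),
           "--batch_size", str(batch_size)]
    if cuda:
        cmd.append("--cuda")
    return cmd

cmd = build_schnet_command()
print(shlex.join(cmd))
# To launch the run: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems if the database path contains spaces.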

Protein

PDBbind
Atom3D: https://github.com/drorlab/atom3d
(1) download the 'split-by-sequence-identity-30' dataset from https://www.atom3d.ai/
(2) install Atom3D: pip install atom3d
(3) preprocess the data: python pdbbind/dataloader_pdb.py

Models

models/tr_spe: 3D-Transformer with Sinusoidal Position Encoding (SPE)
models/tr_cpe: 3D-Transformer with Convolutional Position Encoding (CPE)
models/tr_msa: 3D-Transformer with Multi-scale Self-attention (MSA)
models/tr_afps: 3D-Transformer with Attentive Farthest Point Sampling (AFPS)
models/tr_full: 3D-Transformer with CPE + MSA + AFPS
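
When running ablations across these variants, it can help to keep the list above in machine-readable form. The following is a small sketch that only restates the list; the names VARIANTS and variants_with are hypothetical and do not exist in the repository.

```python
# Components enabled by each model variant, as listed in this README.
VARIANTS = {
    "tr_spe": {"SPE"},
    "tr_cpe": {"CPE"},
    "tr_msa": {"MSA"},
    "tr_afps": {"AFPS"},
    "tr_full": {"CPE", "MSA", "AFPS"},
}

def variants_with(component):
    """Return the sorted names of variants that include the given component."""
    return sorted(name for name, parts in VARIANTS.items() if component in parts)

print(variants_with("MSA"))  # ['tr_full', 'tr_msa']
```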

Comments

type of dist_bar in tr_msa
Hi,

I am having difficulty calling the build_model function from the tr_msa module. Can you explain what the dist_bar parameter means and how it should be constructed?

Assuming it is a list of distances between the atoms in pos from your example, I do the following:

def dist(l, r):
    return ((l[0]-r[0])**2 + (l[1]-r[1])**2 + (l[2]-r[2])**2)**0.5

dst = [[dist(l, r) for r in data] for l in data]

where data is a 4x3 List[List[float]] of coordinates from the pos tensor.

Then:

from model.tr_msa import build_model
model = build_model(N, n, dst).cuda()

The returned model accepts the third argument dist as a tensor, not a list, so I do the following:

tens_dist = torch.tensor(dst).cuda()
out = model(x, mask, tens_dist)

where x and mask are as in the example code. The model computation fails with the error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [8, 1, 1, 1], but got 3-dimensional input of size [4, 1, 4] instead
opened by SimonTsirikov 5
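
The error in this issue reports a 3-dimensional input of size [4, 1, 4] where a 4-dimensional one was expected, which is consistent with the distance matrix lacking a leading batch dimension. This is an inference from the error message, not a confirmed fix. The sketch below builds a pairwise distance matrix with the standard library and wraps it so that it has shape (1, n, n); the coordinates are made up for illustration and are not taken from the repository example.

```python
import math

def pairwise_distances(coords):
    """Return the n x n matrix of Euclidean distances between 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

# Four hypothetical atom positions (illustrative values only).
pos = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

dist = pairwise_distances(pos)  # nested list of shape (4, 4)
batched = [dist]                # shape (1, 4, 4): one molecule in the batch

print(len(batched), len(batched[0]), len(batched[0][0]))  # 1 4 4
# With PyTorch, torch.tensor(batched) would yield a tensor of shape (1, 4, 4).
```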

incompatible matrix sizes when using tr_all

Hi,

With the updated version I can successfully launch tr_spe, tr_msa, and tr_afps separately, but not tr_all:

Error log

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-115-397097d7905a> in <module>()
      3 
      4
----> 5 out = model(x.long(), mask, dist)
      6 
      7 print(out)

8 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/content/model/tr_all.py in forward(self, src, src_mask, dist)
     29 
     30     def forward(self, src, src_mask, dist):
---> 31         return self.generator(self.encoder(self.src_embed(src), dist, src_mask)[:, 0])

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/content/model/tr_spe.py in forward(self, x)
    269 
    270     def forward(self, x):
--> 271         return self.proj(x)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py in forward(self, input)
    139     def forward(self, input):
    140         for module in self:
--> 141             input = module(input)
    142         return input
    143 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    101 
    102     def forward(self, input: Tensor) -> Tensor:
--> 103         return F.linear(input, self.weight, self.bias)
    104 
    105     def extra_repr(self) -> str:

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1846     if has_torch_function_variadic(input, weight, bias):
   1847         return handle_torch_function(linear, (input, weight, bias), input, weight, bias=bias)
-> 1848     return torch._C._nn.linear(input, weight, bias)
   1849 
   1850 

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x1 and 512x512)

I tried to investigate the cause and found that tr_afps, and consequently tr_all, have a slightly different final part of the encoder layers than the others (obtained with print(model)):

AFPS

(norm): LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
(sublayer): SublayerConnection(
  (norm): LayerNorm()
  (dropout): Dropout(p=0.1, inplace=False)
)

others

(sublayer): ModuleList(
  (0): SublayerConnection(
    (norm): LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (1): SublayerConnection(
    (norm): LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
)

I don't know whether this really matters or how to fix it. The problem could be elsewhere, but I suspect there is an issue in the connection between modules.

opened by SimonTsirikov 1

about Molformer

Hello, thanks for your hard work. I am reading your new work on the Molformer project, but I cannot access the project at https://github.com/smiles724/Molformer.

opened by JWSunny 1

replicated script request

Do you plan to share a reproduction script as PA-Graph-Transformer did? For example, a train.py script for the QM7 dataset that achieves around your best score of 43.9.

opened by veya2ztn 1

Owner

NLP learner and researcher, a master's student at Columbia University
