Official implementation of "Generating 3D Molecules for Target Protein Binding"

DIVE Lab, Texas A&M University

Last update: Dec 7, 2022

Overview

Generating 3D Molecules for Target Protein Binding

This is the official implementation of the GraphBP method proposed in the following paper.

Meng Liu, Youzhi Luo, Kanji Uchino, Koji Maruhashi, and Shuiwang Ji. "Generating 3D Molecules for Target Protein Binding".

Requirements

We list key dependencies below, with the versions we used in parentheses. Our detailed environment setup is available in environment.yml.

PyTorch (1.9.0)
PyTorch Geometric (1.7.2)
rdkit-pypi (2021.9.3)
biopython (1.79)
openbabel (3.3.1)
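
To compare an existing environment against the versions listed above, a minimal sketch along these lines may help. It is not part of the repository. The distribution names passed to importlib.metadata (torch, torch-geometric, and so on) are assumptions about how the packages were installed; a conda install of openbabel, for example, may not be registered under that name.

```python
from importlib import metadata

# Versions listed in this README. The keys are assumed pip distribution
# names; adjust them if your packages were installed under other names.
EXPECTED = {
    "torch": "1.9.0",
    "torch-geometric": "1.7.2",
    "rdkit-pypi": "2021.9.3",
    "biopython": "1.79",
    "openbabel": "3.3.1",
}


def check_versions(expected):
    """Return {name: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # not installed, or installed under another name
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(EXPECTED).items():
        # Prefix match so that local build tags after the version still count.
        ok = installed is not None and installed.startswith(wanted)
        print(f"{name:16s} expected {wanted:10s} installed {installed} "
              f"[{'ok' if ok else 'check'}]")
```

A "check" status only means the installed version differs from the one we used or could not be found; it does not by itself mean the code will fail.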

Preparing Data

Download and extract the CrossDocked2020 dataset:

wget https://bits.csb.pitt.edu/files/crossdock2020/CrossDocked2020_v1.1.tgz -P data/crossdock2020/
tar -C data/crossdock2020/ -xzf data/crossdock2020/CrossDocked2020_v1.1.tgz
wget https://bits.csb.pitt.edu/files/it2_tt_0_lowrmsd_mols_train0_fixed.types -P data/crossdock2020/
wget https://bits.csb.pitt.edu/files/it2_tt_0_lowrmsd_mols_test0_fixed.types -P data/crossdock2020/

Note: (1) Extracting the archive can take a long time; it is much faster on an SSD. (2) Several samples in the training set cannot be processed by our code. Hence, we recommend replacing the it2_tt_0_lowrmsd_mols_train0_fixed.types file with a new one from which these samples have been removed. The new file is available here.

Split data files:

python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_train0_fixed.types data/crossdock2020
python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_test0_fixed.types data/crossdock2020
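
Each line of a .types file points to one receptor/ligand pair. The sketch below parses such a line and reports referenced files that are missing on disk, which may help when diagnosing data preparation problems. It is illustrative only and not part of the repository: the field names (label, affinity, rmsd, vina_score) are assumptions about what the columns mean, the helper names are hypothetical, and the directory against which the relative paths resolve is assumed to be the extracted dataset root.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class TypesEntry:
    # Field names are assumed column meanings, not taken from the repository.
    label: int
    affinity: float
    rmsd: float
    receptor: str
    ligand: str
    vina_score: Optional[float]


def parse_types_line(line: str) -> TypesEntry:
    """Parse one whitespace-separated .types line with an optional '#score' tail."""
    body, _, comment = line.partition("#")
    fields = body.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields before '#', got {len(fields)}: {line!r}")
    score = float(comment) if comment.strip() else None
    return TypesEntry(int(fields[0]), float(fields[1]), float(fields[2]),
                      fields[3], fields[4], score)


def missing_files(entry: TypesEntry, root: Path) -> list:
    """Return the receptor/ligand paths of an entry that do not exist under root."""
    candidates = (root / entry.receptor, root / entry.ligand)
    return [p for p in candidates if not p.exists()]
```

For example, parse_types_line applied to a line such as "1 0 1.90832 1A1C_MALDO_2_433_0/1m7y_A_rec.pdb 1A1C_MALDO_2_433_0/1m7y_ppg_uff2.sdf.gz #-4.20313" yields the receptor and ligand paths, which missing_files can then check against the dataset directory.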

Run

Train GraphBP from scratch:

CUDA_VISIBLE_DEVICES=${your_gpu_id} python main.py

Note: GraphBP can be trained on a 48GB GPU with batchsize=16. Our trained model is available here.

Generate atoms in the 3D space with the trained model:

CUDA_VISIBLE_DEVICES=${your_gpu_id} python main_gen.py

Postprocess and then save the generated molecules:

CUDA_VISIBLE_DEVICES=${your_gpu_id} python main_eval.py
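
The three commands above can also be chained from Python. The following is a sketch under assumptions, not a script shipped with the repository: it assumes it is run from the repository root, that a failing stage should stop the pipeline, and that the function names (gpu_env, run_pipeline) are hypothetical helpers introduced here.

```python
import os
import subprocess
import sys

# The three stages from this README, in order.
STAGES = ["main.py", "main_gen.py", "main_eval.py"]


def gpu_env(gpu_id, base_env=None):
    """Copy the environment and pin the visible GPU, like CUDA_VISIBLE_DEVICES=<id>."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_pipeline(gpu_id, stages=STAGES):
    """Run each stage in order; check=True raises CalledProcessError on failure."""
    env = gpu_env(gpu_id)
    for script in stages:
        # Use the current interpreter so the active environment is reused.
        subprocess.run([sys.executable, script], env=env, check=True)
```

Calling run_pipeline(0) would train, generate, and postprocess on GPU 0. To use the provided trained model instead of training from scratch, pass stages=["main_gen.py", "main_eval.py"].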

Reference

@article{liu2022graphbp,
      title={Generating 3D Molecules for Target Protein Binding},
      author={Meng Liu and Youzhi Luo and Kanji Uchino and Koji Maruhashi and Shuiwang Ji},
      journal={arXiv preprint arXiv:2204.09410},
      year={2022},
}

Comments

Data preparation BUG

Dear authors, we downloaded the new train0_fixed.types file and replaced the old one, but running "python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_train0_fixed.types data/crossdock2020" still produces the bug shown in the screenshot. How can we solve this? Thank you very much for your help!

opened by JackAILab 4

FileNotFoundError: No such file or directory: './data/crossdock2020/selected_test_targets.types'

Hi, thanks for your work! When I run main_gen.py, the following error message is displayed:

nohup: ignoring input
Epoch: 33
Traceback (most recent call last):
  File "main_gen.py", line 30, in <module>
    all_mol_dicts = runner.generate(num_gen, temperature=[node_temp, dist_temp, angle_temp, torsion_temp], max_atoms=max_atoms, min_atoms=min_atoms, focus_th=focus_th, contact_th=contact_th, add_final=True, known_binding_site=known_binding_site)
  File "/public/thw/GraphBP/GraphBP/runner.py", line 120, in generate
    data_lines = pd.read_csv(
  File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 575, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 933, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1217, in _make_engine
    self.handles = get_handle(  # type: ignore[call-overload]
  File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/common.py", line 789, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: './data/crossdock2020/selected_test_targets.types'

Would you kindly help me solve this problem? Thanks!

opened by Hovi123123 4

RuntimeError: CUDA error: the provided PTX was compiled with an unsupported toolchain.

Hi, I ran into the following error while running: CUDA_VISIBLE_DEVICES=0 python main_gen.py

Traceback (most recent call last):
  File "main_gen.py", line 6, in <module>
    runner = Runner(conf)
  File "/home/yipyewmun/GitHub/GraphBP/GraphBP/runner.py", line 25, in __init__
    self.model = GraphBP(**conf['model'])
  File "/home/yipyewmun/GitHub/GraphBP/GraphBP/model/graphbp.py", line 40, in __init__
    self.feat_net = self.feat_net.to('cuda')
  File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 852, in to
    return self._apply(convert)
  File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
    module._apply(fn)
  File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 552, in _apply
    param_applied = fn(param)
  File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 850, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Any possible ideas on how I can resolve this? :)

opened by yipy0005 2

How many training and test samples are there, and why is there no comparison with other SOTA models?

Hi,

I noticed that, according to your paper, the dataset consists of 500K samples after filtering. However, the it2_tt_0_lowrmsd_mols_train0_fixed.types file includes more than 480K samples, while the it2_tt_0_lowrmsd_mols_test0_fixed.types file includes more than 230K samples.

Following LiGAN, you selected 10 target proteins for test evaluation, but used 90 protein-ligand pairs in the test set as references. Does this mean that the 100 samples generated for each target protein are compared with these pairs?

Could you please tell me how many training samples were used to obtain the results in your paper?

Besides, why did you not compare the model with Luo's model (Luo, S., Guan, J., Ma, J., and Peng, J. A 3d generative model for structure-based drug design. In Thirty-Fifth Conference on Neural Information Processing Systems, 2021a), which you mentioned in the related works?

Thank you very much!

opened by Layne-Huang 1

Code for evaluation

Hi, thanks for the awesome work. I did not find the code to evaluate the generated molecules, e.g., the code to compute the affinity. Would you kindly provide this part of code?

opened by zaixizhang 1

The initial ligand

Nice paper! Could you tell me how to add the atoms and atomic coordinates of an initial ligand to the protein pocket sample? In SMILES seq2seq models, we give a sequence prompt such as CCCC; in this work, we would give the initial ligand as atoms and xyz coordinates.

opened by 1121091694 1

Very strange result

Thanks for the great work and sharing the code.

I ran the code according to the instructions, using the provided train_epoch33 model. To test it on a single pocket, I picked one line from the train file and put it into test_selected_targets.type, as below:

1 0 1.90832 1A1C_MALDO_2_433_0/1m7y_A_rec.pdb 1A1C_MALDO_2_433_0/1m7y_ppg_uff2.sdf.gz #-4.20313

I then ran main_gen.py followed by main_eval.py. However, main_eval.py raises an exception for every generated molecule at this line in bond_adder:

Chem.SanitizeMol(rd_mol,Chem.SANITIZE_ALL^Chem.SANITIZE_KEKULIZE)

So I changed it to:

Chem.SanitizeMol(rd_mol)

After that, everything runs smoothly.

However, the results look really strange. I can post some here for your reference.

Did I do something wrong in the steps described above? Looking forward to your reply.

opened by simmed00 7
