Official implementation of "Generating 3D Molecules for Target Protein Binding"

Overview

Generating 3D Molecules for Target Protein Binding

This is the official implementation of the GraphBP method proposed in the following paper.

Meng Liu, Youzhi Luo, Kanji Uchino, Koji Maruhashi, and Shuiwang Ji. "Generating 3D Molecules for Target Protein Binding".

Requirements

The key dependencies are listed below, with the versions we used in parentheses. Our detailed environment setup is available in environment.yml.

  • PyTorch (1.9.0)
  • PyTorch Geometric (1.7.2)
  • rdkit-pypi (2021.9.3)
  • biopython (1.79)
  • openbabel (3.3.1)
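
As a quick sanity check before training, the installed versions can be compared against the list above. The following is only a minimal sketch using the standard library. The distribution names in EXPECTED (for example "torch" and "torch-geometric") are assumptions about how each package is registered in your environment, and check_versions is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions copied from the list above. The keys are ASSUMED distribution
# names; conda or source installs may register the packages differently.
EXPECTED = {
    "torch": "1.9.0",
    "torch-geometric": "1.7.2",
    "rdkit-pypi": "2021.9.3",
    "biopython": "1.79",
    "openbabel": "3.3.1",
}


def check_versions(expected):
    """Return {name: (wanted, found_or_None, status)} for each dependency."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            report[name] = (wanted, None, "missing")
            continue
        # Accept local build tags such as "1.9.0+cu111" as a match.
        same = found == wanted or found.startswith(wanted + "+")
        report[name] = (wanted, found, "ok" if same else "differs")
    return report
```

Calling check_versions(EXPECTED) and printing the result shows which packages are missing or differ from the versions we used. A "differs" status is not necessarily a problem; environment.yml remains the reference setup.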

Preparing Data

  • Download and extract the CrossDocked2020 dataset:
wget https://bits.csb.pitt.edu/files/crossdock2020/CrossDocked2020_v1.1.tgz -P data/crossdock2020/
tar -C data/crossdock2020/ -xzf data/crossdock2020/CrossDocked2020_v1.1.tgz
wget https://bits.csb.pitt.edu/files/it2_tt_0_lowrmsd_mols_train0_fixed.types -P data/crossdock2020/
wget https://bits.csb.pitt.edu/files/it2_tt_0_lowrmsd_mols_test0_fixed.types -P data/crossdock2020/

Note: (1) Extracting the archive can take a long time; it is much faster on an SSD. (2) Several samples in the training set cannot be processed by our code. Hence, we recommend replacing the it2_tt_0_lowrmsd_mols_train0_fixed.types file with a new one in which these samples are deleted. The new one is available here.

  • Split data files:
python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_train0_fixed.types data/crossdock2020
python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_test0_fixed.types data/crossdock2020
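
To inspect a .types file before splitting, each line can be parsed into its columns. The sketch below is not taken from scripts/split_sdf.py. It assumes the layout of the example line quoted in the comments (three numbers, a receptor path, a ligand path, and a trailing "#" value), and the field names label, affinity, rmsd and vina_score are assumptions about what those columns mean.

```python
from typing import NamedTuple, Optional


class TypesRecord(NamedTuple):
    # Field names are ASSUMED meanings for the columns of a .types line.
    label: int
    affinity: float
    rmsd: float
    receptor: str
    ligand: str
    vina_score: Optional[float]


def parse_types_line(line):
    """Parse one whitespace-separated .types line into a TypesRecord."""
    # Everything after '#' is treated as a trailing numeric annotation.
    body, _, comment = line.partition("#")
    fields = body.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns before '#', got {len(fields)}")
    label, affinity, rmsd, receptor, ligand = fields
    comment = comment.strip()
    return TypesRecord(
        label=int(label),
        affinity=float(affinity),
        rmsd=float(rmsd),
        receptor=receptor,
        ligand=ligand,
        vina_score=float(comment) if comment else None,
    )
```

For example, parse_types_line("1 0 1.90832 1A1C_MALDO_2_433_0/1m7y_A_rec.pdb 1A1C_MALDO_2_433_0/1m7y_ppg_uff2.sdf.gz #-4.20313") yields a record whose receptor and ligand paths are relative to the extracted dataset directory, which makes it easy to check that the referenced files exist.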

Run

  • Train GraphBP from scratch:
CUDA_VISIBLE_DEVICES=${your_gpu_id} python main.py

Note: GraphBP can be trained on a 48GB GPU with a batch size of 16. Our trained model is available here.

  • Generate atoms in the 3D space with the trained model:
CUDA_VISIBLE_DEVICES=${your_gpu_id} python main_gen.py
  • Postprocess and then save the generated molecules:
CUDA_VISIBLE_DEVICES=${your_gpu_id} python main_eval.py
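
The three commands above can also be chained from Python so that a failure in one stage stops the later ones. This is a rough sketch, not part of the repository: build_env and run_stages are hypothetical helper names, and it assumes the scripts are run from the repository root with no extra arguments.

```python
import os
import subprocess
import sys

# Script names are taken from the commands above, in the order they are run.
STAGES = ["main.py", "main_gen.py", "main_eval.py"]


def build_env(gpu_id, base_env=None):
    """Copy the environment and pin the visible GPU, as the shell prefix does."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_stages(gpu_id, stages=STAGES, runner=subprocess.run):
    """Run each stage in order with the same GPU setting."""
    env = build_env(gpu_id)
    for script in stages:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later stages are skipped if an earlier one fails.
        runner([sys.executable, script], env=env, check=True)
```

Calling run_stages(0) trains, generates and postprocesses on GPU 0. To reuse the provided trained model, pass stages=["main_gen.py", "main_eval.py"] to skip training.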

Reference

@article{liu2022graphbp,
      title={Generating 3D Molecules for Target Protein Binding},
      author={Meng Liu and Youzhi Luo and Kanji Uchino and Koji Maruhashi and Shuiwang Ji},
      journal={arXiv preprint arXiv:2204.09410},
      year={2022},
}
Comments
  • Data preparation BUG

    Dear authors, we downloaded the new train0_fixed.types file and replaced the old one with it, but when we run "python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_train0_fixed.types data/crossdock2020" we still hit the bug shown in the screenshot. How can we solve this? Thank you very much for any help!

    opened by JackAILab 4
  • FileNotFoundError: No such file or directory: './data/crossdock2020/selected_test_targets.types'

    Hi, thanks for your work! When I run main_gen.py, the following error message is displayed:

    nohup: ignoring input
    Epoch: 33
    Traceback (most recent call last):
      File "main_gen.py", line 30, in <module>
        all_mol_dicts = runner.generate(num_gen, temperature=[node_temp, dist_temp, angle_temp, torsion_temp], max_atoms=max_atoms, min_atoms=min_atoms, focus_th=focus_th, contact_th=contact_th, add_final=True, known_binding_site=known_binding_site)
      File "/public/thw/GraphBP/GraphBP/runner.py", line 120, in generate
        data_lines = pd.read_csv(
      File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
        return func(*args, **kwargs)
      File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
        return _read(filepath_or_buffer, kwds)
      File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 575, in _read
        parser = TextFileReader(filepath_or_buffer, **kwds)
      File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 933, in __init__
        self._engine = self._make_engine(f, self.engine)
      File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1217, in _make_engine
        self.handles = get_handle(  # type: ignore[call-overload]
      File "/home/anaconda/software/envs/GraphBP/lib/python3.8/site-packages/pandas/io/common.py", line 789, in get_handle
        handle = open(
    FileNotFoundError: [Errno 2] No such file or directory: './data/crossdock2020/selected_test_targets.types'

    Would you kindly help me solve this problem? Thanks!

    opened by Hovi123123 4
  • RuntimeError: CUDA error: the provided PTX was compiled with an unsupported toolchain.

    Hi, I ran into the following error while running: CUDA_VISIBLE_DEVICES=0 python main_gen.py

    Traceback (most recent call last):
      File "main_gen.py", line 6, in <module>
        runner = Runner(conf)
      File "/home/yipyewmun/GitHub/GraphBP/GraphBP/runner.py", line 25, in __init__
        self.model = GraphBP(**conf['model'])
      File "/home/yipyewmun/GitHub/GraphBP/GraphBP/model/graphbp.py", line 40, in __init__
        self.feat_net = self.feat_net.to('cuda')
      File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 852, in to
        return self._apply(convert)
      File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
        module._apply(fn)
      File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 552, in _apply
        param_applied = fn(param)
      File "/home/yipyewmun/anaconda3/envs/gen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 850, in convert
        return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    RuntimeError: CUDA error: the provided PTX was compiled with an unsupported toolchain.
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    

    Any possible ideas on how I can resolve this? :)

    opened by yipy0005 2
  • How many training and test samples are there, and why is there no comparison with other SOTA models?

    Hi,

    According to your paper, the dataset consists of 500K samples after filtering. However, the it2_tt_0_lowrmsd_mols_train0_fixed.types file includes more than 480K samples, while the it2_tt_0_lowrmsd_mols_test0_fixed.types file includes more than 230K samples.

    Following LiGAN, you selected 10 target proteins for test evaluation, but used 90 protein-ligand pairs in the test set as the reference. Does this mean that the 100 samples generated for each target protein are compared with these pairs?

    Could you please tell me how many training samples are needed to obtain the results in your paper?

    Besides, why did you not compare the model with Luo's model (Luo, S., Guan, J., Ma, J., and Peng, J. A 3d generative model for structure-based drug design. In Thirty-Fifth Conference on Neural Information Processing Systems, 2021a), which you mentioned in the related works?

    Thank you very much!

    opened by Layne-Huang 1
  • Code for evaluation

    Hi, thanks for the awesome work. I did not find the code to evaluate the generated molecules, e.g., the code to compute the affinity. Would you kindly provide this part of code?

    opened by zaixizhang 1
  • The initial ligand

    Nice paper! Could you tell me how to supply the atoms and atomic coordinates of an initial ligand when sampling for a protein pocket? In a SMILES seq2seq model, we can give a sequence prompt such as CCCC. Here, we would like to give an initial ligand as atom types and xyz coordinates.

    opened by 1121091694 1
  • Very strange result

    Thanks for the great work and sharing the code.

    I ran it according to the instructions, using the provided train_epoch33 model. I tested it on one pocket: I picked a line from the train file and put it into test_selected_targets.type, as below:

        1 0 1.90832 1A1C_MALDO_2_433_0/1m7y_A_rec.pdb 1A1C_MALDO_2_433_0/1m7y_ppg_uff2.sdf.gz #-4.20313

    I ran main_gen.py and then main_eval.py. However, main_eval.py raises an exception for every generated molecule at this line in bond_adder:

        Chem.SanitizeMol(rd_mol,Chem.SANITIZE_ALL^Chem.SANITIZE_KEKULIZE)

    So I changed it to:

        Chem.SanitizeMol(rd_mol)

    After that, everything ran smoothly.

    However, the results look really strange. I can post some here for your reference.

    Did I do something wrong in the steps described above? Looking forward to your reply.

    opened by simmed00 7
Owner
DIVE Lab, Texas A&M University