Repository for publicly available deep learning models developed by the Rosetta community

Overview

trRosetta2

This package contains the deep learning models and related scripts used by the Baker group in CASP14.

Installation

Linux/Mac

  1. clone the package
git clone https://github.com/RosettaCommons/trRosetta2
cd trRosetta2
  2. create a conda environment from one of the provided .yml files: casp14-baker-linux-cpu.yml, casp14-baker-linux-gpu.yml, casp14-baker-mac-cpu.yml
conda env create -f casp14-baker-linux-gpu.yml
conda activate casp14-baker
  3. download network weights [1.1G]
wget https://files.ipd.uw.edu/pub/trRosetta2/weights.tar.bz2
tar xf weights.tar.bz2
  4. download and install third-party software
./install_dependencies.sh
  5. download sequence and structure databases
# uniclust30 [46G]
wget http://wwwuser.gwdg.de/~compbiol/uniclust/2020_06/UniRef30_2020_06_hhsuite.tar.gz
mkdir -p UniRef30_2020_06
tar xf UniRef30_2020_06_hhsuite.tar.gz -C ./UniRef30_2020_06

# structure templates [8.3G]
wget https://files.ipd.uw.edu/pub/trRosetta2/pdb100_2020Mar11.tar.gz
tar xf pdb100_2020Mar11.tar.gz
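
The archives listed above add up to roughly 56G before extraction (1.1G + 46G + 8.3G), so it can help to check free disk space before starting the downloads. The following is a minimal sketch, not part of the package: the helper name free_gib and the 56 threshold are illustrative assumptions, and the unpacked databases will need additional room.

```shell
# Hypothetical helper (not part of trRosetta2): print the free space, in
# whole GiB, on the filesystem that holds the given path.
free_gib() {
    # -P forces the portable one-line-per-filesystem format, -k reports
    # 1024-byte blocks; column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

need=56   # assumed threshold: 1.1 + 46 + 8.3 for the archives, rounded up
have=$(free_gib .)
if [ "$have" -lt "$need" ]; then
    echo "only ${have} GiB free here, at least ${need} GiB suggested" >&2
fi
```

If space is tight, the archives can be removed after extraction.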

Obtain a PyRosetta licence and install the package in the newly created casp14-baker conda environment (link).
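
Before the first run, a quick check that the downloaded data is where it is expected can save a failed pipeline start. This is a sketch under assumptions: check_paths is a hypothetical helper, UniRef30_2020_06 comes from the mkdir command above, and the directory names weights and pdb100_2020Mar11 are guesses based on the archive names, so adjust them to whatever the tar commands actually produced.

```shell
# Hypothetical helper (not part of trRosetta2): report which of the given
# paths exist and return non-zero if at least one is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p"
            missing=1
        fi
    done
    return "$missing"
}

# Example call from the trRosetta2 directory (directory names are assumed):
#   check_paths weights UniRef30_2020_06 pdb100_2020Mar11
```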

Usage

mkdir -p example/T1078
./run_pipeline.sh example/T1078.fa example/T1078
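
A pipeline log quoted in the Comments section reports "Final models saved in: example/T1078/model" even though that directory turned out to be empty, so it is worth confirming that models were actually written. The sketch below only counts regular files in the model subdirectory of the output directory; count_models is a hypothetical helper, and the names of the files the pipeline writes there are not assumed.

```shell
# Hypothetical helper (not part of trRosetta2): count the regular files
# directly under <output dir>/model. Prints 0 and returns 1 if the
# directory does not exist.
count_models() {
    model_dir="$1/model"
    [ -d "$model_dir" ] || { echo 0; return 1; }
    find "$model_dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Example call after the run above:
#   count_models example/T1078
```

A count of 0 suggests that an earlier stage failed; the files under the log directory of the output folder are the natural place to look next.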

References

[1] I Anishchenko, M Baek, H Park, J Dauparas, N Hiranuma, S Mansoor, I Humphrey, D Baker. Protein structure prediction guided by predicted inter-residue geometries. In: CASP14 Abstract Book, 2020

[2] H Park, M Baek, N Hiranuma, I Anishchenko, S Mansoor, J Dauparas, D Baker. Model refinement guided by an interplay between Deep-learning and Rosetta. In: CASP14 Abstract Book, 2020

[3] M Baek, I Anishchenko, H Park, I Humphrey, D Baker. Protein oligomer structure predictions guided by predicted inter-chain contacts. In: CASP14 Abstract Book, 2020

Comments
  • Adding Benchmark scripts and outline for run-pipeline.py

    @gjoni @minkbaek this pull request is a follow up to our last meeting. It adds:

    • outline for run-pipeline.py (ie Python rewrite of run_pipeline.sh script)
    • add script to install dependencies (part of run-pipeline.py script now)
    • add scripts needed for Benchmark integration
    opened by lyskov 2
  • T1078 example couldn't get the model file

    Thank you very much for releasing this software; this work is significant for biological research. After completing the installation, I followed the instructions to run the sample code. It shows:

    bash run_pipeline.sh example/T1078.fa example/T1078
    Running hhsearch
    Running trRefine
    /home/rrrna/trRosetta2
    Picking final models
    Final models saved in: example/T1078/model

    The program generates a lot of files in the T1078 folder, but there is no file in model/.

    T1078$ ls
    DONE_iter0 log pdb-msa t000_.msa0.a3m t000_.ss2
    DONE_iter1 model pdb-tbm t000_.msa0.ss2.a3m t000_.tape.npy
    hhblits parallel.list pdb-trRefine t000_.msa.npz trRefine_fold.list

    opened by mooerccx 1
  • Does the "run_pipeline.sh" script automatically skip the "Run modeling w/ trRefine output" part?

    The "Run trRefine" part creates a file named DONE_iter1, which makes the condition "! -f $WDIR/DONE_iter1" false, so the modeling part that follows appears never to run.

    ############################################################
    # 7. Run trRefine
    ############################################################
    if [ ! -s $WDIR/t000_.trRefine.npz ]
    then
        echo "Running trRefine"
        cd $WDIR
        python $PIPEDIR/trRefine/run_trRefine_DAN.py -msa_npz $WDIR/t000_.msa.npz \
            -tbm_npz $WDIR/t000_.tbm.npz -pdb_dir_s $WDIR/pdb-msa $WDIR/pdb-tbm \
            -a3m_fn $WDIR/t000_.msa0.a3m -hhr_fn $WDIR/t000_.hhr \
            -n_core $CPU > $WDIR/log/trRefine.stdout 2> $WDIR/log/trRefine.stderr
        cd ..
        touch $WDIR/DONE_iter1
    fi
    
    
    ############################################################
    # 8. Run modeling w/ trRefine output
    ############################################################
    
    if [ ! -f $WDIR/DONE_iter1 ]
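
The behaviour described in this report can be reproduced in isolation. The sketch below is based only on the excerpt quoted here, not on the full script: step8_would_run is a hypothetical name that mirrors the quoted guard, and a temporary directory stands in for $WDIR.

```shell
# Mirrors the guard quoted for step 8: succeeds only while DONE_iter1
# is absent from the working directory.
step8_would_run() {
    [ ! -f "$1/DONE_iter1" ]
}

WDIR=$(mktemp -d)             # stand-in for the pipeline's working directory
step8_would_run "$WDIR" && echo "before step 7: step 8 would run"
touch "$WDIR/DONE_iter1"      # what the end of step 7 does in the excerpt
step8_would_run "$WDIR" || echo "after step 7: step 8 is skipped"
rm -rf "$WDIR"
```

If the excerpt reflects the whole script, step 8 can only run when DONE_iter1 is missing, which never holds once step 7 has completed in the same run.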
    
    opened by EstherBear 1
  • json.decoder issue

    Generating TAPE features
    243B [00:00, 284458.80B/s]
    Traceback (most recent call last):
      File "/home/ngayatri/trRosetta2/trRosetta2/tape/get_embeddings.py", line 25, in <module>
        model = ProteinBertModel.from_pretrained('bert-base')
      File "/home/ngayatri/miniconda3/envs/casp14-baker/lib/python3.6/site-packages/tape/models/modeling_utils.py", line 470, in from_pretrained
        **kwargs
      File "/home/ngayatri/miniconda3/envs/casp14-baker/lib/python3.6/site-packages/tape/models/modeling_utils.py", line 172, in from_pretrained
        config = cls.from_json_file(resolved_config_file)
      File "/home/ngayatri/miniconda3/envs/casp14-baker/lib/python3.6/site-packages/tape/models/modeling_utils.py", line 202, in from_json_file
        return cls.from_dict(json.loads(text))
      File "/home/ngayatri/miniconda3/envs/casp14-baker/lib/python3.6/json/__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "/home/ngayatri/miniconda3/envs/casp14-baker/lib/python3.6/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/home/ngayatri/miniconda3/envs/casp14-baker/lib/python3.6/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    Running sequence-based trRosetta
    Folding trRosetta models
    Running trRefine
    /home/ngayatri/trRosetta2/trRosetta2
    Folding trRefine models
    ls: cannot access /home/ngayatri/trRosetta2/trRosetta2/example/T1078/pdb-trRefine/model.pdb: No such file or directory
    Running DeepAccNet-msa on trRefine models

    Please help me solve this issue.
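
"Expecting value: line 1 column 1 (char 0)" is the error Python's json module raises when the text it is given is empty or does not start with a JSON value. One plausible cause, which is an assumption and not confirmed in this thread, is that the TAPE configuration file was downloaded incompletely or replaced by an error page. The sketch below lists files in a directory that cannot be a JSON object or array; suspect_json_files is a hypothetical helper, the directory to scan is left to the reader because the cache location is not stated here, and binary files such as model weights will also be listed.

```shell
# Hypothetical helper: print files under a directory that are empty or
# whose first non-blank character is neither { nor [.
# This is a cheap heuristic, not a full JSON parse.
suspect_json_files() {
    find "$1" -type f | while IFS= read -r f; do
        first=$(tr -d ' \t\r\n' < "$f" | head -c 1)
        case "$first" in
            '{'|'[') ;;          # plausibly JSON, keep quiet
            *) echo "$f" ;;      # empty file, HTML/XML error page, binary
        esac
    done
}

# Example call (the directory is a placeholder for the model cache):
#   suspect_json_files /path/to/model/cache
```

Deleting a suspect configuration file so that it is fetched again on the next run is a reasonable first thing to try.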

    opened by GayatriNavle 4
  • Picking the final model does not work

    I tried trRosetta2 on the 1crn FASTA sequence. I had to recompile hhsuite to avoid a segmentation fault.

    The rest of the pipeline appears to work, but at the end final_pick.py fails because it cannot find the _acc file in pdb-trRefine.

    I would appreciate help fixing this issue.

    opened by luigidibiasi 0
  • Error while running trRosetta2

    Hi, I get this error while running the pipeline script. No PDB model is generated. Could you please help me out with this?

    Traceback (most recent call last):
      File "./get_embeddings.py", line 25, in <module>
        model = ProteinBertModel.from_pretrained('bert-base')
      File "/mnt/NewHDD/anaconda3/envs/casp14-baker/lib/python3.6/site-packages/tape/models/modeling_utils.py", line 470, in from_pretrained
        **kwargs
      File "/mnt/NewHDD/anaconda3/envs/casp14-baker/lib/python3.6/site-packages/tape/models/modeling_utils.py", line 172, in from_pretrained
        config = cls.from_json_file(resolved_config_file)
      File "/mnt/NewHDD/anaconda3/envs/casp14-baker/lib/python3.6/site-packages/tape/models/modeling_utils.py", line 202, in from_json_file
        return cls.from_dict(json.loads(text))
      File "/mnt/NewHDD/anaconda3/envs/casp14-baker/lib/python3.6/json/__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "/mnt/NewHDD/anaconda3/envs/casp14-baker/lib/python3.6/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/mnt/NewHDD/anaconda3/envs/casp14-baker/lib/python3.6/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

    Best,

    Anupam

    opened by anu-bioinfo 5
  • How to fold and dock for homodimer and heterodimer

    Hi,

    Thanks for sharing this tool. The run_pipeline.sh script is just for monomer structure prediction, right? Could you also share the full pipeline scripts for homodimer and heterodimer structure prediction? Thanks in advance.

    opened by ZhiYeG 0
  • Error message running Tape about undefined symbol omp_get_num_procs

    Resolved by running: conda install --channel conda-forge llvm-openmp

    https://github.com/ContinuumIO/anaconda-issues/issues/10195

    opened by FrBonnardel 0