Extract atomic fingerprints from molecules using pretrained GROVER
Use a pretrained GROVER model to extract atomic fingerprints from molecules. The fingerprints can then be used in downstream tasks.
GROVER (Graph Representation frOm self-superVised mEssage passing tRansformer) is a Transformer-based, self-supervised message-passing neural network introduced by Rong and colleagues in the paper "Self-Supervised Graph Transformer on Large-Scale Molecular Data".
Install requirements
- Create and activate a conda environment:
conda create --name grover python=3.6.8
conda activate grover
- Install the requirements from the requirements.txt file. Additionally, install torchinfo:
conda install -c conda-forge -c pytorch -c acellera -c RMG --file=requirements.txt
pip install torchinfo
Download the pretrained models
There are two pretrained models provided by the original authors. Download and extract one of them, then save the .pt file in models_pretrained/.
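Before running inference, it can help to confirm that the extracted .pt file is where the script expects it. The following is a minimal sketch, not part of the repository: the helper name find_pretrained_models is hypothetical, and it only assumes the models_pretrained/ directory named above.

```python
from pathlib import Path


def find_pretrained_models(model_dir="models_pretrained"):
    """Return the sorted list of .pt files found directly in model_dir.

    Hypothetical helper for illustration; it does not load or validate
    the checkpoints, it only checks that the files are present.
    """
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    return sorted(model_dir.glob("*.pt"))


# Example usage (run from the repository root):
#   models = find_pretrained_models()
#   print(models if models else "No .pt file found in models_pretrained/")
```

An empty list means the checkpoint has not been extracted yet or was saved in a different directory.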
Infer fingerprints
Run the main.py file:
python main.py
Details about the arguments can be found in the setup_parser() function in main.py, or by running:
python main.py -h
If no arguments are specified, then the default arguments will be used.
By default, the outputs are saved in extracted_fingerprint. The outputs include 3 files:
- atom_fp.npy: contains the atomic fingerprints.
- distance.npy: contains the pair-wise shortest relative distance matrices between nodes of the molecular graphs.
- smiles.txt: contains the SMILES strings of the molecules.
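To illustrate what a pair-wise shortest distance matrix between graph nodes looks like, here is a small sketch that counts the minimum number of bonds between every pair of atoms using breadth-first search. This is an assumption about the general idea only: the function name shortest_path_matrix is hypothetical, and the repository may compute or normalize the values in distance.npy differently.

```python
from collections import deque


def shortest_path_matrix(num_atoms, bonds):
    """Return a num_atoms x num_atoms matrix of shortest path lengths.

    bonds is a list of (i, j) atom index pairs. Unreachable pairs keep
    the value -1. Illustrative only, not the repository's implementation.
    """
    adjacency = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    dist = [[-1] * num_atoms for _ in range(num_atoms)]
    for source in range(num_atoms):
        dist[source][source] = 0
        queue = deque([source])
        # Breadth-first search visits atoms in order of bond distance.
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if dist[source][neighbor] == -1:
                    dist[source][neighbor] = dist[source][node] + 1
                    queue.append(neighbor)
    return dist


# A chain of three heavy atoms, 0-1-2 (for example the carbons of propane).
matrix = shortest_path_matrix(3, [(0, 1), (1, 2)])
```

For the three-atom chain, the two end atoms are two bonds apart, so the corresponding entries of the matrix are 2.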
To read the .npy files, refer to the numpy.save documentation.
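A minimal sketch of loading the three output files with numpy.load, the counterpart of numpy.save. Several details here are assumptions rather than facts about the repository: the helper name load_outputs is hypothetical, smiles.txt is assumed to hold one SMILES string per line, and allow_pickle=True is passed only in case the arrays were saved as object arrays (for example, one array per molecule with different atom counts). Only enable allow_pickle for files you generated yourself.

```python
from pathlib import Path

import numpy as np


def load_outputs(out_dir="extracted_fingerprint"):
    """Load atom_fp.npy, distance.npy and smiles.txt from out_dir.

    Hypothetical helper for illustration. allow_pickle=True is an
    assumption; it is only required if the arrays have dtype=object.
    """
    out_dir = Path(out_dir)
    atom_fp = np.load(out_dir / "atom_fp.npy", allow_pickle=True)
    distance = np.load(out_dir / "distance.npy", allow_pickle=True)
    # Assumes one SMILES string per line; blank lines are skipped.
    lines = (out_dir / "smiles.txt").read_text().splitlines()
    smiles = [line.strip() for line in lines if line.strip()]
    return atom_fp, distance, smiles


# Example usage after running main.py with the default output directory:
#   atom_fp, distance, smiles = load_outputs()
#   print(len(smiles), atom_fp.shape, distance.shape)
```

If the number of SMILES strings does not match the first dimension of the arrays, the one-string-per-line assumption probably does not hold for your output.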