Rapid experimentation and scaling of deep learning models on molecular and crystal graphs.

Nathan Frey

Last update: Dec 6, 2022


Overview

LitMatter

A template for rapid experimentation with, and scaling of, deep learning models on molecular and crystal graphs.

How to use

Clone this repository and start editing, or save it and use it as a template for new projects.
Edit lit_models/models.py with the PyTorch code for your model of interest.
Edit lit_data/data.py to load and process your PyTorch datasets.
Perform interactive experiments in prototyping.py.
Scale network training to any number of GPUs using the example batch scripts.

Principles

LitMatter uses PyTorch Lightning to organize PyTorch code so scientists can rapidly experiment with geometric deep learning and scale up to hundreds of GPUs without difficulty. Many amazing applied ML methods (even those with open-source code) are never used by the wider community because the important details are buried in hundreds of lines of boilerplate code. It may require a significant engineering effort to get the method working on a new dataset and in a different computing environment, and it can be hard to justify this effort before verifying that the method will provide some advantage. Packaging your code with the LitMatter template makes it easy for other researchers to experiment with your models and scale them beyond common benchmark datasets.

Features

Maximum flexibility. LitMatter supports arbitrary PyTorch models and dataloaders.
Eliminate boilerplate. Engineering code is abstracted away, but still accessible if needed.
Full end-to-end pipeline. Data processing, model construction, training, and inference can be launched from the command line, in a Jupyter notebook, or through a SLURM job.
Lightweight. Using the template is easier than not using it; it reduces infrastructure overhead for simple and complex deep learning projects.
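
One way to make the same pipeline callable from a shell, a notebook, and a SLURM job is to keep argument handling in a small function that accepts an explicit argument list. The sketch below is an assumption about how such an entry point could look, not LitMatter's actual interface: the flag names and the resolve_config helper are hypothetical. The only external fact it relies on is that SLURM exports SLURM_JOB_NUM_NODES inside a job.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Flag names are illustrative only; they are not LitMatter options.
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--task", default="train", choices=["process", "train", "predict"])
    parser.add_argument("--devices", type=int, default=1, help="GPUs per node")
    parser.add_argument("--num-nodes", type=int, default=None,
                        help="defaults to the SLURM allocation, else 1")
    return parser


def resolve_config(argv: Optional[Sequence[str]] = None,
                   env: Optional[Mapping[str, str]] = None) -> dict:
    """Merge command-line flags with the SLURM environment into one config dict."""
    env = os.environ if env is None else env
    args = build_parser().parse_args(argv)
    num_nodes = args.num_nodes
    if num_nodes is None:
        # Inside a SLURM job this variable holds the number of allocated nodes.
        num_nodes = int(env.get("SLURM_JOB_NUM_NODES", 1))
    return {
        "task": args.task,
        "devices": args.devices,
        "num_nodes": num_nodes,
        "world_size": args.devices * num_nodes,
    }
```

From a notebook, pass the arguments as a list, for example resolve_config(["--devices", "4"]), so that argparse does not read the notebook kernel's own command line. In a batch script, calling resolve_config() with no arguments picks up both the flags and the node count of the allocation.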

Examples

The example notebooks show how to use LitMatter to scale model training for different applications.

Prototyping GNNs - train an equivariant graph neural network to predict quantum properties of small molecules.
Neural Force Fields - train a neural force field on molecular dynamics trajectories of small molecules.
DeepChem - train a PyTorch model in DeepChem on a MoleculeNet dataset.
🤗 - train a 🤗 language model to generate molecules.

Note that these examples have additional dependencies beyond the core dependencies of LitMatter.

References

If you use LitMatter for your own research and scaling experiments, please cite the following work: Frey, Nathan C., et al. "Scalable Geometric Deep Learning on Molecular Graphs." NeurIPS 2021 AI for Science Workshop. 2021.

@inproceedings{frey2021scalable,
  title={Scalable Geometric Deep Learning on Molecular Graphs},
  author={Frey, Nathan C and Samsi, Siddharth and McDonald, Joseph and Li, Lin and Coley, Connor W and Gadepally, Vijay},
  booktitle={NeurIPS 2021 AI for Science Workshop},
  year={2021}
}

Please also cite the relevant frameworks (PyG, PyTorch Distributed, PyTorch Lightning) and any extensions you use (🤗, DeepChem, NFFs, etc.).

Extensions

When you're ready to upgrade to fully configurable, reproducible, and scalable workflows, use hydra-zen. hydra-zen integrates seamlessly with LitMatter to self-document ML experiments and orchestrate multiple training runs for extensive hyperparameter sweeps.

Disclaimer

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

Subject to FAR 52.227-11 – Patent Rights – Ownership by the Contractor (May 2014)
SPDX-License-Identifier: MIT

This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

The software/firmware is provided to you on an As-Is basis.


Comments

Ideas for other models that could be "Lit"-ed

Awesome! I see that you already have LitDimeNet (DimeNet), LitSchNet (SchNet), and LitNNConv (PyTorch NNConv). A non-exhaustive list of others that might be interesting:

LitCrabNet (CrabNet), possibly via my pip/conda installable refactor
LitCGCNN (CGCNN)
LitGATGNN (GATGNN)
LitBOWSR (BOWSR), which uses CGCNN; I'm not sure whether it uses torch aside from CGCNN
LitALIGNN (ALIGNN)

@ncfrey are there any others you think would be applicable?

opened by sgbaird
