Rapid experimentation and scaling of deep learning models on molecular and crystal graphs.

Overview

LitMatter

A template for rapid experimentation and scaling deep learning models on molecular and crystal graphs.

How to use

  1. Clone this repository and start editing, or save it and use it as a template for new projects.
  2. Edit lit_models/models.py with the PyTorch code for your model of interest.
  3. Edit lit_data/data.py to load and process your PyTorch datasets.
  4. Perform interactive experiments in prototyping.py.
  5. Scale network training to any number of GPUs using the example batch scripts.

Principles

LitMatter uses PyTorch Lightning to organize PyTorch code so scientists can rapidly experiment with geometric deep learning and scale up to hundreds of GPUs without difficulty. Many amazing applied ML methods (even those with open-source code) are never used by the wider community because the important details are buried in hundreds of lines of boilerplate code. It may require a significant engineering effort to get the method working on a new dataset and in a different computing environment, and it can be hard to justify this effort before verifying that the method will provide some advantage. Packaging your code with the LitMatter template makes it easy for other researchers to experiment with your models and scale them beyond common benchmark datasets.

Features

  • Maximum flexibility. LitMatter supports arbitrary PyTorch models and dataloaders.
  • Eliminate boilerplate. Engineering code is abstracted away, but still accessible if needed.
  • Full end-to-end pipeline. Data processing, model construction, training, and inference can be launched from the command line, in a Jupyter notebook, or through a SLURM job.
  • Lightweight. Using the template is easier than not using it; it reduces infrastructure overhead for simple and complex deep learning projects.

Examples

The example notebooks show how to use LitMatter to scale model training for different applications.

  • Prototyping GNNs - train an equivariant graph neural network to predict quantum properties of small molecules.
  • Neural Force Fields - train a neural force field on molecular dynamics trajectories of small molecules.
  • DeepChem - train a PyTorch model in DeepChem on a MoleculeNet dataset.
  • 🤗 - train a 🤗 language model to generate molecules.

Note that these examples have additional dependencies beyond the core depdencies of LitMatter.

References

If you use LitMatter for your own research and scaling experiments, please cite the following work: Frey, Nathan C., et al. "Scalable Geometric Deep Learning on Molecular Graphs." NeurIPS 2021 AI for Science Workshop. 2021.

@inproceedings{frey2021scalable,
  title={Scalable Geometric Deep Learning on Molecular Graphs},
  author={Frey, Nathan C and Samsi, Siddharth and McDonald, Joseph and Li, Lin and Coley, Connor W and Gadepally, Vijay},
  booktitle={NeurIPS 2021 AI for Science Workshop},
  year={2021}
}

Please also cite the relevant frameworks: PyG, PyTorch Distributed, PyTorch Lightning,

and any extensions you use: 🤗 , DeepChem, NFFs, etc.

Extensions

When you're ready to upgrade to fully configurable, reproducible, and scalable workflows, use hydra-zen. hydra-zen integrates seamlessly with LitMatter to self-document ML experiments and orchestrate multiple training runs for extensive hyperparameter sweeps.

Disclaimer

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

© 2021 MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Subject to FAR 52.227-11 – Patent Rights – Ownership by the Contractor (May 2014)
SPDX-License-Identifier: MIT

This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

The software/firmware is provided to you on an As-Is basis.

You might also like...
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models

GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Model This repository is the official PyTorch implementation of GraphRNN, a graph gene

For auto aligning, cropping, and scaling HR and LR images for training image based neural networks

ImgAlign For auto aligning, cropping, and scaling HR and LR images for training image based neural networks Usage Make sure OpenCV is installed, 'pip

Image-Scaling Attacks and Defenses
Image-Scaling Attacks and Defenses

Image-Scaling Attacks & Defenses This repository belongs to our publication: Erwin Quiring, David Klein, Daniel Arp, Martin Johns and Konrad Rieck. Ad

Official code for On Path Integration of Grid Cells: Group Representation and Isotropic Scaling (NeurIPS 2021)
Official code for On Path Integration of Grid Cells: Group Representation and Isotropic Scaling (NeurIPS 2021)

On Path Integration of Grid Cells: Group Representation and Isotropic Scaling This repo contains the official implementation for the paper On Path Int

Unofficial PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution

PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution [arXiv 2021].

Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones
Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

HaloNet - Pytorch Implementation of the Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones. This re

A PyTorch implementation of " EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks."

EfficientNet A PyTorch implementation of EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. [arxiv] [Official TF Repo] Implemen

Official pytorch implementation of
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Implementation of "Scaled-YOLOv4: Scaling Cross Stage Partial Network" using PyTorch framwork.

YOLOv4-large This is the implementation of "Scaled-YOLOv4: Scaling Cross Stage Partial Network" using PyTorch framwork. YOLOv4-CSP YOLOv4-tiny YOLOv4-

Comments
Owner
Nathan Frey
Postdoc at MIT
Nathan Frey
Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Molecular Sets (MOSES): A benchmarking platform for molecular generation models Deep generative models are rapidly becoming popular for the discovery

MOSES 656 Dec 29, 2022
Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)

Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets This is the official PyTorch implementation for the paper Rapid Neural A

null 48 Dec 26, 2022
source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics This work will be published in Nature Biomedical

International Business Machines 71 Nov 15, 2022
Lightweight, Python library for fast and reproducible experimentation :microscope:

Steppy What is Steppy? Steppy is a lightweight, open-source, Python 3 library for fast and reproducible experimentation. Steppy lets data scientist fo

minerva.ml 134 Jul 10, 2022
piSTAR Lab is a modular platform built to make AI experimentation accessible and fun. (pistar.ai)

piSTAR Lab WARNING: This is an early release. Overview piSTAR Lab is a modular deep reinforcement learning platform built to make AI experimentation a

piSTAR Lab 0 Aug 1, 2022
MolRep: A Deep Representation Learning Library for Molecular Property Prediction

MolRep: A Deep Representation Learning Library for Molecular Property Prediction Summary MolRep is a Python package for fairly measuring algorithmic p

AI-Health @NSCC-gz 83 Dec 24, 2022
PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos

PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos. By adopting a unified pipeline-based API design, PyKale enforces standardization and minimalism, via reusing existing resources, reducing repetitions and redundancy, and recycling learning models across areas.

PyKale 370 Dec 27, 2022
RAMA: Rapid algorithm for multicut problem

RAMA: Rapid algorithm for multicut problem Solves multicut (correlation clustering) problems orders of magnitude faster than CPU based solvers without

Paul Swoboda 60 Dec 13, 2022
Simulator for FRC 2022 challenge: Rapid React

rrsim Simulator for FRC 2022 challenge: Rapid React out-1.mp4 Usage In order to run the simulator use the following: python3 rrsim.py [config_path] wh

null 1 Jan 18, 2022
Code for the paper "JANUS: Parallel Tempered Genetic Algorithm Guided by Deep Neural Networks for Inverse Molecular Design"

JANUS: Parallel Tempered Genetic Algorithm Guided by Deep Neural Networks for Inverse Molecular Design This repository contains code for the paper: JA

Aspuru-Guzik group repo 55 Nov 29, 2022