The full training script for Enformer (TensorFlow Sonnet) on TPU clusters

Phil Wang

Last update: Oct 19, 2022

Overview

Enformer TPU training script (wip)

The full training script for Enformer (TensorFlow Sonnet) on TPU clusters, written as part of an effort to migrate the model to PyTorch.

It was pieced together from the DeepMind Enformer repository, the Colab training notebook, and the Basenji sequence augmentation code.

It accounts for:

distributed TPU training
distributed datasets
distributed validation
gradient clipping
cross-replica batch norms
dataset augmentation
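
As a rough illustration of the dataset augmentation item above, here is a minimal sketch of the two augmentations commonly associated with the Basenji pipeline: a small random shift and a random reverse complement. It works on plain strings rather than one-hot tensors. The function names, the `N` padding, and the `max_shift` default are assumptions made for this example and are not taken from the training script.

```python
import random

# Watson-Crick complements; N (unknown base) maps to itself.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def shift_sequence(seq, shift, pad="N"):
    """Shift seq right (shift > 0) or left (shift < 0), padding to keep the length."""
    if shift == 0:
        return seq
    if abs(shift) >= len(seq):
        return pad * len(seq)
    if shift > 0:
        return pad * shift + seq[:-shift]
    return seq[-shift:] + pad * (-shift)


def augment(seq, targets, max_shift=3, rng=random):
    """Apply a random shift, then a random reverse complement.

    Assumption: targets are left unshifted, which is only reasonable when the
    shift is small relative to the width of each target bin. When the sequence
    is reverse complemented, the targets are reversed along the length axis so
    they stay aligned with the sequence.
    """
    shift = rng.randint(-max_shift, max_shift)
    seq = shift_sequence(seq, shift)
    if rng.random() < 0.5:
        seq = reverse_complement(seq)
        targets = targets[::-1]
    return seq, targets
```

Calling `augment("ACGTACGT", [0.1, 0.2, 0.3, 0.4])` returns a sequence and a target list of the same lengths as the inputs, which is what lets the augmented example drop into an existing batch shape.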

Training takes about 3 days on a TPU v3-64.

Todo

Fix the script to account for the difference in sequence length: the Basenji training data uses sequences of ~130k bp, versus ~190k bp in the paper.

Citations

@article {Avsec2021.04.07.438649,
    author  = {Avsec, {\v Z}iga and Agarwal, Vikram and Visentin, Daniel and Ledsam, Joseph R. and Grabska-Barwinska, Agnieszka and Taylor, Kyle R. and Assael, Yannis and Jumper, John and Kohli, Pushmeet and Kelley, David R.},
    title   = {Effective gene expression prediction from sequence by integrating long-range interactions},
    elocation-id = {2021.04.07.438649},
    year    = {2021},
    doi     = {10.1101/2021.04.07.438649},
    publisher = {Cold Spring Harbor Laboratory},
    URL     = {https://www.biorxiv.org/content/early/2021/04/08/2021.04.07.438649},
    eprint  = {https://www.biorxiv.org/content/early/2021/04/08/2021.04.07.438649.full.pdf},
    journal = {bioRxiv}
}

Comments

Gradient clipping: why not global norm?

In the paper they say "We clipped gradients to a maximum global norm of 0.2." In https://github.com/lucidrains/enformer-tensorflow-sonnet-training-script/blob/6de9af047ecc1d8158afb8f12d128c6d504c5511/train.py#L1060 you chose to do a simple clip_by_norm with a value of 1.0. I just wanted to ask about the reasoning behind this choice: why did you not use tf.clip_by_global_norm instead?

I just wanted to know whether it's something you already tried and decided against, as I am struggling to train the model properly myself (using enformer_pytorch).
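
For reference, the two clipping strategies named in the question behave differently when a model has more than one gradient tensor. The pure-Python sketch below follows the documented behavior of tf.clip_by_norm (each tensor rescaled on its own) and tf.clip_by_global_norm (all tensors rescaled by one shared factor), but on plain lists. The helper names and the example gradients are illustrative assumptions and are not taken from train.py.

```python
import math


def clip_by_norm(tensor, clip_norm):
    """Rescale one tensor so its own L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in tensor))
    if norm <= clip_norm:
        return list(tensor)
    scale = clip_norm / norm
    return [x * scale for x in tensor]


def clip_by_global_norm(tensors, clip_norm):
    """Rescale all tensors by one factor so their joint L2 norm is at most clip_norm.

    Returns the clipped tensors and the global norm before clipping.
    """
    global_norm = math.sqrt(sum(x * x for t in tensors for x in t))
    scale = clip_norm / max(global_norm, clip_norm)
    return [[x * scale for x in t] for t in tensors], global_norm


# Hypothetical gradients for two parameters: one large, one small.
grads = [[3.0, 4.0], [0.0, 0.5]]

# Per-tensor clipping shrinks only the large gradient, which changes the
# relative size of the two updates.
per_tensor = [clip_by_norm(g, 1.0) for g in grads]

# Global-norm clipping shrinks both by the same factor, which keeps the
# direction of the overall update unchanged.
global_clipped, global_norm = clip_by_global_norm(grads, 1.0)
```

In this example the per-tensor version leaves the second gradient untouched while cutting the first to unit norm, whereas the global version scales both by the same factor of roughly 0.2. With a threshold as tight as the paper's 0.2, the two approaches can therefore produce noticeably different update directions.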

opened by sheetalgiri
