Learning to Initialize Neural Networks for Stable and Efficient Training

Chen Zhu

Last update: Dec 30, 2022

Related tags

Deep Learning gradinit

Overview

GradInit

This repository hosts the code for the experiments in the paper "GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training".

Scripts for the CIFAR-10 experiments are currently available. Please refer to launch/run_gradinit_densenet.sh for DenseNet-100, launch/run_gradinit_wrn.sh for WRN-28-10, and launch/run_gradinit.sh for the other networks shown in the paper. We will release the code for the ImageNet and IWSLT experiments soon.
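
A minimal sketch of dispatching to these launch scripts from Python. Only the three script paths come from the text above; the architecture labels (`densenet100`, `wrn-28-10`), the helper names, and the assumption that each script runs without extra arguments are hypothetical and should be checked against the scripts themselves.

```python
import subprocess
from pathlib import Path

# Script paths as listed in the README; the dictionary keys are hypothetical labels.
LAUNCH_SCRIPTS = {
    "densenet100": "launch/run_gradinit_densenet.sh",
    "wrn-28-10": "launch/run_gradinit_wrn.sh",
}
# The README points every other network in the paper to this script.
DEFAULT_SCRIPT = "launch/run_gradinit.sh"


def launch_script_for(arch: str) -> str:
    """Return the launch script path for an architecture label."""
    return LAUNCH_SCRIPTS.get(arch.strip().lower(), DEFAULT_SCRIPT)


def run_experiment(arch: str, repo_root: str = ".") -> None:
    """Run the matching launch script from the repository root."""
    script = launch_script_for(arch)
    if not (Path(repo_root) / script).is_file():
        raise FileNotFoundError(f"{script} not found under {repo_root}")
    # Assumption: the script needs no extra command-line arguments.
    subprocess.run(["bash", script], cwd=repo_root, check=True)
```

For example, `run_experiment("wrn-28-10", repo_root="path/to/gradinit")` would run launch/run_gradinit_wrn.sh from a local checkout, where the checkout path is a placeholder.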

You might also like...

Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

57 Dec 15, 2022

This is the PyTorch implementation of GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation

Official PyTorch repo for GAN's N' Roses. Diverse im2im and vid2vid selfie to anime translation.

1.1k Jan 1, 2023

This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

[CVPRW 2021] - Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation

6 May 3, 2022

RL agent to play μRTS with Stable-Baselines3

Gym-μRTS with Stable-Baselines3/PyTorch This repo contains an attempt to reproduce Gridnet PPO with invalid action masking algorithm to play μRTS usin

24 Nov 11, 2022

Self-driving car env with PPO algorithm from stable baseline3

Self-driving car with RL stable baseline3. Most of the project is developed from https://github.com/GerardMaggiolino/Gym-Medium-Post. Please check it out! Th

7 Dec 22, 2022

Complex-Valued Neural Networks (CVNN)

Complex-Valued Neural Networks (CVNN) Done by @NEGU93 - J. Agustin Barrachina Using this library, the only difference with a Tensorflow code is that y

1 Nov 12, 2021

A machine learning library for spiking neural networks. Supports training with both torch and jax pipelines, and deployment to neuromorphic hardware.

Rockpool Rockpool is a Python package for developing signal processing applications with spiking neural networks. Rockpool allows you to build network

21 Dec 14, 2022

[ICLR 2021] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

CPT: Efficient Deep Neural Network Training via Cyclic Precision Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin Accep

26 Oct 25, 2022

Efficient neural networks for analog audio effect modeling

micro-TCN Efficient neural networks for audio effect modeling

94 Dec 29, 2022

Comments

division of the loss value by eta

Hello! Thank you for a great paper. Could you please explain the idea behind the division of the loss value by eta? I didn't see any information related to this in the paper. In my experiments, I haven't observed any significant difference between the code with/without this division, but maybe I am missing something important.

https://github.com/zhuchen03/gradinit/blob/cb4685348b2a6d216f35c16cbf9c2450af062d6f/gradinit_utils.py#L231

opened by DenisKoposov 1
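
As a general observation on the question above, and not a statement about why the GradInit code does it: dividing an objective by a positive constant eta leaves its minimizer unchanged but multiplies every gradient by 1/eta. The sketch below illustrates this on a hypothetical 1-D quadratic; the toy loss, the value of `eta`, and all function names are assumptions made for illustration and are not taken from gradinit_utils.py.

```python
def loss(w: float) -> float:
    """Hypothetical toy objective with its minimum at w = 3."""
    return (w - 3.0) ** 2 + 1.0


def numerical_grad(f, w: float, h: float = 1e-5) -> float:
    """Central-difference estimate of df/dw."""
    return (f(w + h) - f(w - h)) / (2.0 * h)


def argmin_on_grid(f, lo: float = 0.0, hi: float = 6.0, steps: int = 601) -> float:
    """Brute-force minimizer over an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=f)


eta = 0.1  # assumed step size, chosen only for illustration


def scaled_loss(w: float) -> float:
    """The same objective divided by eta."""
    return loss(w) / eta


w0 = 1.0
g_plain = numerical_grad(loss, w0)          # analytically 2 * (w0 - 3) = -4
g_scaled = numerical_grad(scaled_loss, w0)  # analytically -4 / eta = -40

print(g_plain, g_scaled)
print(argmin_on_grid(loss), argmin_on_grid(scaled_loss))
```

With plain gradient descent, the 1/eta factor changes the effective step size. With an optimizer that normalizes gradient magnitude, such as Adam, a constant rescaling of the loss largely cancels out of the update. Whether either effect explains the behavior reported in the issue is not something the source confirms.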
Code to run fairseq IWSLT experiments?

I would love to test your method out on language modeling tasks in fairseq.

Do you have the code to make Table 2 (or just the GradInit rows in Table 2) handy?

opened by sshleifer 2
Great Stuff but Needs Better Usability

Hello,

Thanks for such great work. Auto-initializing a DNN in a proper way definitely sounds amazing.

Yet, the usability needs to be significantly improved so that I can plug this in my existing networks.

It would be great if that could be as easy as installing and then importing an additional package.

We should maybe open a feature request in PyTorch so that they integrate this into the framework.

opened by kayuksel 12

Owner

Chen Zhu

ElegantRL is lightweight, efficient, and stable, for researchers and practitioners.

Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch.

2.5k Jan 8, 2023

Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability and average power constraints. It applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems and solves each subproblem via DROO algorithm. It includes:

87 Dec 28, 2022

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

4.7k Jan 1, 2023

Official repository for CVPR21 paper "Deep Stable Learning for Out-Of-Distribution Generalization".

StableNet StableNet is a deep stable learning method for out-of-distribution generalization. This is the official repo for CVPR21 paper "Deep Stable L

120 Dec 28, 2022

TeST: Temporal-Stable Thresholding for Semi-supervised Learning

TeST: Temporal-Stable Thresholding for Semi-supervised Learning TeST Illustration Semi-supervised learning (SSL) offers an effective method for large-

1 Jul 14, 2022

A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch

A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch The official pytorch implementation of the paper "Towards Faster and Stabilize

455 Jan 8, 2023

Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly

Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly Code for this paper Ultra-Data-Efficient GAN Tra

77 Oct 5, 2022

A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019).

ClusterGCN A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019). A

697 Dec 27, 2022

Simple converter for deploying Stable-Baselines3 model to TFLite and/or Coral

Running SB3 developed agents on TFLite or Coral Introduction I've been using Stable-Baselines3 to train agents against some custom Gyms, some of which

16 Oct 11, 2022

Additional code for Stable-baselines3 to load and upload models from the Hub.

Hugging Face x Stable-baselines3 A library to load and upload Stable-baselines3 models from the Hub. Installation With pip Examples [Todo: add colab t

34 Dec 10, 2022
