# AutoLRS
This is the PyTorch implementation of the paper *AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly*, published at ICLR 2021.

A TensorFlow version will appear in this repo later.
## What is AutoLRS?
Finding a good learning-rate schedule for a DNN model is non-trivial. The goal of AutoLRS is to automatically tune the learning rate (LR) over the course of training, without human involvement. AutoLRS divides the whole training process into a few training stages (each consisting of τ steps), and its mission is to determine a constant LR for each training stage. AutoLRS treats the validation loss as a black-box function of the LR, and uses Bayesian optimization (BO) to search for the LR that minimizes the validation loss of each training stage. Because evaluating the validation loss for every LR that BO explores would require τ steps of training, we reduce this cost by applying each candidate LR for only τ′ steps (τ′ ≪ τ) and training an exponential time-series forecasting model to predict the loss after τ steps. In our default setting, τ′ = τ/10 and BO explores 10 LRs in each stage, so the number of steps spent searching for an LR equals the number of steps of actual training.
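To make the forecasting step concrete, here is a minimal sketch of the idea, assuming a single exponential of the form a·exp(−b·t) + c fitted with `scipy.optimize.curve_fit`. The functional form and fitting procedure are illustrative assumptions, not the exact model used in this repo:

```python
import numpy as np
from scipy.optimize import curve_fit

def forecast_loss(observed_losses, tau):
    """Fit an exponential curve to the losses observed during the tau'
    trial steps and extrapolate it to step tau. Illustrative sketch only;
    the forecasting model in this repo may differ."""
    t = np.arange(1, len(observed_losses) + 1, dtype=np.float64)

    def exp_curve(t, a, b, c):  # assumed form: loss(t) = a * exp(-b t) + c
        return a * np.exp(-b * t) + c

    (a, b, c), _ = curve_fit(
        exp_curve, t, observed_losses,
        p0=(observed_losses[0], 1e-2, observed_losses[-1]), maxfev=10000)
    return exp_curve(tau, a, b, c)

# Toy example: losses from tau' = 100 trial steps, forecast at tau = 1000.
trial_losses = 2.0 * np.exp(-0.01 * np.arange(1, 101)) + 0.5
print(forecast_loss(trial_losses, tau=1000))  # ~0.5 as the curve flattens
```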
AutoLRS does not depend on a pre-defined LR schedule, a specific dataset, or a specific task, and it is compatible with almost all optimizers. The LR schedules auto-generated by AutoLRS lead to speedups over highly hand-tuned LR schedules for several state-of-the-art DNNs, including ResNet-50, Transformer, and BERT.
## Setup

```bash
$ pip install --user -r requirements.txt
```
## How to use AutoLRS for your work?
`autolrs_server.py` is the brain of AutoLRS; it implements the search algorithm, including BO and the exponential forecasting model.

`autolrs_callback.py` implements a callback that you can plug into your PyTorch training loop. The callback communicates with the server via a socket, adjusting the learning rate and saving/restoring model parameters and optimizer states according to the commands it receives from the server.
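A rough sketch of how the callback might plug into a training loop is below. The import path, constructor arguments, and hook name are assumptions for illustration; consult the CIFAR-10 example in this repo for the actual API:

```python
import torch
import torch.nn as nn

from autolrs_callback import AutoLRS  # hypothetical import path

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Toy tensors standing in for real train/validation loaders.
x_train, y_train = torch.randn(256, 10), torch.randint(0, 2, (256,))
x_val, y_val = torch.randn(64, 10), torch.randint(0, 2, (64,))

def val_fn():
    """Validation loss as a black-box function of the LR; AutoLRS
    evaluates candidate LRs through calls like this."""
    model.eval()
    with torch.no_grad():
        loss = criterion(model(x_val), y_val).item()
    model.train()
    return loss

# Constructor arguments shown here are assumptions for illustration.
callback = AutoLRS(model, optimizer, val_fn=val_fn, min_lr=1e-3, max_lr=1e-1)

for step in range(1000):
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()
    callback.on_train_batch_end(loss.item())  # hypothetical hook name
```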
### Notes
- You need to pass two arguments, `min_lr` and `max_lr`, when launching `autolrs_server.py` to set the LR search interval. This interval can be found by an LR range test or simply set based on your experience. Do not set `min_lr` too small (for example, 1e-10); otherwise, BO will waste many cycles exploring very small LR values.
- The current AutoLRS does not search for an LR during warmup steps, since warmup has no explicit optimization objective such as minimizing the validation loss. Warmup usually takes very few steps, and its main purpose is to prevent the deeper layers of a DNN from creating training instability, especially when training with a large batch size. You can manually add a warmup stage by setting `warmup_step` and `warmup_lr` when initializing the `autolrs_callback.AutoLRS` callback, as shown in the sketch after this list.
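Putting the two notes together, a hypothetical end-to-end setup might look like the following. The server's flag names and the example values are assumptions; only `warmup_step` and `warmup_lr` are named above:

```python
# 1) Launch the search server first, passing the LR search interval
#    (flag names are assumptions for illustration):
#
#    $ python autolrs_server.py --min_lr 1e-4 --max_lr 1e-1
#
# 2) Then add a manual warmup stage when creating the callback,
#    continuing the sketch above. Warmup values are example numbers.
callback = AutoLRS(model, optimizer, val_fn=val_fn,
                   min_lr=1e-4, max_lr=1e-1,
                   warmup_step=500,   # number of fixed-LR warmup steps
                   warmup_lr=1e-2)    # constant LR used during warmup
```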
## Example
We provide an example of using AutoLRS to train various DNNs on the CIFAR-10 dataset. The models are imported from kuangliu's great and simple [pytorch-cifar](https://github.com/kuangliu/pytorch-cifar) repository.
**Prerequisites:** Python 3.6+, PyTorch 1.0+
### Run the example

```bash
$ bash run.sh
```
## Contact
You can contact us at [email protected]. We would love to hear your questions and feedback!