Implements Gradient Centralization in TensorFlow and makes it available as a Python package

Overview


This Python package implements Gradient Centralization in TensorFlow, a simple and effective optimization technique for deep neural networks proposed by Yong et al. in the paper Gradient Centralization: A New Optimization Technique for Deep Neural Networks. It can both speed up the training process and improve the final generalization performance of DNNs.
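Gradient Centralization operates directly on the gradients: for each weight tensor of rank greater than one, it subtracts the mean of the gradient taken over all axes except the last, so the centralized gradient has zero mean. As a minimal sketch of this core operation in plain TensorFlow (illustrative only, not the package's exact code):

import tensorflow as tf

def centralize_gradient(grad):
    # Centralize gradients of rank > 1 tensors (e.g. dense or conv
    # kernels) by subtracting the mean over all axes except the last.
    # Rank-1 tensors (biases) are left unchanged, as in the paper.
    if len(grad.shape) > 1:
        axes = list(range(len(grad.shape) - 1))
        grad -= tf.reduce_mean(grad, axis=axes, keepdims=True)
    return grad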

Installation

Run the following to install:

pip install gradient-centralization-tf
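You can verify the installation by importing the package; the two documented entry points described below should then be available:

import gctf  # should import without error

gctf.centralized_gradients_for_optimizer
gctf.optimizers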

Usage

gctf.centralized_gradients_for_optimizer

Creates a centralized-gradients function for a specified optimizer.

Arguments:

  • optimizer: a tf.keras.optimizers.Optimizer object. The optimizer you are using.

Example:

>>> opt = tf.keras.optimizers.Adam(learning_rate=0.1)
>>> opt.get_gradients = gctf.centralized_gradients_for_optimizer(opt)
>>> model.compile(optimizer=opt, ...)

gctf.get_centralized_gradients

Computes the centralized gradients.

This function is not meant to be used directly unless you are building a custom optimizer, in which case you can point get_gradients to this function. It is a modified version of tf.keras.optimizers.Optimizer.get_gradients.

Arguments:

  • optimizer: a tf.keras.optimizers.Optimizer object. The optimizer you are using.
  • loss: Scalar tensor to minimize.
  • params: List of variables.

Returns:

A list of centralized gradients.
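As an illustration, a hypothetical custom optimizer could delegate its gradient computation to this function. This is a sketch, assuming a Keras optimizer that still exposes the legacy get_gradients method (as tf.keras optimizers did when this package was written); the class name is made up for the example:

import tensorflow as tf
import gctf

class CentralizedSGD(tf.keras.optimizers.SGD):
    # Hypothetical optimizer: route gradient computation through gctf
    # so every gradient is centralized before being applied.
    def get_gradients(self, loss, params):
        return gctf.get_centralized_gradients(self, loss, params)

opt = CentralizedSGD(learning_rate=0.01)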

gctf.optimizers

Pre-built optimizers updated to implement GC.

This module is specially built for quickly testing out GC; under the hood it simply applies gctf.centralized_gradients_for_optimizer to the corresponding tf.keras optimizer, so in most cases you would use gctf.centralized_gradients_for_optimizer directly. All optimizers in tf.keras.optimizers are available here, updated for GC.

Example:

>>> model.compile(optimizer = gctf.optimizers.adam(learning_rate = 0.01), ...)
>>> model.compile(optimizer = gctf.optimizers.rmsprop(learning_rate = 0.01, rho = 0.91), ...)
>>> model.compile(optimizer = gctf.optimizers.sgd(), ...)

Returns:

A tf.keras.optimizers.Optimizer object.
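For a complete picture, a minimal end-to-end training sketch on MNIST could look like the following; only gctf.optimizers.adam comes from this package, everything else is standard tf.keras:

import tensorflow as tf
import gctf

# Load and normalize MNIST.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

# A small fully connected classifier.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
])

# The only gctf-specific line: a GC-enabled Adam optimizer.
model.compile(optimizer=gctf.optimizers.adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)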

Developing gctf

To install gradient-centralization-tf, along with the tools you need to develop and test it, run the following in your virtualenv:

git clone [email protected]:Rishit-dagli/Gradient-Centralization-TensorFlow
# or clone your own fork
cd Gradient-Centralization-TensorFlow

pip install -e .[dev]
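Assuming the dev extras pull in a test runner such as pytest (an assumption; check setup.py for the actual dev dependencies), the test suite can then be run from the repository root:

python -m pytest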

License

Copyright 2020 Rishit Dagli

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Comments
  • On Windows with TensorFlow 2.5 it gives an error

    On Windows 10 with a Miniconda environment, TensorFlow 2.5 gives an error in the centralized_gradients.py file.

    The solution is to change import keras.backend as K to import tensorflow.keras.backend as K.
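    As a before/after sketch, the import at the top of centralized_gradients.py becomes:

    # Before: can fail when the standalone keras package does not match
    # the installed TensorFlow version.
    # import keras.backend as K

    # After: always resolves to the backend bundled with TensorFlow.
    import tensorflow.keras.backend as K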

    bug 
    opened by mgezer 5
  • The results in the mnist example are wrong/misleading

    Describe the bug: The results in your Colab notebook are misleading: https://colab.research.google.com/github/Rishit-dagli/Gradient-Centralization-TensorFlow/blob/main/examples/gctf_mnist.ipynb

    In this example, the model is first trained with a normal Adam optimizer:

    model.compile(optimizer = tf.keras.optimizers.Adam(),
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['accuracy'])
    
    history_no_gctf = model.fit(training_images, training_labels, epochs=5, callbacks = [time_callback_no_gctf])
    

    Afterwards, the same model is recompiled with gctf.optimizers.adam(). However, recompiling a Keras model does not reset its weights. This means that the first fit call trains the model, and the second fit call with the new optimizer continues from the same trained model, so of course the results are then better.

    This can be fixed by recreating the model for the second run, by just adding these few lines:

    import gctf #import gctf
    
    time_callback_gctf = TimeHistory()
    
    # Model architecture
    model = tf.keras.models.Sequential([
                                        tf.keras.layers.Flatten(), 
                                        tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(64, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(64, activation=tf.nn.relu), 
                                        tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
    
    model.compile(optimizer = gctf.optimizers.adam(),
                  loss = 'sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    history_gctf = model.fit(training_images, training_labels, epochs=5, callbacks=[time_callback_gctf])
    

    However, then the results are not better than without gctf:

    Type                   Execution time    Accuracy      Loss
    -------------------  ----------------  ----------  --------
    Model without gctf:           24.7659    0.88825   0.305801
    Model with gctf               24.7881    0.889567  0.30812
    

    Could you please clarify what happens here? I tried the gctf.optimizers.adam() optimizer in my own research and it didn't change the results at all, and now I see it doesn't work in the example constructed here either. This makes me question the results of the paper.

    To reproduce: Execute the Colab file given in the repository: https://colab.research.google.com/github/Rishit-dagli/Gradient-Centralization-TensorFlow/blob/main/examples/gctf_mnist.ipynb

    Expected behavior: The right comparison would be for both models to start from a random initialization, rather than letting the second model start with the already pre-trained weights.

    Looking forward to a swift explanation.

    Best, Max

    question 
    opened by themasterlink 2
  • Wider dependency requirements

    The package as of now requires tensorflow ~= 2.4.0 and keras ~= 2.4.0 to be installed. It turns out that this is sometimes problematic for folks who have custom installations of TensorFlow, and a wider requirement could be set up.

    enhancement 
    opened by Rishit-dagli 1
  • Release 0.0.3

    This release includes some fixes and improvements.

    ✅ Bug Fixes / Improvements

    • Allow wider versions for TensorFlow and Keras while installing the package (#14)
    • Fix the incorrect usage example in the docstrings and description for centralized_gradients_for_optimizer (#13)
    • Add clear aims for each of the examples of using gctf (#15)
    • Update PyPI classifiers to clearly show the aims of this project; this changes nothing about how you use this package (#18)
    • Add clear instructions for using this with custom optimizers, i.e. directly use get_centralized_gradients; a complete example has not been pushed for the reasons mentioned in the issue (#16)
    opened by Rishit-dagli 0
  • Add an "About The Examples" section

    Add an "About The Examples" section which contains a summary of the usage example notebooks and links to run them on Binder and Colab.


    Close #15

    opened by Rishit-dagli 0
  • Update relevant PyPI classifiers

    Add PyPI classifiers for:

    • Development status
    • Intended Audience
    • Topic

    Also add the Programming Language :: Python :: 3 :: Only classifier.


    Closes #18

    opened by Rishit-dagli 0
  • Update PyPI classifiers

    I am specifically thinking of adding three more categories of PyPI classifiers:

    • Development status
    • Intended Audience
    • Topic

    Apart from this, I also think it would be great to add Programming Language :: Python :: 3 :: Only to make it clear that this package is intended for Python 3 only.

    opened by Rishit-dagli 0
  • Add an "About the examples" section

    It would be great to write an "About the examples" section which demonstrates in short what the example notebooks aim to achieve and show.

    documentation 
    opened by Rishit-dagli 0
  • Error in usage example for gctf.centralized_gradients_for_optimizer

    I noticed that the docstrings for gctf.centralized_gradients_for_optimizer have an error in the example usage section. The example creates an Adam optimizer instance and saves it to opt; however, centralized_gradients_for_optimizer is applied to optimizer, which does not exist, so running the example would result in an error.

    documentation 
    opened by Rishit-dagli 0
  • [ImgBot] Optimize images

    opened by imgbot[bot] 0
  • [ImgBot] Optimize images

    opened by imgbot[bot] 0
Releases(v0.0.3)
  • v0.0.3(Mar 11, 2021)

    This release includes some fixes and improvements.

    ✅ Bug Fixes / Improvements

    • Allow wider versions for TensorFlow and Keras while installing the package (#14)
    • Fix the incorrect usage example in the docstrings and description for centralized_gradients_for_optimizer (#13)
    • Add clear aims for each of the examples of using gctf (#15)
    • Update PyPI classifiers to clearly show the aims of this project; this changes nothing about how you use this package (#18)
    • Add clear instructions for using this with custom optimizers, i.e. directly use get_centralized_gradients; a complete example has not been pushed for the reasons mentioned in the issue (#16)
    Source code(tar.gz)
    Source code(zip)
  • v0.0.2(Feb 21, 2021)

    This release includes some fixes and improvements.

    ✅ Bug Fixes / Improvements

    • Fix the issue of supporting multiple modules
    • Fix multiple typos
    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Feb 20, 2021)

Owner
Rishit Dagli
High School, Ted-X, Ted-Ed speaker|Mentor, TFUG Mumbai|International Speaker|Microsoft Student Ambassador|#ExploreML Facilitator