PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].

Christoph Reich

Last update: Jan 2, 2023

Overview

Smooth ReLU in PyTorch

Unofficial PyTorch reimplementation of the Smooth ReLU (SmeLU) activation function proposed in the paper Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations by Gil I. Shamir and Dong Lin.

This repository includes an easy-to-use pure PyTorch implementation of the Smooth ReLU.

In case you run into performance issues with this implementation, please have a look at my Triton SmeLU implementation.
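For reference, the paper by Shamir and Lin defines the SmeLU as a piecewise function with a quadratic transition region of half-width β around zero. The block below restates that definition using the paper's symbols x and β; the code in this repository may organize the computation differently.

```latex
\[
\operatorname{SmeLU}(x) =
\begin{cases}
0, & x \le -\beta, \\
\dfrac{(x + \beta)^2}{4\beta}, & |x| \le \beta, \\
x, & x \ge \beta.
\end{cases}
\]
```

At x = -β the quadratic piece evaluates to 0 with slope 0, and at x = β it evaluates to β with slope 1, so both the function and its first derivative are continuous.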

Installation

SmeLU can be installed with pip:

pip install git+https://github.com/ChristophReich1996/SmeLU

Example Usage

SmeLU can be used like any standard nn.Module:

import torch
import torch.nn as nn
from smelu import SmeLU

network: nn.Module = nn.Sequential(
    nn.Linear(2, 2),
    SmeLU(),
    nn.Linear(2, 2)
)

output: torch.Tensor = network(torch.rand(16, 2))

For a more detailed example of how to use this implementation, please refer to the example file (requires Matplotlib to be installed).

The SmeLU takes the following parameters.

Parameter	Description	Type
beta	Beta value of the SmeLU activation function. Default 2.	float
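To illustrate what beta controls, here is a minimal scalar sketch in plain Python, assuming the piecewise definition from the paper (zero below -beta, quadratic on [-beta, beta], identity above beta). The helper smelu_scalar is hypothetical and is not part of this package, which operates on tensors.

```python
def smelu_scalar(x: float, beta: float = 2.0) -> float:
    """Scalar SmeLU following the paper's piecewise definition (illustration only)."""
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0  # flat region, like ReLU for negative inputs
    if x >= beta:
        return x  # identity region, like ReLU for positive inputs
    # Quadratic transition that joins the two regions smoothly.
    return (x + beta) ** 2 / (4.0 * beta)


# A smaller beta narrows the transition region, moving the curve closer to ReLU.
for beta in (2.0, 1.0, 0.5):
    print(beta, [round(smelu_scalar(x, beta), 4) for x in (-1.0, 0.0, 1.0)])
```

Under this definition the output at x = 0 is beta / 4, so unlike ReLU the activation is nonzero at the origin, and the offset shrinks as beta decreases.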

Reference

@article{Shamir2022,
        title={{Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations}},
        author={Shamir, Gil I and Lin, Dong},
        journal={{arXiv preprint arXiv:2202.06499}},
        year={2022}
}

Comments

How to convert activation functions to utilise SmeLU?

Dear @ChristophReich1996,

Amazing work with the implementation of SmeLU!

This is not an issue but a question about how to use the activation function.

import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):

    def __init__(self, action_dim, state_dim, hidden_dim):
        super(QNetwork, self).__init__()
        self.fc_1 = nn.Linear(state_dim, hidden_dim)
        self.fc_2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc_3 = nn.Linear(hidden_dim, action_dim)

    def forward(self, inp):
        
        x1 = F.leaky_relu(self.fc_1(inp))
        x1 = F.leaky_relu(self.fc_2(x1))
        x1 = self.fc_3(x1)

        return x1

How would one use SmeLU here? The instantiation of SmeLU is tripping me up a bit.

x1 = SmeLu(self.fc_1(inp))

Is the line above the correct way to use the function?

opened by rllyryan 1
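One possible way to adapt the network from the question is sketched below. It assumes the usage shown in the Example Usage section: SmeLU (note the capitalization) is an nn.Module, so it is instantiated once in __init__ and the resulting instance is then called on tensors in forward. Writing SmeLU(self.fc_1(inp)) would instead pass the tensor to the constructor. The fallback class is a hypothetical stand-in based on the paper's definition, included only so the sketch runs when the package is not installed.

```python
import torch
import torch.nn as nn

try:
    # Import path as shown in the Example Usage section.
    from smelu import SmeLU
except ImportError:
    # Hypothetical stand-in following the paper's piecewise definition;
    # this is not the repository's code.
    class SmeLU(nn.Module):
        def __init__(self, beta: float = 2.0) -> None:
            super().__init__()
            self.beta = beta

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            quadratic = (x + self.beta) ** 2 / (4.0 * self.beta)
            return torch.where(x.abs() <= self.beta, quadratic, torch.relu(x))


class QNetwork(nn.Module):
    def __init__(self, action_dim: int, state_dim: int, hidden_dim: int) -> None:
        super().__init__()
        self.fc_1 = nn.Linear(state_dim, hidden_dim)
        self.fc_2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc_3 = nn.Linear(hidden_dim, action_dim)
        # Instantiate the activation once; it is a module, not a function.
        self.activation = SmeLU()

    def forward(self, inp: torch.Tensor) -> torch.Tensor:
        x1 = self.activation(self.fc_1(inp))
        x1 = self.activation(self.fc_2(x1))
        return self.fc_3(x1)


# The sizes below are arbitrary values chosen for illustration.
q_network = QNetwork(action_dim=4, state_dim=8, hidden_dim=32)
q_values = q_network(torch.rand(16, 8))
```

Since the activation has no trainable parameters in this sketch, sharing one instance across both hidden layers behaves the same as creating one instance per layer.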

