CVPR 2021 | Activate or Not: Learning Customized Activation
This repository contains the official PyTorch implementation of the paper Activate or Not: Learning Customized Activation, CVPR 2021.
ACON
We propose a novel activation function, which we term ACON, that explicitly learns to activate the neurons or not. Below we show the ACON activation function and its first derivatives. β controls how fast the first derivative asymptotes to the upper/lower bounds, which are determined by p1 and p2.
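For concreteness, here is a minimal PyTorch sketch of ACON-C, the variant defined as (p1 − p2) · x · σ(β(p1 − p2)x) + p2 · x with learnable per-channel p1, p2, and β; refer to the official implementation in this repository for the exact version:

```python
import torch
import torch.nn as nn

class AconC(nn.Module):
    """ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x,
    with learnable per-channel parameters p1, p2, and beta."""
    def __init__(self, channels):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        dpx = (self.p1 - self.p2) * x
        # sigmoid(beta * dpx) smoothly switches between the p1*x and p2*x branches
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x
```

Note that ACON-C recovers Swish when p1 = 1 and p2 = 0, and approaches ReLU in the same setting as β → ∞.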
Training curves
We show the training curves of different activations here.
TFNet
To show the effectiveness of the proposed ACON family, we also provide an extremely simple toy funnel network (TFNet) built only from pointwise convolutions and ACON-FReLU operators.
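As a rough sketch of the idea (not necessarily the exact official formulation), ACON-FReLU can be viewed as the smooth ACON counterpart of FReLU's max(x, T(x)), where T is a depthwise convolution followed by batch normalization:

```python
import torch
import torch.nn as nn

class AconFReLU(nn.Module):
    """Sketch of ACON-FReLU: the smooth (ACON) version of FReLU's
    max(x, T(x)), where T is a depthwise conv + BN spatial condition.
    Details (e.g., how beta is handled) may differ from the official code."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.dwconv = nn.Conv2d(channels, channels, kernel_size,
                                padding=kernel_size // 2,
                                groups=channels, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.beta = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        t = self.bn(self.dwconv(x))  # spatial condition branch
        d = x - t
        # smooth maximum of x and t: (x - t) * sigmoid(beta * (x - t)) + t
        return d * torch.sigmoid(self.beta * d) + t
```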
Main results
The following results are ImageNet top-1 accuracy improvements relative to the ReLU baselines. The relative improvements of meta-ACON are roughly twice those of SENet.
First, we compare ReLU, Swish, and ACON-C, showing improvements without any additional FLOPs or parameters:
Model | FLOPs | #Params. | top-1 err. (ReLU) | top-1 err. (Swish) | top-1 err. (ACON) |
---|---|---|---|---|---|
ShuffleNetV2 0.5x | 41M | 1.4M | 39.4 | 38.3 (+1.1) | 37.0 (+2.4) |
ShuffleNetV2 1.5x | 299M | 3.5M | 27.4 | 26.8 (+0.6) | 26.5 (+0.9) |
ResNet 50 | 3.9G | 25.5M | 24.0 | 23.5 (+0.5) | 23.2 (+0.8) |
ResNet 101 | 7.6G | 44.4M | 22.8 | 22.7 (+0.1) | 21.8 (+1.0) |
ResNet 152 | 11.3G | 60.0M | 22.3 | 22.2 (+0.1) | 21.2 (+1.1) |
Next, by adding a negligible amount of FLOPs and parameters, meta-ACON shows significant improvements (a sketch of the meta-ACON module follows the table):
Model | FLOPs | #Params. | top-1 err. |
---|---|---|---|
ShuffleNetV2 0.5x (meta-acon) | 41M | 1.7M | 34.8 (+4.6) |
ShuffleNetV2 1.5x (meta-acon) | 299M | 3.9M | 24.7 (+2.7) |
ResNet 50 (meta-acon) | 3.9G | 25.7M | 22.0 (+2.0) |
ResNet 101 (meta-acon) | 7.6G | 44.8M | 21.0 (+1.8) |
ResNet 152 (meta-acon) | 11.3G | 60.5M | 20.5 (+1.8) |
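meta-ACON learns the switching factor β from the input with a small per-channel module (similar in spirit to SE) instead of treating it as a free parameter. Below is a minimal sketch, assuming the β-generator is a bottleneck of two 1×1 convolutions over globally pooled features; the official implementation may differ in details:

```python
import torch
import torch.nn as nn

class MetaAconC(nn.Module):
    """meta-ACON sketch: beta is predicted per channel from the input
    by a small SE-like bottleneck, then plugged into the ACON-C formula."""
    def __init__(self, channels, r=16):
        super().__init__()
        hidden = max(r, channels // r)
        self.fc1 = nn.Conv2d(channels, hidden, 1)
        self.fc2 = nn.Conv2d(hidden, channels, 1)
        self.p1 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, channels, 1, 1))

    def forward(self, x):
        # global average pool -> bottleneck -> per-channel beta in (0, 1)
        y = x.mean(dim=(2, 3), keepdim=True)
        beta = torch.sigmoid(self.fc2(self.fc1(y)))
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(beta * dpx) + self.p2 * x
```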
Even without SE modules, the simple TFNet outperforms state-of-the-art lightweight networks that also omit SE modules.
Model | FLOPs | #Params. | top-1 err. |
---|---|---|---|
MobileNetV2 0.17 | 42M | 1.4M | 52.6 |
ShuffleNetV2 0.5x | 41M | 1.4M | 39.4 |
TFNet 0.5 | 43M | 1.3M | 36.6 (+2.8) |
MobileNetV2 0.6 | 141M | 2.2M | 33.3 |
ShuffleNetV2 1.0x | 146M | 2.3M | 30.6 |
TFNet 1.0 | 135M | 1.9M | 29.7 (+0.9) |
MobileNetV2 1.0 | 300M | 3.4M | 28.0 |
ShuffleNetV2 1.5x | 299M | 3.5M | 27.4 |
TFNet 1.5 | 279M | 2.7M | 26.0 (+1.4) |
MobileNetV2 1.4 | 585M | 5.5M | 25.3 |
ShuffleNetV2 2.0x | 591M | 7.4M | 25.0 |
TFNet 2.0 | 474M | 3.8M | 24.3 (+0.7) |
Trained Models
Usage
Requirements
Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
Train:
python train.py --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH
Eval:
python train.py --eval --eval-resume YOUR_WEIGHT_PATH --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH
Citation
If you use these models in your research, please cite:
@inproceedings{ma2021activate,
  title={Activate or Not: Learning Customized Activation},
  author={Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}