ELimNet
ELimNet: Eliminating Layers in a Neural Network Pretrained with Large Dataset for Downstream Task
- Removed top layers from pretrained EfficientNet B0 and ResNet18 to construct lightweight CNN models with fewer than 1M parameters (see the sketch below).
- Assessed on the Trash Annotations in Context (TACO) dataset, sampled to 6 classes with 20,851 images.
- Compared performance with lightweight models generated by Optuna's Neural Architecture Search (NAS) built from the same convolutional blocks.
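Below is a minimal sketch of the elimination idea, assuming a torchvision backbone. It is not the repository's YAML-driven model builder; the block indexing, 224x224 input, and 6-class head are illustrative assumptions.

```python
# Sketch: keep the bottom blocks of a pretrained backbone, drop the top ones,
# and attach a small classification head (illustrative, not the repo's builder).
import torch
import torch.nn as nn
from torchvision import models

def build_elim_efficientnet(num_removed_blocks: int = 3, num_classes: int = 6) -> nn.Module:
    backbone = models.efficientnet_b0(weights="IMAGENET1K_V1")
    # efficientnet_b0.features is a Sequential: stem conv, 7 MBConv stages, final 1x1 conv.
    # Drop the final 1x1 conv plus the top `num_removed_blocks` MBConv stages.
    kept = nn.Sequential(*list(backbone.features.children())[:-(num_removed_blocks + 1)])

    # Infer the channel width of the truncated feature extractor with a dummy forward pass.
    with torch.no_grad():
        out_channels = kept(torch.zeros(1, 3, 224, 224)).shape[1]

    return nn.Sequential(
        kept,
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(out_channels, num_classes),
    )

model = build_elim_efficientnet(num_removed_blocks=3)
print(sum(p.numel() for p in model.parameters()))  # parameter count of the truncated model
```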
Quickstart
Installation
# clone the repository
git clone https://github.com/snoop2head/elimnet
# fetch the image dataset and unzip
wget -cq https://aistages-prod-server-public.s3.amazonaws.com/app/Competitions/000081/data/data.zip
unzip ./data.zip -d ./
Train
# finetune on the dataset with the pretrained model
python train.py --model ./model/efficientnet_b0.yaml
# finetune on the dataset with ELimNet
python train.py --model ./model/efficientnet_b0_elim_3.yaml
Inference
# run inference with the most recently trained model
python inference.py --model_dir ./exp/latest/
Performance
Performance is compared with (1) the original pretrained models and (2) models constructed by Optuna NAS with no pretrained weights.
- Indicates that pretrained CNN models with their top convolutional layers eliminated outperform Optuna NAS models built from the same convolutional blocks without pretrained weights.
- Suggests that eliminating top convolutional layers yields lightweight models with classification performance similar to (or better than) the original pretrained models.
- Reduces parameters to roughly 7% of the original count while maintaining (or improving) performance, and saves 20% or more of inference time by eliminating top convolutional layers (see the check below).
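As a quick sanity check, the headline ratios can be recomputed from the figures in the tables below:

```python
# Quick check of the headline claims using numbers copied from the tables below.
print(0.30 / 4.0)        # EfficientNet B0 Elim 3 vs. pretrained B0 parameters -> 0.075 (~7%)
print(0.68 / 11.17)      # Resnet18 Elim 2 vs. pretrained Resnet18 parameters  -> ~0.06 (~6%)
print(1 - 83.4 / 105.7)  # Elim 2 test inference time reduction                -> ~0.21 (21%)
print(1 - 73.5 / 105.7)  # Elim 3 test inference time reduction                -> ~0.30 (30%)
```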
ELimNet vs Pretrained Models (Train)
[100 epochs] | # of Parameters | # of Layers | Train | Validation | Test F1 |
---|---|---|---|---|---|
Pretrained EfficientNet B0 | 4.0M | 352 | Loss: 0.43, Acc: 81.23%, F1: 0.84 | Loss: 0.469, Acc: 82.17%, F1: 0.76 | 0.7493 |
EfficientNet B0 Elim 2 | 0.9M | 245 | Loss: 0.652, Acc: 87.22%, F1: 0.84 | Loss: 0.622, Acc: 87.22%, F1: 0.77 | 0.7603 |
EfficientNet B0 Elim 3 | 0.30M | 181 | Loss: 0.602, Acc: 78.17%, F1: 0.74 | Loss: 0.661, Acc: 77.41%, F1: 0.74 | 0.7349 |
Resnet18 | 11.17M | 69 | Loss: 0.578, Acc: 78.90%, F1: 0.76 | Loss: 0.700, Acc: 76.17%, F1: 0.719 | - |
Resnet18 Elim 2 | 0.68M | 37 | Loss: 0.447, Acc: 83.73%, F1: 0.71 | Loss: 0.712, Acc: 75.42%, F1: 0.71 | - |
ELimNet vs Pretrained Models (Inference)
Model | # of Parameters | # of Layers | CPU time (sec) | CUDA time (sec) | Test Inference Time (sec) |
---|---|---|---|---|---|
Pretrained EfficientNet B0 | 4.0M | 352 | 3.9s | 4.0s | 105.7s |
EfficientNet B0 Elim 2 | 0.9M | 245 | 4.1s | 13.0s | 83.4s |
EfficientNet B0 Elim 3 | 0.30M | 181 | 3.0s | 9.0s | 73.5s |
Resnet18 | 11.17M | 69 | - | - | - |
Resnet18 Elim 2 | 0.68M | 37 | - | - | - |
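The CPU and CUDA timings above can be reproduced in spirit with PyTorch's profiler; the sketch below is not the repository's inference.py, and the batch size, input resolution, and iteration count are assumptions.

```python
# Rough CPU/CUDA timing of a forward pass with torch.profiler (requires a CUDA device).
import torch
from torchvision import models
from torch.profiler import profile, ProfilerActivity

model = models.efficientnet_b0(weights="IMAGENET1K_V1").eval().cuda()
inputs = torch.randn(16, 3, 224, 224, device="cuda")  # assumed batch size and resolution

with torch.no_grad(), profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    for _ in range(100):
        model(inputs)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```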
ELimNet vs Empty Optuna NAS Models (Train)
[100 epochs] | # of Parameters | # of Layers | Train | Validation | Test F1 |
---|---|---|---|---|---|
Empty MobileNet V3 | 4.2M | 227 | Loss: 0.925, Acc: 65.18%, F1: 0.58 | Loss: 0.993, Acc: 62.83%, F1: 0.56 | - |
Empty EfficientNet B0 | 1.3M | 352 | Loss: 0.867, Acc: 67.28%, F1: 0.61 | Loss: 0.898, Acc: 66.80%, F1: 0.61 | 0.6337 |
Empty DWConv & InvertedResidualv3 NAS | 0.08M | 66 | - | Loss: 0.766, Acc: 71.71%, F1: 0.68 | 0.6740 |
Empty MBConv NAS | 0.33M | 141 | Loss: 0.786, Acc: 70.72%, F1: 0.66 | Loss: 0.866, Acc: 68.09%, F1: 0.62 | 0.6245 |
Resnet18 Elim 2 | 0.68M | 37 | Loss: 0.447, Acc: 83.73%, F1: 0.71 | Loss: 0.712, Acc: 75.42%, F1: 0.71 | - |
EfficientNet B0 Elim 3 | 0.30M | 181 | Loss: 0.602, Acc: 78.17%, F1: 0.74 | Loss: 0.661, Acc: 77.41%, F1: 0.74 | 0.7603 |
ELimNet vs Empty Optuna NAS Models (Inference)
Model | # of Parameters | # of Layers | CPU time (sec) | CUDA time (sec) | Test Inference Time (sec) |
---|---|---|---|---|---|
Empty MobileNet V3 | 4.2M | 227 | 4 | 13 | - |
Empty EfficientNet B0 | 1.3M | 352 | 3.780 | 3.782 | 68.4s |
Empty DWConv & InvertedResidualv3 NAS | 0.08M | 66 | 1 | 3.5 | 61.1s
Empty MBConv NAS | 0.33M | 141 | 2.14 | 7.201 | 67.1s |
Resnet18 Elim 2 | 0.68M | 37 | - | - | - |
EfficientNet B0 Elim 3 | 0.30M | 181 | 3.0s | 9s | 73.5s |
Background & WiP
Background
- NLP tasks are usually downstream tasks of finetuning large pretrained transformer models (e.g. BERT, RoBERTa, XLNet).
- Removing the top transformer layers can yield a 40% reduction in model size while preserving up to 98.2% of the original performance.
- Likewise, this project tests removing top convolutional layers from pretrained models for the image classification task.
Work in Progress
- Will test replacing pretrained convolutional blocks with a single convolutional layer trained from scratch.
- Will add ResNet18's inference time data and compare it against Optuna's NAS-constructed lightweight models.
- Will test elimination-based lightweight architecture search on torchvision's pretrained MobileNetV3 and MnasNet.
- Will be applied to other small datasets such as Fashion-MNIST and the PlantVillage dataset.
Others
- "Empty" stands for model with no pretrained weights.
- "EfficientNet B0 Elim 2" means 2 convolutional blocks have been eliminated from pretrained EfficientNet B0. Number next to "Elim" annotates how many convolutional blocks have been removed.
- Table's performance illustrates best performance out of 100 epochs of finetuning on TACO Dataset.
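For illustration, one possible "Resnet18 Elim 2" construction, assuming the two eliminated blocks are the top residual stages (layer3, layer4) of torchvision's pretrained resnet18; this sketch is consistent with the ~0.68M parameter count above but is not necessarily the exact architecture used.

```python
# Sketch of "Resnet18 Elim 2": drop the top residual stages and attach a 6-class head.
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights="IMAGENET1K_V1")
resnet18_elim2 = nn.Sequential(
    backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
    backbone.layer1, backbone.layer2,   # keep the bottom two residual stages
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 6),                  # layer2 outputs 128 channels; 6 TACO classes
)
print(sum(p.numel() for p in resnet18_elim2.parameters()))  # roughly 0.68M parameters
```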