AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
This repository contains PyTorch evaluation code, training code and pretrained models for AttentiveNAS.
For details, see AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling by Dilin Wang, Meng Li, Chengyue Gong, and Vikas Chandra.
If you find this project useful in your research, please consider citing:
    @article{wang2020attentivenas,
      title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
      author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
      journal={arXiv preprint arXiv:2011.09011},
      year={2020}
    }
Pretrained models and data
Download our pretrained AttentiveNAS models and a (sub-network, FLOPs) lookup table from Google Drive and put them under the folder ./attentive_nas_data.
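For example, assuming you have downloaded the files from Google Drive to your machine, you can create the expected folder with:

    mkdir -p ./attentive_nas_data

and then move the downloaded checkpoints and the (sub-network, FLOPs) lookup table into it.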
Evaluation
To evaluate our pre-trained AttentiveNAS models, from AttentiveNAS-A0 to A6, on the ImageNet validation set with a single GPU, run:

    python test_attentive_nas.py --config-file ./configs/eval_attentive_nas_models.yml --model a[0-6]
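For example, to evaluate the smallest model, AttentiveNAS-A0:

    python test_attentive_nas.py --config-file ./configs/eval_attentive_nas_models.yml --model a0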
Expected results:
Name | MFLOPs | Top-1 (%) |
---|---|---|
AttentiveNAS-A0 | 203 | 77.3 |
AttentiveNAS-A1 | 279 | 78.4 |
AttentiveNAS-A2 | 317 | 78.8 |
AttentiveNAS-A3 | 357 | 79.1 |
AttentiveNAS-A4 | 444 | 79.8 |
AttentiveNAS-A5 | 491 | 80.1 |
AttentiveNAS-A6 | 709 | 80.7 |
Training
To train our AttentiveNAS models from scratch, run:

    python train_supernet.py --config-file configs/train_attentive_nas_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}
We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU; all training hyper-parameters are specified in train_attentive_nas_models.yml.
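For example, a single-node run might look like the following sketch; the --dist-url value here is only an illustrative placeholder and should be adapted to your environment:

    python train_supernet.py --config-file configs/train_attentive_nas_models.yml --machine-rank 0 --num-machines 1 --dist-url tcp://127.0.0.1:10001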
License
The majority of AttentiveNAS is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Once For All is licensed under the Apache 2.0 license.
Contributing
We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.