
Overview

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

This repository contains PyTorch evaluation code, training code and pretrained models for AttentiveNAS.

For details see AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling by Dilin Wang, Meng Li, Chengyue Gong and Vikas Chandra.

If you find this project useful in your research, please consider citing:

@article{wang2020attentivenas,
  title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
  author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
  journal={arXiv preprint arXiv:2011.09011},
  year={2020}
}

Pretrained models and data

Download our pretrained AttentiveNAS models and a (sub-network, FLOPs) lookup table from Google Drive and put them under the folder ./attentive_nas_data.

Evaluation

To evaluate our pre-trained AttentiveNAS models, from AttentiveNAS-A0 to A6, on ImageNet val with a single GPU, run:

python test_attentive_nas.py --config-file ./configs/eval_attentive_nas_models.yml --model a[0-6]

Expected results:

Name MFLOPs Top-1 (%)
AttentiveNAS-A0 203 77.3
AttentiveNAS-A1 279 78.4
AttentiveNAS-A2 317 78.8
AttentiveNAS-A3 357 79.1
AttentiveNAS-A4 444 79.8
AttentiveNAS-A5 491 80.1
AttentiveNAS-A6 709 80.7
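
For reference, the accuracies above come from a standard ImageNet top-1 validation loop. A generic single-GPU sketch of such a loop is below; it is not the repo's exact code, and the data path, batch size, and fixed 224-pixel crop are placeholders (each AttentiveNAS model uses its own input resolution, which the config handles):

    import torch
    from torchvision import datasets, transforms

    # Standard ImageNet validation preprocessing; the crop size here is a
    # placeholder, since each AttentiveNAS model has its own resolution.
    val_transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    val_set = datasets.ImageFolder('/path/to/imagenet/val', val_transform)
    val_loader = torch.utils.data.DataLoader(val_set, batch_size=128, num_workers=8)

    @torch.no_grad()
    def validate(model):
        model.eval().cuda()
        correct = total = 0
        for images, targets in val_loader:
            logits = model(images.cuda())
            correct += (logits.argmax(dim=1).cpu() == targets).sum().item()
            total += targets.size(0)
        return 100.0 * correct / total  # top-1 accuracy in percent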

Training

To train our AttentiveNAS models from scratch, run:

python train_supernet.py --config-file configs/train_attentive_nas_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}

We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU, i.e., an effective batch size of 2048; all training hyper-parameters are specified in train_attentive_nas_models.yml.
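
For context, the supernet is trained with the sandwich rule plus attentive sampling, as described in the paper: each step always trains the largest and smallest sub-networks, and additionally trains attentively sampled intermediate ones. A simplified single-GPU sketch of one step follows; the sampler method names are illustrative stand-ins, not the repo's exact API:

    # One sandwich-rule training step (simplified; helper names are hypothetical).
    def train_step(supernet, images, targets, criterion, optimizer, num_middle=2):
        optimizer.zero_grad()

        supernet.sample_max_subnet()      # always train the largest sub-network
        criterion(supernet(images), targets).backward()

        for _ in range(num_middle):       # attentively sampled intermediate sub-networks
            supernet.sample_attentive_subnet()
            criterion(supernet(images), targets).backward()

        supernet.sample_min_subnet()      # always train the smallest sub-network
        criterion(supernet(images), targets).backward()

        optimizer.step()                  # one update over the accumulated gradients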

License

The majority of AttentiveNAS is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Once For All is licensed under the Apache 2.0 license.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Comments
  • Acc predictor

    The following is how to convert a sub-network configuration into accuracy-predictor-compatible inputs, as you provided:

        res = [cfg['resolution']]
        for k in ['width', 'depth', 'kernel_size', 'expand_ratio']:
            res += cfg[k]
        input = np.asarray(res).reshape((1, -1))

    Does the order ['resolution', 'width', 'depth', 'kernel_size', 'expand_ratio'] matter?
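
    For illustration, here is a self-contained version of that snippet with a made-up cfg (the values and list lengths below are purely hypothetical; what matters is that the feature order matches the order the predictor was trained with):

        import numpy as np

        # Hypothetical sub-network configuration; values are illustrative only.
        cfg = {
            'resolution': 224,
            'width': [16, 24, 32, 64, 112, 192, 216],
            'depth': [1, 3, 3, 3, 3, 3, 1],
            'kernel_size': [3, 5, 5, 5, 5, 5, 3],
            'expand_ratio': [1, 4, 4, 4, 4, 6, 6],
        }

        # Flatten in a fixed order: resolution, width, depth, kernel_size, expand_ratio.
        res = [cfg['resolution']]
        for k in ['width', 'depth', 'kernel_size', 'expand_ratio']:
            res += cfg[k]
        inputs = np.asarray(res).reshape((1, -1))  # one row of predictor features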

    opened by Tongzhou0101 3
  • Accuracy Predictor

    Hi, thanks for the great work! I have a question about the usage of the accuracy predictor.

    Specifically, a predictor is used to get the accuracy of sub-networks and rank them during training, as described in your paper. But in the code, I couldn't find where the predictor is used; for example, here (https://github.com/facebookresearch/AttentiveNAS/blob/88ad92f82dc343a0e7d681f1fb9a8deeb45be928/train_attentive_nas.py#L291) criterion(model(input)) is used to get the predicted accuracy instead.

    I am a little confused about this part: is there any important code I missed, or any statement I misunderstood? Looking forward to your reply :)

    opened by minghaoBD 2
  • The actual way to get best/worst pareto models

    It seems that the accuracy predictor is not fed into the sampler. Instead of using the accuracy predictor, the code obtains the Pareto models by ranking the computed losses of k models after running them forward. I am confused because the way the best/worst Pareto models are obtained during training differs from the details in the paper. Am I misunderstanding the paper, or am I missing some detail in the code?
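
    For reference, the loss-ranking described above can be sketched as follows (the helper name is hypothetical; this mirrors the mechanism, not the repo's exact code):

        import torch

        @torch.no_grad()
        def rank_candidates(supernet, images, targets, criterion, k=5):
            # Score k randomly sampled sub-networks by their loss on one mini-batch.
            scored = []
            for _ in range(k):
                cfg = supernet.sample_random_subnet()  # hypothetical sampler
                loss = criterion(supernet(images), targets).item()
                scored.append((loss, cfg))
            best_cfg = min(scored, key=lambda t: t[0])[1]   # lowest loss -> "best" candidate
            worst_cfg = max(scored, key=lambda t: t[0])[1]  # highest loss -> "worst"
            return best_cfg, worst_cfg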

    opened by RachelXu7 1
  • The supernet appears to be reinitialized during the training process

    The supernet appears to be reinitialized during the training process. I ran into this issue while running AlphaNet. The log is as follows:

    Example 1:
    [10/09 16:00:53]: Epoch: [4][ 50/312] Time 2.075 ( 2.485) Data 0.000 ( 0.273) Loss 4.9844e+00 (4.9407e+00) Acc@1 17.43 ( 16.29) Acc@5 37.01 ( 35.80)
    [10/09 16:01:15]: Epoch: [4][ 60/312] Time 2.258 ( 2.431) Data 0.000 ( 0.228) Loss 4.9118e+00 (4.9424e+00) Acc@1 15.50 ( 16.19) Acc@5 34.94 ( 35.68)
    [10/09 16:01:37]: Epoch: [4][ 70/312] Time 2.368 ( 2.400) Data 0.000 ( 0.196) Loss 6.8941e+00 (5.1301e+00) Acc@1 0.10 ( 14.50) Acc@5 0.81 ( 32.05)
    [10/09 16:01:59]: Epoch: [4][ 80/312] Time 1.940 ( 2.374) Data 0.000 ( 0.172) Loss 6.8695e+00 (5.3466e+00) Acc@1 0.10 ( 12.73) Acc@5 0.76 ( 28.20)

    Example 2:
    [10/11 08:46:30]: Epoch: [169][170/312] Time 2.279 ( 2.272) Data 0.000 ( 0.082) Loss 3.7633e+00 (3.6145e+00) Acc@1 41.94 ( 43.52) Acc@5 64.28 ( 67.07)
    [10/11 08:46:53]: Epoch: [169][180/312] Time 2.159 ( 2.270) Data 0.000 ( 0.077) Loss 3.7879e+00 (3.6247e+00) Acc@1 39.58 ( 43.30) Acc@5 63.65 ( 66.86)
    [10/11 08:47:15]: Epoch: [169][190/312] Time 2.206 ( 2.266) Data 0.000 ( 0.073) Loss 6.7652e+00 (3.6773e+00) Acc@1 0.22 ( 42.50) Acc@5 0.68 ( 65.76)
    [10/11 08:47:37]: Epoch: [169][200/312] Time 2.339 ( 2.262) Data 0.000 ( 0.069) Loss 6.8340e+00 (3.8188e+00) Acc@1 0.07 ( 40.39) Acc@5 0.44 ( 62.51)

    After re-initialization, the supernet gradually fits again if training continues. Is it because of the sandwich rule?

    opened by liwei9719 1
  • Codes about evolutionary search on the ImageNet validation set

    It seems that the code for the evolutionary search is not included in this repo; will you open-source it?

    Also, the code for generating the lookup table is not included; it would be useful to have it :)

    opened by chenbohua3 1
  • Question about ImageNet dataset

    Could you please clarify the exact ImageNet dataset you used for training and testing? Is it the original ILSVRC2012 from https://image-net.org/challenges/LSVRC/2012/2012-downloads.php, or from somewhere else? Did you use the train and validation splits, or the test split as well? How many images were in the train and validation sets?

    Also, did you apply any preprocessing, e.g. resizing, or did you use the raw images?

    opened by marianpetruk 1
  • Pretrained AttentiveNAS models I downloaded are corrupt

    The pretrained AttentiveNAS models I downloaded from https://drive.google.com/file/d/1cCla-OQNIAn-rjsY2b832DuP59ZKr8uh/view?usp=sharing are corrupt. I don't know if I'm doing something wrong. Thanks.

    opened by yangyang90 2