
Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021)

Authors: Boris Knyazev, Michal Drozdzal, Graham Taylor, Adriana Romero-Soriano

[Figures: method overview and results of parameter prediction on ResNet-50]

This repository contains the code to train and evaluate Graph HyperNetworks (GHNs), as well as the DeepNets-1M dataset of neural architectures proposed in our paper to train and evaluate GHNs. Our improved GHNs trained on DeepNets-1M can predict parameters for diverse networks, even ones very different from those seen during training (e.g. ResNet-50). Parameter prediction by GHNs is performed in a single forward pass and on average takes less than 1 second on either a GPU or a CPU!


Requirements and installation

The main requirements are:

  • Python 3.6+
  • PyTorch 1.9+
  • NetworkX

For graph visualizations we use pygraphviz. For it to work, graphviz may need to be installed first: sudo apt-get install graphviz graphviz-dev.

To install the ppuda package:

pip install .  # use pip install -e . to install an editable version

To install all packages required for every file:

pip install -r requirements.txt

It's also possible to use conda to install this by running:

conda create --name ppuda --file requirements.txt --channel default --channel anaconda --channel conda-forge --channel pytorch

Available GHNs

We release five GHNs. Below are top-1 accuracies on CIFAR-10 and top-5 accuracies on ImageNet obtained with the parameters predicted by each of these GHNs.

| Model | ResNet-50* | Best Architecture (index†) |
|---|:---:|:---:|
| MLP-CIFAR-10 | 17.7 | 60.2 (167) |
| GHN-1-CIFAR-10 | 19.2 | 59.9 (179) |
| GHN-2-CIFAR-10 | 58.6 | 77.1 (210) |
| GHN-1-ImageNet | 6.9 | 32.1 (295) |
| GHN-2-ImageNet | 5.3 | 48.3 (85) |

* ResNet-50 is an unseen architecture (i.e. the trained GHNs never observed it, or even similar architectures, during training). Our GHNs can still predict good parameters for such unseen architectures. On ImageNet, even though the accuracy is low, the predicted parameters are very useful for fine-tuning.

† Denotes the architecture index in the test split of DeepNets-1M.

Each GHN checkpoint takes just a few megabytes and is stored in the checkpoints folder of this repository.

Minimal example: predict parameters for ResNet-50

ImageNet:

from ppuda.ghn.nn import GHN2
import torchvision.models as models

ghn = GHN2('imagenet')      # load our GHN-2 trained on ImageNet
model = models.resnet50()   # ResNet-50 or any other torchvision model
model = ghn(model)          # predict parameters in < 1 second on GPU/CPU

# That's it! The model can now be evaluated on ImageNet to obtain top5=5.3%.

CIFAR-10:

from ppuda.ghn.nn import GHN2
import torchvision.models as models

# On CIFAR-10, we have an additional step of adjusting 
# the first layer(s) of the network for a 32x32 image size,
# since torchvision models expect a 224x224 input, 
# while GHNs on CIFAR-10 were trained on 32x32 inputs.

from ppuda.utils import adjust_net

ghn = GHN2('cifar10')                    # load our GHN-2 trained on CIFAR-10
model = models.resnet50(num_classes=10)  # ResNet-50 
model = adjust_net(model)                # adjust to a 32x32 input
model = ghn(model)                       # predict parameters in < 1 second on GPU/CPU

# That's it! The model can now be evaluated on CIFAR-10 to obtain top1=58.6%.

A full example for ResNet-50 and other torchvision models can be found in examples/torch_models.py and examples/all_torch_models.py. See further examples in the examples folder.

Note 1: For networks with batch norm, the running statistics of the batch-norm layers are not predicted (these statistics are not trainable parameters). To evaluate such networks, our code therefore computes batch statistics on the evaluation set with batch size = 64. Networks without batch norm (e.g. in our BN-Free split) obtain the same accuracy regardless of the batch size.
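
For reference, below is a minimal sketch of recomputing these statistics before evaluation (this mirrors, but is not, the exact code of our evaluation scripts; loader stands for any PyTorch DataLoader over the evaluation images):

import torch

def recompute_bn_stats(model, loader, num_batches=50):
    # Reset the running statistics of all BatchNorm layers.
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
    # Forward passes in train mode update the running mean/var;
    # no_grad avoids building the autograd graph.
    model.train()
    with torch.no_grad():
        for i, (images, _) in enumerate(loader):
            model(images)
            if i + 1 >= num_batches:
                break
    model.eval()  # evaluate with the freshly computed statistics
    return model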

Note 2: To evaluate/train on ImageNet, follow the data instructions below on how to prepare the ImageNet dataset.

Data

DeepNets-1M

To train or evaluate on DeepNets-1M, first download the dataset file by running ./data/download.sh.

To generate a new DeepNets-1M dataset, the following command can be used:

python experiments/net_generator.py train 1000000 ./data

The dataset generated by this command should closely match our training dataset. Other splits can be regenerated by specifying the split as the first argument.
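
To sanity-check a downloaded or regenerated dataset, the files (see the data layout in the ImageNet section below) can be inspected directly. A small sketch is shown below; the internal structure of the files is not documented here, so this only prints top-level information (h5py is required):

import json
import h5py

# File names follow the ./data layout shown in the ImageNet section below.
with h5py.File('./data/deepnets1m_train.hdf5', 'r') as f:
    print('top-level groups:', list(f.keys()))

with open('./data/deepnets1m_train_meta.json') as f:
    meta = json.load(f)
print('number of meta entries:', len(meta))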

CIFAR-10

CIFAR-10 is downloaded automatically and is saved in the --data_dir folder (default is ./data).

ImageNet

We implemented a simple wrapper around the torchvision.datasets.ImageNet implementation. The ImageNet root folder imagenet is expected to be in the ./data folder by default, with the following structure:

./data
│   imagenet
│   │   train
│   │   │   n01440764
│   │   │   n01443537
│   │   │   ...
│   │   val
│   │   │   n01440764
│   │   │   n01443537
│   │   │   ...
│   │   ILSVRC2012_devkit_t12.tar.gz
│   deepnets1m_train.hdf5
│   deepnets1m_train_meta.json
│   ...

Both imagenet/train and imagenet/val must contain separate folders for each class. Follow the official instructions on how to obtain ImageNet (ILSVRC 2012) data.
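
With this layout in place, the underlying torchvision dataset can also be instantiated directly. A minimal sketch is shown below (our code wraps this class; the transforms here are illustrative only):

import torchvision.transforms as T
from torchvision.datasets import ImageNet

# Standard ImageNet evaluation preprocessing (illustrative choice).
transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
val_set = ImageNet(root='./data/imagenet', split='val', transform=transform)
print('validation images:', len(val_set))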

Reproducing main results

The arguments of our scripts are described in config.py. The default hyperparameters are based on our paper. Below are examples of running the scripts and overriding the default hyperparameters.

DeepNets-1M results

Training GHN

  • GHN-1 on CIFAR-10: python experiments/train_ghn.py --name ghn1

  • GHN-2 on CIFAR-10: python experiments/train_ghn.py -m 8 -n -v 50 --ln --name ghn2

  • MLP on CIFAR-10: python experiments/train_ghn.py -m 8 -n -v 50 --ln -H mlp --name mlp

where:

  • -m 8 sets the meta batch size to 8;
  • -n normalizes the predicted parameters;
  • -v 50 adds virtual edges to graphs, with 50 as the maximum shortest path length;
  • --ln adds layer normalization before decoding the parameters;
  • --name ghn2 sets the directory name for saving trained checkpoints (combined with --save_dir to obtain the full path);
  • -H mlp uses an MLP instead of a GatedGNN.

To train on ImageNet, use -d imagenet. To train GHNs on multiple GPUs (e.g. with a large meta batch size), add --multigpu to use all available CUDA devices (make sure to set CUDA_VISIBLE_DEVICES appropriately).

For example, to train GHN-2 on ImageNet with 4 GPUs: export CUDA_VISIBLE_DEVICES=0,1,2,3; python experiments/train_ghn.py -m 8 -n -v 50 --ln --name ghn2_imagenet -d imagenet --multigpu

Evaluating GHNs

  • Evaluate GHN-2 on CIFAR-10 on all architectures of $split from DeepNets-1M: python experiments/eval_ghn.py --ckpt ./checkpoints/ghn2_cifar10.pt -d cifar10 --split $split

  • Evaluate GHN-2 on CIFAR-10 on a single architecture from DeepNets-1M: python experiments/eval_ghn.py --ckpt ./checkpoints/ghn2_cifar10.pt -d cifar10 --split $split --arch $ind

where $split is one of val, test, wide, deep, dense, bnfree, predefined, and $ind is the integer index of an architecture in the split.

Training and evaluating SGD

  • Train architecture=0 from the test split of DeepNets-1M for 50 epochs on CIFAR-10: python experiments/sgd/train_net.py --split test --arch 0 --epochs 50

  • Train the best architecture from the DARTS paper for 50 epochs on CIFAR-10: python experiments/sgd/train_net.py --arch DARTS --epochs 50

  • Train architecture=0 from the wide split of DeepNets-1M for 1 epoch on ImageNet: python experiments/sgd/train_net.py --split wide --arch 0 --epochs 1 -d imagenet

Fine-tuning predicted parameters on other tasks

The parameters predicted by GHN-2 trained on ImageNet can be fine-tuned on any vision dataset, such as CIFAR-10.
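
Conceptually, the predicted parameters simply serve as an initialization, after which standard training applies. A minimal sketch is shown below (the optimizer settings here are placeholders; for the exact hyperparameters, use the commands below, which rely on experiments/sgd/train_net.py):

import torch
import torchvision.models as models
from ppuda.ghn.nn import GHN2
from ppuda.utils import adjust_net

ghn = GHN2('imagenet')                               # GHN-2 trained on ImageNet
model = adjust_net(models.resnet50(num_classes=10))  # CIFAR-10 head, 32x32 input
model = ghn(model)                                   # predicted parameters = initialization

# From here on, fine-tune as usual, e.g. with SGD on (100-shot) CIFAR-10.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-3)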

100-shot CIFAR-10

  • Fine-tune ResNet-50 initialized with the parameters predicted by GHN-1-ImageNet: python experiments/sgd/train_net.py --split predefined --arch 0 --epochs 50 -d cifar10 --n_shots 100 --wd 1e-3 --ckpt ./checkpoints/ghn1_imagenet.pt

  • Fine-tune ResNet-50 initialized with the parameters predicted by GHN-2-ImageNet: python experiments/sgd/train_net.py --split predefined --arch 0 --epochs 50 -d cifar10 --n_shots 100 --wd 1e-3 --ckpt ./checkpoints/ghn2_imagenet.pt

  • Fine-tune ResNet-50 initialized randomly with Kaiming He's method: python experiments/sgd/train_net.py --split predefined --arch 0 --epochs 50 -d cifar10 --n_shots 100 --wd 1e-3

  • Fine-tune ResNet-50 pretrained on ImageNet: python experiments/sgd/train_net.py --split predefined --arch 0 --epochs 50 -d cifar10 --n_shots 100 --wd 1e-3 --pretrained

  • Fine-tune ViT initialized with the parameters predicted by GHN-2-ImageNet: python experiments/sgd/train_net.py --split predefined --arch 1 --epochs 50 -d cifar10 --n_shots 100 --wd 1e-3 --ckpt ./checkpoints/ghn2_imagenet.pt

  • Fine-tune DARTS initialized with the parameters predicted by GHN-2-ImageNet: python experiments/sgd/train_net.py --arch DARTS --epochs 50 -d cifar10 --n_shots 100 --wd 1e-3 --init_channels 48 --layers 14 --ckpt ./checkpoints/ghn2_imagenet.pt

--wd 1e-3 was generally the best in these experiments. To report the results in the paper, we also tuned the initial learning rate on the 200 validation images of the 100-shot CIFAR-10 training set, so the results obtained with the scripts above may differ slightly from the reported ones.

Object detection

In the paper, we fine-tune on Penn-Fudan object detection. Our experiments are based on the PyTorch Object Detection Finetuning Tutorial.

The dataset can be downloaded from here and should be placed inside the ./data folder as ./data/PennFudanPed.

The commands to fine-tune/train networks for object detection are similar to those for 100-shot CIFAR-10 above, but are based on the experiments/sgd/detector/train_detector.py script and the hyperparameters from the tutorial. For example, to fine-tune DARTS initialized with the parameters predicted by GHN-2-ImageNet:

python experiments/sgd/detector/train_detector.py -d PennFudanPed --arch DARTS --ckpt ./checkpoints/ghn2_imagenet.pt --init_channels 48 --layers 14

Property prediction

To train and evaluate regression models on top of graph embeddings extracted using GHN-2-CIFAR-10:

python experiments/property_prediction.py cifar10 ./checkpoints/ghn2_cifar10.pt

The script will evaluate the four properties of architectures discussed in the paper: accuracy on the clean test set, accuracy on a corrupted test set, inference speed, and speed of convergence.

The extracted embeddings in the .npy format for each GHN are available in the checkpoints folder, but will be recomputed if they are missing.
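
Since the embeddings are plain NumPy arrays, a property regressor can also be fit manually. A hedged sketch is shown below (the file name and the regressor are illustrative, not the exact choices of the script, and the property values here are random placeholders; real targets must come from measurements):

import numpy as np
from sklearn.linear_model import Ridge

x = np.load('./checkpoints/ghn2_cifar10_embeddings.npy')  # hypothetical file name; shape (num_archs, dim)
y = np.random.rand(len(x))                                # placeholder property values (e.g. clean accuracy)

n_train = int(0.8 * len(x))                               # simple train/held-out split
reg = Ridge(alpha=1.0).fit(x[:n_train], y[:n_train])
print('R^2 on held-out architectures:', reg.score(x[n_train:], y[n_train:]))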

NAS

Training the best architecture (best in terms of accuracy, in this example) on CIFAR-10 with SGD for 600 epochs according to the DARTS protocol:

python experiments/sgd/train_net.py --split search --arch 35133 --epochs 600 --cutout --drop_path_prob 0.2 --auxiliary

Architecture 35133 was found to be the best in the search split on CIFAR-10 using our GHN-2.
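
The search itself boils down to ranking architectures by the accuracy of their GHN-predicted parameters, without any training. A conceptual sketch is shown below (candidates and evaluate are hypothetical stand-ins; the actual search is implemented in the scripts of this repository):

from ppuda.ghn.nn import GHN2

ghn = GHN2('cifar10')  # GHN-2 trained on CIFAR-10

def rank_architectures(candidates, evaluate):
    # Predict parameters for each candidate network and score it directly,
    # i.e. no gradient-based training of the candidates is needed.
    scores = [evaluate(ghn(net)) for net in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return best, scores

# The best-ranked architecture is then trained from scratch with SGD (command above).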

Visualization

Example of visualizing the computational graph of ResNet-50:

import torchvision
from ppuda.deepnets1m.graph import Graph

Graph(torchvision.models.resnet50()).visualize(node_size=100)

Example of visualizing the computational graph of the best DARTS network:

from ppuda.deepnets1m.graph import Graph
from ppuda.deepnets1m.net import Network
from ppuda.deepnets1m.genotypes import DARTS

model = Network(C=48, num_classes=1000, genotype=DARTS, n_cells=14)
Graph(model).visualize(node_size=50)

[Figures: visualized graphs of ResNet-50, ViT, and DARTS]

See more examples for different architectures in examples/graph_visualization.ipynb.

License

The majority of PPUDA is licensed under MIT license, however portions of the project are available under separate license terms: DARTS is licensed under the Apache 2.0 license and NetworkX is licensed under the 3-Clause BSD license.

Contributions

Please submit a pull request or open a GitHub issue (see the details). Make sure to comply with our code of conduct.

Acknowledgements

We thank the Vector AI Engineering team (Gerald Shen, Maria Koshkina and Deval Pandya) for code review.

Citation

@inproceedings{knyazev2021parameter,
  title={Parameter Prediction for Unseen Deep Architectures},
  author={Knyazev, Boris and Drozdzal, Michal and Taylor, Graham W and Romero-Soriano, Adriana},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}  
}
Comments
  • Improved fine-tuning, ConvNeXt support, improved training speed of GHNs

    Training times

    The implementation of some steps in the forward pass of GHNs is improved to speed up GHN training without altering their overall behavior.

    Speed is measured on an NVIDIA A100-40GB in seconds per training iteration on ImageNet (averaged over the first 50 iterations). 4xA100 GPUs are used for meta-batch size (bm) = 8. Measurements can be noisy because other users may be using some of the computational resources of the same cluster node.

    | Model | AMP* | Current version | This PR | Estimated total speed-up for 300/150 epochs** |
    |---|:---:|:---:|:---:|:---:|
    | MLP with bm = 1 | ✘ | 0.30 sec/iter | 0.22 sec/iter | 5.0 days -> 3.7 days |
    | MLP with bm = 8 | ✔ | 1.64 sec/iter | 1.01 sec/iter | 13.7 days -> 8.4 days |
    | GHN-2 with bm = 1 | ✘ | 0.77 sec/iter | 0.70 sec/iter | 12.9 days -> 11.7 days |
    | GHN-2 with bm = 8 | ✔ | 3.80 sec/iter | 3.08 sec/iter | 31.7 days -> 25.7 days |
    | GHN-2 with bm = 8 | ✘ | 3.45 sec/iter | 2.80 sec/iter | 28.8 days -> 23.4 days |

    • *Automatic Mixed Precision (enabled by the --amp argument in the code)
    • **To estimate the total training time, 300 epochs is used for bm=1 and 150 epochs is used for bm=8 (according to the paper).

    Fine-tuning and ConvNeXt support

    According to the report (Pretraining a Neural Network before Knowing Its Architecture), which shows improved fine-tuning results, the following arguments are added to the code: --opt, --init, --imsize, --beta, --layer, together with the file ppuda/utils/init.py containing initialization functions. The argument --val is also added to enable evaluation on the validation data rather than the test data during training.

    • For example, to obtain fine-tuning results of GHN-orth for ResNet-50 according to the report: python experiments/sgd/train_net.py --val --split predefined --arch 0 --epochs 300 -d cifar10 --n_shots 100 --lr 0.01 --wd 0.01 --ckpt ./checkpoints/ghn2_imagenet.pt --opt sgd --init orth --imsize 32 --beta 3e-5 --layer 37

    • For ConvNeXt-Base: python experiments/sgd/train_net.py --val --arch convnext_base -b 48 --epochs 300 -d cifar10 --n_shots 100 --lr 0.001 --wd 0.1 --ckpt ./checkpoints/ghn2_imagenet.pt --opt adamw --init orth --imsize 32 --beta 3e-5 --layer 94. Multiple warnings will be printed that some layers (layer_scale) of ConvNeXt are not supported by GHNs; this is intended.

    A simple example to try parameter prediction for ConvNeXt is to run:

    python examples/torch_models.py cifar10 convnext_base

    Code correctness

    To make sure that the evaluation results (classification accuracies of predicted parameters) reported in the paper remain the same in this PR, the GHNs were evaluated on selected architectures, and the same results were obtained (see the table below).

    | Model | ResNet-50 | ViT | Test Architecture (index in the test split) |
    |---|:---:|:---:|:---:|
    | GHN-2-CIFAR-10 (top-1 acc) | 58.6% | 11.4% | 77.1% (210) |
    | GHN-2-ImageNet (top-5 acc) | 5.3% | 4.4% | 48.3% (85) |

    To further confirm the correctness of the updated code, the training loss and top1 accuracy of training GHN-2 on CIFAR-10 for 3 epochs are reported in the table below. The command used in this benchmark is: python experiments/train_ghn.py -m 8 -n -v 50 --ln.

    | Version | Epoch 1 | Epoch 2 | Epoch 3 |
    |---|:---:|:---:|:---:|
    | Current version | loss=2.41, top1=17.23 | loss=2.02, top1=20.62 | loss=1.94, top1=24.56 |
    | This PR | loss=2.51, top1=17.58 | loss=2.01, top1=21.62 | loss=1.90, top1=25.88 |

    These results can be noisy because of several factors like random batches, initialization of GHN, etc.

    Other

    The Python script experiments/train_ghn_stable.py is added to automatically resume GHN training from the last saved checkpoint (if any) when a run fails for some reason (e.g. OOM, NaN loss). Instead of running python experiments/train_ghn.py -m 8 -n -v 50 --ln, one can now use python experiments/train_ghn_stable.py experiments/train_ghn.py -m 8 -n -v 50 --ln.

  • MLP example and resuming GHN training

    • Added support for graph construction for MLPs in ppuda/deepnets1m/graph.py
    • Added an example of predicting parameters for an MLP in examples/mlp.ipynb
    • Renamed expected_image_sz to expected_input_sz in the code for more general semantics
    • Added support for resuming GHN training after a script crash in experiments/train_ghn.py via the --ckpt argument
  • Improved training speed of GHNs, extra results for CIFAR-10

    Training times

    The implementation of some steps in the decoder of GHNs is improved to speed up GHN training without altering their overall behavior. These improvements mainly affect the speed when a meta-batch size > 1 is used (see the tables below).

    Speed is measured on an NVIDIA Quadro RTX 6000 in seconds per training iteration (averaged over the first 100 iterations).

    CIFAR-10

    | Model | Current version | Our PR | Estimated total speed-up for 300 epochs |
    |---|:---:|:---:|:---:|
    | MLP with meta-batch size bm = 1 | 0.21 sec/iter | 0.13 sec/iter | 0.5 days -> 0.3 days |
    | MLP with meta-batch size bm = 8 | 6.35 sec/iter | 0.89 sec/iter | 15.5 days -> 2.2 days |
    | GHN-2 with meta-batch size bm = 1 | 0.77 sec/iter | 0.72 sec/iter | 1.9 days -> 1.8 days |
    | GHN-2 with meta-batch size bm = 8 | 7.74 sec/iter | 1.99 sec/iter | 18.9 days -> 4.9 days |

    ImageNet

    | Model | Current version | Our PR | Estimated total speed-up for 300/150 epochs* |
    |---|:---:|:---:|:---:|
    | MLP with bm = 1 | 0.53 sec/iter | 0.37 sec/iter | 8.9 days -> 6.2 days |
    | MLP with bm = 8 (4 GPUs) | 1.78 sec/iter | 1.36 sec/iter | 14.9 days -> 11.4 days |
    | GHN-2 with bm = 1 | 1.08 sec/iter | 0.92 sec/iter | 18.0 days -> 15.4 days |
    | GHN-2 with bm = 8 (4 GPUs) | 3.78 sec/iter | 3.50 sec/iter | 31.6 days -> 29.2 days |

    *To estimate the total training time, 300 epochs is used for bm=1 and 150 epochs is used for bm=8 (according to the paper).

    When 4 GPUs and bm = 8 are used, the speed-up is not significant because each GPU receives only two architectures.

    Evaluation of GHNs

    To make sure that the evaluation results (classification accuracies of predicted parameters) reported in the paper remain the same in this PR, the GHNs were evaluated on selected architectures, and the same results were obtained (see the table below).

    | Model | ResNet-50 | ViT | Test Architecture (index in the test split) |
    |---|:---:|:---:|:---:|
    | GHN-2-CIFAR-10 | 58.6 | 11.4 | 77.1 (210) |
    | GHN-2-ImageNet | 5.3 | 4.4 | 48.3 (85) |

    Extra results on CIFAR-10

    • Added the file data/results_cifar10_extended.json containing training and evaluation curves (when training with SGD) for 1402 evaluation architectures of DeepNets-1M.
    • Added the results of evaluating GHNs on validation and noisy images of CIFAR-10.
    • Added the Jupyter notebook examples/read_cifar10_results.ipynb to read and visualize the training/evaluation curves. The image below is obtained using this notebook.

    [Figure: extended CIFAR-10 results from examples/read_cifar10_results.ipynb]

    Other minor updates

    • Added the --amp flag, which can decrease GPU memory consumption and, in some cases, improve speed (this flag was used to measure speed on ImageNet with 4 GPUs)
    • Added md5sum values for the DeepNets-1M files to make it easier to verify the dataset
  • Updates

    Summary of updates

    README

    • updated the pip installation instructions with a public repo link (the package can now be installed without git clone)
    • added a link to more checkpoints of GHNs and trained networks
    • added a Colab notebook and a link to the video walking through the notebook (links are added in the README)
    • added a gif animation of the Colab notebook
    • added a link to the paper PDF

    Code

    • fixed an issue with loading some pretrained networks (in fine-tuning experiments)
    • fixed an issue with loading pretrained GHNs from arbitrary directories
  • Fine tuning of network

    Thank you for your work and for generously releasing the code.

    In the Google Colab sample, the accuracy of the model was about 60%. I was wondering whether we could continue to train the predicted model and achieve higher accuracy. Ideally it would be faster than training a new model from scratch.

    Have you tried this before?

  • Runtime Error: Loss is nan

    Hey, while I was training the GHN and MLP models, at around 220 epochs I got the following error: error <class 'RuntimeError'> the loss is nan, unable to proceed. Do you have any solution for this?

    Error message:

    error <class 'RuntimeError'> the loss is nan, unable to proceed
    error <class 'RuntimeError'> the loss is nan, unable to proceed
    ... (repeated six times in total)
    Out of patience (after 15 attempts to continue), please restart the job with another seed !!!
    Traceback (most recent call last):
      File "/ppuda/experiments/train_ghn.py", line 168, in <module>
        main()
      File "/ppuda/experiments/train_ghn.py", line 105, in main
        loss = trainer.update(nets_torch, images, targets, ghn=ghn, graphs=graphs)
      File "/ppuda/../ppuda/ppuda/utils/trainer.py", line 101, in update
        raise RuntimeError('the loss is {}, unable to proceed'.format(loss))
    RuntimeError: the loss is nan, unable to proceed
