Differentiable architecture search for convolutional and recurrent networks

Overview

Differentiable Architecture Search

Code accompanying the paper

DARTS: Differentiable Architecture Search
Hanxiao Liu, Karen Simonyan, Yiming Yang.
arXiv:1806.09055.

The algorithm is based on continuous relaxation and gradient descent in the architecture space. It is able to efficiently design high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2). Only a single GPU is required.
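
Concretely, the relaxation replaces each discrete choice of operation with a softmax-weighted mixture of candidate operations, so the architecture parameters become continuous and can be optimized by gradient descent together with the network weights. Below is a minimal, self-contained sketch of such a mixed operation (written against a recent PyTorch for illustration only; the repository's model_search.py differs in detail):

import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """One edge of the search cell: a softmax-weighted mixture of candidate ops (sketch)."""
    def __init__(self, channels):
        super().__init__()
        # a few stand-in candidates; the actual search space has 8 operations
        self.ops = nn.ModuleList([
            nn.Identity(),                                # skip_connect
            nn.Conv2d(channels, channels, 3, padding=1),  # a simple conv stand-in
            nn.MaxPool2d(3, stride=1, padding=1),         # max_pool_3x3
        ])
        # architecture parameters (alpha): one logit per candidate operation
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=-1)
        # continuous relaxation: the edge outputs the weighted sum of all candidates,
        # which is differentiable w.r.t. both the op weights and alpha
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# at the end of search, each edge keeps its strongest candidate (the zero op is excluded)
mixed = MixedOp(16)
y = mixed(torch.randn(2, 16, 8, 8))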

Requirements

Python >= 3.5.5, PyTorch == 0.3.1, torchvision == 0.2.0

NOTE: PyTorch 0.4 is not supported at this moment and would lead to OOM.

Datasets

Instructions for acquiring PTB and WT2 can be found here. While CIFAR-10 can be automatically downloaded by torchvision, ImageNet needs to be manually downloaded (preferably to an SSD) following the instructions here.
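
For example, the following snippet triggers the same automatic CIFAR-10 download that the training scripts perform internally through torchvision (shown only to illustrate where the data ends up):

from torchvision import datasets, transforms

# downloads CIFAR-10 to ../data on first use; later runs reuse the local copy
train_set = datasets.CIFAR10(root='../data', train=True, download=True,
                             transform=transforms.ToTensor())
print(len(train_set))  # 50000 training images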

Pretrained models

The easiest way to get started is to evaluate our pretrained DARTS models.

CIFAR-10 (cifar10_model.pt)

cd cnn && python test.py --auxiliary --model_path cifar10_model.pt
  • Expected result: 2.63% test error rate with 3.3M model params.

PTB (ptb_model.pt)

cd rnn && python test.py --model_path ptb_model.pt
  • Expected result: 55.68 test perplexity with 23M model params.

ImageNet (imagenet_model.pt)

cd cnn && python test_imagenet.py --auxiliary --model_path imagenet_model.pt
  • Expected result: 26.7% top-1 error and 8.7% top-5 error with 4.7M model params.

Architecture search (using small proxy models)

To carry out architecture search using 2nd-order approximation, run

cd cnn && python train_search.py --unrolled     # for conv cells on CIFAR-10
cd rnn && python train_search.py --unrolled     # for recurrent cells on PTB
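
The --unrolled flag selects the 2nd-order approximation, in which the architecture gradient is evaluated after a virtual gradient step on the weights and the resulting second-order term is approximated by finite differences. Below is a minimal, self-contained sketch of that update on a toy loss (hypothetical hyperparameter values; an illustration, not the repository's architect.py):

import torch

xi, eps = 0.025, 1e-2  # inner learning rate and finite-difference step (assumed values)

# stand-ins for the network weights w and architecture parameters alpha
w = torch.randn(10, requires_grad=True)
alpha = torch.zeros(3, requires_grad=True)

def loss(w, alpha, data):
    # toy loss: mix three "candidate ops" by softmax(alpha)
    mix = torch.softmax(alpha, dim=-1)
    out = mix[0] * (data * w) + mix[1] * (data + w) + mix[2] * w
    return (out ** 2).mean()

train_batch, val_batch = torch.randn(10), torch.randn(10)

# 1) virtual step on the weights: w' = w - xi * grad_w L_train(w, alpha)
g_w = torch.autograd.grad(loss(w, alpha, train_batch), w)[0]
w_prime = (w - xi * g_w).detach().requires_grad_(True)

# 2) gradients of the validation loss at (w', alpha)
g_alpha, g_w_prime = torch.autograd.grad(loss(w_prime, alpha, val_batch), [alpha, w_prime])

# 3) finite-difference approximation of grad^2_{alpha,w} L_train(w, alpha) . g_w_prime
w_plus = (w + eps * g_w_prime).detach().requires_grad_(True)
w_minus = (w - eps * g_w_prime).detach().requires_grad_(True)
g_plus = torch.autograd.grad(loss(w_plus, alpha, train_batch), alpha)[0]
g_minus = torch.autograd.grad(loss(w_minus, alpha, train_batch), alpha)[0]
hessian_vec = (g_plus - g_minus) / (2 * eps)

# 4) 2nd-order architecture gradient: grad_alpha L_val(w', alpha) - xi * hessian_vec
arch_grad = g_alpha - xi * hessian_vec
print(arch_grad)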

Note the validation performance in this step does not indicate the final performance of the architecture. One must train the obtained genotype/architecture from scratch using full-sized models, as described in the next section.

Also be aware that different runs may end up at different local minima. To get the best result, it is crucial to repeat the search process with different seeds and select the best cell(s) based on their validation performance (obtained by training the derived cell from scratch for a small number of epochs). Please refer to fig. 3 and sect. 3.2 in our arXiv paper.

Figure: Snapshots of the most likely normal conv, reduction conv, and recurrent cells over time.

Architecture evaluation (using full-sized models)

To evaluate our best cells by training from scratch, run

cd cnn && python train.py --auxiliary --cutout            # CIFAR-10
cd rnn && python train.py                                 # PTB
cd rnn && python train.py --data ../data/wikitext-2 \     # WT2
            --dropouth 0.15 --emsize 700 --nhidlast 700 --nhid 700 --wdecay 5e-7
cd cnn && python train_imagenet.py --auxiliary            # ImageNet

Customized architectures are supported through the --arch flag once specified in genotypes.py.
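
For example, a cell produced by train_search.py (it prints a genotype line in its log) can be pasted into genotypes.py under a new name and then trained with --arch. A hypothetical entry might look like this (MY_DARTS and its operation list are placeholders taken from a search log quoted in the comments below; the Genotype namedtuple is repeated here only so the snippet is self-contained):

from collections import namedtuple

# defined at the top of genotypes.py
Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

# hypothetical custom cell copied from a search log
MY_DARTS = Genotype(
    normal=[('skip_connect', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 0), ('sep_conv_3x3', 1),
            ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 0), ('skip_connect', 1)],
    normal_concat=[2, 3, 4, 5],
    reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1),
            ('skip_connect', 2), ('max_pool_3x3', 0),
            ('skip_connect', 3), ('max_pool_3x3', 1),
            ('skip_connect', 2), ('max_pool_3x3', 0)],
    reduce_concat=[2, 3, 4, 5],
)

Training it from scratch then amounts to cd cnn && python train.py --auxiliary --cutout --arch MY_DARTS.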

The CIFAR-10 result at the end of training is subject to variance due to the non-determinism of cuDNN back-prop kernels. It would be misleading to report the result of only a single run. By training our best cell from scratch, one should expect the average test error of 10 independent runs to fall in the range of 2.76 +/- 0.09% with high probability.

Figure: Expected learning curves on CIFAR-10 (4 runs), ImageNet and PTB.

Visualization

The graphviz package is required to visualize the learned cells:

python visualize.py DARTS

where DARTS can be replaced by any customized architectures in genotypes.py.
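
For reference, the essence of the visualization is simply drawing the (operation, input node) pairs of a genotype with graphviz; a stripped-down sketch of the idea (not the repository's visualize.py) could look like this:

from graphviz import Digraph

def plot_cell(cell, filename):
    # cell: list of (op_name, input_index) pairs, two per intermediate node,
    # where indices 0 and 1 are the two input states c_{k-2} and c_{k-1}
    g = Digraph(format='pdf')
    steps = len(cell) // 2
    names = ['c_{k-2}', 'c_{k-1}'] + [str(i) for i in range(steps)]
    g.node('c_{k-2}')
    g.node('c_{k-1}')
    for i in range(steps):
        g.node(str(i))
        for op, j in cell[2 * i: 2 * i + 2]:
            g.edge(names[j], str(i), label=op)
    g.node('c_{k}')
    for i in range(steps):
        g.edge(str(i), 'c_{k}')  # the cell output concatenates the intermediate nodes
    g.render(filename, view=False)

# example: an illustrative normal cell in the same format used by genotypes.py
plot_cell([('skip_connect', 0), ('sep_conv_3x3', 1),
           ('skip_connect', 0), ('sep_conv_3x3', 1),
           ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
           ('skip_connect', 0), ('skip_connect', 1)], 'normal_cell')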

Citation

If you use any part of this code in your research, please cite our paper:

@article{liu2018darts,
  title={DARTS: Differentiable Architecture Search},
  author={Liu, Hanxiao and Simonyan, Karen and Yang, Yiming},
  journal={arXiv preprint arXiv:1806.09055},
  year={2018}
}

Comments

  • How was random baseline done in paper?

    Hi --

    In the paper, you describe fairly strong baseline performance from random architectures. Are you able to give a little more information about how those random baselines were done? Specifically, is that the average of a number of random runs, or just a single random run?

    Thanks ~ Ben

    opened by bkj 24
  • Poor PTB test performance?

    Hi Hanxiao,

    I trained this model https://github.com/quark0/darts/blob/master/rnn/genotypes.py#L33 on PTB. I obtain similar validation performance (val ppl = 59.0), but the test ppl (61.3) is much higher than the reported result (55.7). Do you have any suggestions?

    Thanks

    opened by D-X-Y 12
  • How to derive the architecture? Running the train_search on cifar10, but the architecture is different

    Hi I have 2 questions about how to derive the final architecture:

    1. I use the default search script on CIFAR-10, python train_search.py --unrolled --seed 0, to search for the architecture. I found that the resulting architecture is different from the one provided in the paper (and in the genotypes.py of the repo). If I change the seed, the architecture also changes. In the paper, the authors mention that the results are obtained from 4 runs. So my questions: do the 4 runs use the same architecture, or 4 different architectures? And how was the architecture illustrated in the paper selected?

    2. In my own search run, I found that the probability of the zero op is the highest. However, in the paper the authors mention that the zero op is not used in the final architecture (Sec. 2.4), which is also confirmed in the code. My question is: if the zero op is never used, why include a zero op in the search space at all? It seems strange, because if we did not exclude the zero op, all the selected ops would be zero. Did the authors observe the same behavior? For example, the alphas for the normal cell are

    [[0.1838, 0.0982, 0.081, 0.1736, 0.1812, 0.0846, 0.091, 0.1066],
     [0.4717, 0.0458, 0.0496, 0.0945, 0.1113, 0.0556, 0.0953, 0.0762],
     [0.2946, 0.1425, 0.0855, 0.1768, 0.0837, 0.0735, 0.0731, 0.0704],
     [0.3991, 0.0631, 0.0581, 0.1053, 0.1307, 0.0577, 0.1043, 0.0817],
     [0.6298, 0.0382, 0.035, 0.0658, 0.0435, 0.0551, 0.0605, 0.0721],
     [0.3526, 0.0974, 0.0693, 0.1346, 0.1245, 0.0697, 0.091, 0.061],
     [0.4829, 0.06, 0.0612, 0.115, 0.0969, 0.065, 0.0624, 0.0565],
     [0.6591, 0.0303, 0.0282, 0.0558, 0.0578, 0.054, 0.0581, 0.0568],
     [0.7612, 0.0199, 0.0207, 0.0294, 0.0343, 0.0442, 0.0431, 0.0472],
     [0.3519, 0.1231, 0.0692, 0.1381, 0.0925, 0.076, 0.0748, 0.0744],
     [0.4767, 0.0781, 0.0679, 0.1216, 0.0679, 0.0701, 0.0548, 0.0629],
     [0.6769, 0.032, 0.0292, 0.0547, 0.0533, 0.0427, 0.0614, 0.0498],
     [0.7918, 0.0191, 0.0199, 0.0279, 0.0423, 0.0223, 0.0392, 0.0375],
     [0.8325, 0.0153, 0.0158, 0.0199, 0.0284, 0.0255, 0.0313, 0.0313]]

    Each row is the probability of ['none', 'max_pool_3x3','avg_pool_3x3','skip_connect','sep_conv_3x3','sep_conv_5x5','dil_conv_3x3','dil_conv_5x5'] for each edge.

    opened by Ian09 12
  • The code to calculate the multiply-add operations?

    In Section 3.4.1, it is said that "the number of multiply-add operations in the model is restricted to be less than 600M". Would you mind providing the code to calculate the number of multiply-add operations?

    Best Regards,

    opened by D-X-Y 11
  • Less search cost

    Hi, I ran the first-order CNN search on a single GPU (Titan X). It took only 11 hours, much less than the 1.5 GPU days reported in the paper. Amazing! The log is shown below. By the way, the resulting architectures are different across runs. How did you pick the architecture reported in the paper?

    Thank you in advance. Yukang

    07/08 05:51:51 AM train 250 1.791708e-02 99.670070 100.000000
    07/08 05:53:27 AM train 300 1.792845e-02 99.667774 100.000000
    07/08 05:55:03 AM train 350 1.800602e-02 99.675036 100.000000
    07/08 05:56:20 AM train_acc 99.684000
    07/08 05:56:20 AM valid 000 4.737676e-01 90.625000 98.437500
    07/08 05:56:31 AM valid 050 4.070231e-01 89.430147 99.264706
    07/08 05:56:41 AM valid 100 4.179213e-01 89.279084 99.443069
    07/08 05:56:51 AM valid 150 4.126024e-01 89.093543 99.503311
    07/08 05:56:53 AM valid_acc 89.110000
    07/08 05:56:53 AM epoch 49 lr 1.023679e-03
    07/08 05:56:53 AM genotype = Genotype(normal=[('skip_connect', 0), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('skip_connect', 0), ('skip_connect', 1)], normal_concat=[2, 3, 4, 5], reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 0), ('skip_connect', 3), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 0)], reduce_concat=[2, 3, 4, 5])
    07/08 05:56:55 AM train 000 7.202163e-03 100.000000 100.000000
    07/08 05:58:30 AM train 050 1.960013e-02 99.632353 100.000000
    07/08 06:00:05 AM train 100 1.947881e-02 99.659653 100.000000
    07/08 06:01:40 AM train 150 1.897141e-02 99.679222 100.000000
    07/08 06:03:15 AM train 200 1.902238e-02 99.681281 100.000000
    07/08 06:04:51 AM train 250 1.879712e-02 99.688745 100.000000
    07/08 06:06:26 AM train 300 1.801273e-02 99.719684 100.000000
    07/08 06:08:00 AM train 350 1.759578e-02 99.724003 100.000000
    07/08 06:09:17 AM train_acc 99.720000
    07/08 06:09:17 AM valid 000 4.404125e-01 92.187500 98.437500
    07/08 06:09:28 AM valid 050 4.015100e-01 89.552696 99.356618
    07/08 06:09:38 AM valid 100 4.156907e-01 89.217203 99.489480
    07/08 06:09:50 AM valid 150 4.085849e-01 89.155629 99.596440
    07/08 06:09:51 AM valid_acc 89.200000

    opened by yukang2017 9
  • Out of memory trying to run CIFAR example

    I am trying to run the CIFAR example using the pyro docker image, but I get a CUDA out-of-memory error:

    pyromancer@6d7a480a66c9:~/workspace/shared/darts/cnn$ python train_search.py --unrolled Experiment dir : search-EXP-20180712-214257 07/12 09:42:57 PM gpu device = 0 07/12 09:42:57 PM args = Namespace(arch_learning_rate=0.0003, arch_weight_decay=0.001, batch_size=64, cutout=False, cutout_length=16, data='../data', drop_path_prob=0.3, epochs=50, gpu=0, grad_clip=5, init_channels=16, layers=8, learning_rate=0.025, learning_rate_min=0.001, model_path='saved_models', momentum=0.9, report_freq=50, save='search-EXP-20180712-214257', seed=2, train_portion=0.5, unrolled=True, weight_decay=0.0003) 07/12 09:43:00 PM param size = 1.930618MB Files already downloaded and verified Files already downloaded and verified 07/12 09:43:01 PM epoch 0 lr 2.500000e-02 07/12 09:43:01 PM genotype = Genotype(normal=[('avg_pool_3x3', 0), ('dil_conv_5x5', 1), ('dil_conv_3x3', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 1), ('avg_pool_3x3', 0), ('dil_conv_5x5', 1), ('avg_pool_3x3', 0)], normal_concat=range(2, 6), reduce=[('avg_pool_3x3', 1), ('avg_pool_3x3', 0), ('sep_conv_3x3', 1), ('dil_conv_5x5', 2), ('sep_conv_3x3', 2), ('avg_pool_3x3', 3), ('max_pool_3x3', 4), ('dil_conv_5x5', 0)], reduce_concat=range(2, 6)) tensor([[ 0.1249, 0.1249, 0.1252, 0.1251, 0.1250, 0.1250, 0.1250, 0.1249], [ 0.1250, 0.1248, 0.1251, 0.1250, 0.1250, 0.1250, 0.1251, 0.1251], [ 0.1250, 0.1250, 0.1250, 0.1250, 0.1250, 0.1250, 0.1251, 0.1249], [ 0.1249, 0.1249, 0.1249, 0.1250, 0.1249, 0.1250, 0.1253, 0.1251], [ 0.1249, 0.1251, 0.1251, 0.1250, 0.1249, 0.1249, 0.1249, 0.1251], [ 0.1250, 0.1250, 0.1252, 0.1251, 0.1249, 0.1249, 0.1250, 0.1249], [ 0.1249, 0.1253, 0.1250, 0.1248, 0.1248, 0.1251, 0.1251, 0.1250], [ 0.1249, 0.1251, 0.1251, 0.1252, 0.1250, 0.1248, 0.1250, 0.1249], [ 0.1251, 0.1250, 0.1250, 0.1250, 0.1249, 0.1251, 0.1249, 0.1250], [ 0.1249, 0.1248, 0.1252, 0.1247, 0.1251, 0.1249, 0.1252, 0.1251], [ 0.1249, 0.1249, 0.1251, 0.1250, 0.1248, 0.1250, 0.1249, 0.1254], [ 0.1251, 0.1250, 0.1250, 0.1250, 0.1252, 0.1250, 0.1249, 0.1250], [ 0.1251, 0.1249, 0.1250, 0.1250, 0.1251, 0.1249, 0.1250, 0.1251], [ 0.1251, 0.1252, 0.1251, 0.1247, 0.1252, 0.1249, 0.1249, 0.1250]], device='cuda:0') tensor([[ 0.1251, 0.1251, 0.1251, 0.1250, 0.1247, 0.1250, 0.1251, 0.1250], [ 0.1250, 0.1249, 0.1251, 0.1250, 0.1250, 0.1248, 0.1251, 0.1248], [ 0.1252, 0.1251, 0.1250, 0.1250, 0.1249, 0.1249, 0.1250, 0.1250], [ 0.1248, 0.1249, 0.1250, 0.1249, 0.1252, 0.1250, 0.1251, 0.1251], [ 0.1252, 0.1249, 0.1250, 0.1250, 0.1250, 0.1249, 0.1250, 0.1251], [ 0.1249, 0.1251, 0.1250, 0.1250, 0.1250, 0.1251, 0.1250, 0.1249], [ 0.1249, 0.1249, 0.1251, 0.1251, 0.1246, 0.1251, 0.1251, 0.1251], [ 0.1250, 0.1247, 0.1250, 0.1251, 0.1252, 0.1250, 0.1250, 0.1250], [ 0.1252, 0.1249, 0.1252, 0.1247, 0.1249, 0.1251, 0.1250, 0.1250], [ 0.1248, 0.1251, 0.1251, 0.1249, 0.1249, 0.1249, 0.1251, 0.1252], [ 0.1249, 0.1250, 0.1250, 0.1251, 0.1251, 0.1251, 0.1249, 0.1249], [ 0.1250, 0.1249, 0.1249, 0.1252, 0.1250, 0.1250, 0.1251, 0.1250], [ 0.1249, 0.1251, 0.1249, 0.1251, 0.1252, 0.1250, 0.1248, 0.1250], [ 0.1251, 0.1253, 0.1249, 0.1250, 0.1248, 0.1249, 0.1248, 0.1251]], device='cuda:0') THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory Traceback (most recent call last): File "train_search.py", line 200, in <module> main() File "train_search.py", line 124, in main train_acc, train_obj, arch_grad_norm = train(train_queue, search_queue, model, architect, criterion, optimizer, lr) File "train_search.py", line 152, in 
train arch_grad_norm = architect.step(input, target, input_search, target_search, lr, optimizer, unrolled=args.unrolled) File "/home/pyromancer/workspace/shared/darts/cnn/architect.py", line 37, in step input_train, target_train, input_valid, target_valid, eta, network_optimizer) File "/home/pyromancer/workspace/shared/darts/cnn/architect.py", line 53, in _backward_step_unrolled model_unrolled = self._compute_unrolled_model(input_train, target_train, eta, network_optimizer) File "/home/pyromancer/workspace/shared/darts/cnn/architect.py", line 23, in _compute_unrolled_model loss = self.model._loss(input, target) File "/home/pyromancer/workspace/shared/darts/cnn/model_search.py", line 110, in _loss logits = self(input) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__ result = self.forward(*input, **kwargs) File "/home/pyromancer/workspace/shared/darts/cnn/model_search.py", line 104, in forward s0, s1 = s1, cell(s0, s1, weights) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__ result = self.forward(*input, **kwargs) File "/home/pyromancer/workspace/shared/darts/cnn/model_search.py", line 54, in forward s = sum(self._ops[offset+j](h, weights[offset+j]) for j, h in enumerate(states)) File "/home/pyromancer/workspace/shared/darts/cnn/model_search.py", line 54, in <genexpr> s = sum(self._ops[offset+j](h, weights[offset+j]) for j, h in enumerate(states)) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__ result = self.forward(*input, **kwargs) File "/home/pyromancer/workspace/shared/darts/cnn/model_search.py", line 22, in forward return sum(w * op(x) for w, op in zip(weights, self._ops)) File "/home/pyromancer/workspace/shared/darts/cnn/model_search.py", line 22, in <genexpr> return sum(w * op(x) for w, op in zip(weights, self._ops)) RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu:58

    The card is a GeForce GTX 1080 (8119 MiB) on an Ubuntu Linux box.

    opened by AlexMikhalev 8
  • Training results of `train_search.py`?

    What's the recommended way to train the results of train_search.py? This is the end of my log.txt:

    ...
    2018-06-27 13:25:46,378 epoch 49 lr 1.023679e-03
    2018-06-27 13:25:46,379 genotype = Genotype(normal=[('skip_connect', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_3x3', 0), ('skip_connect', 1)], normal_concat=range(2, 6), reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('max_pool_3x3', 0), ('skip_connect', 2), ('max_pool_3x3', 0), ('dil_conv_5x5', 3), ('skip_connect', 2), ('skip_connect', 3)], reduce_concat=range(2, 6))
    ...2018-06-27 13:54:15,198 train_acc 99.704000
    ...
    2018-06-27 13:54:45,268 valid_acc 88.760000
    

    Should I just copy and paste that Genotype into genotypes.py w/ a new name? Or is there some recommended way?

    Thanks

    opened by bkj 7
  • input_search and target_search are unchanged during iteration

    https://github.com/quark0/darts/blob/b6d4fe1692a67d81adaa3d4bfd7c13e3dcb1d443/cnn/train_search.py#L147

    Hi, thank you for releasing the code! Nice work! But I found something I can't understand. In the function train in train_search.py, input_search and target_search are unchanged during the iteration. They are always the first batch of search_queue.

    Is this a small bug, or is it done on purpose?

    opened by yukang2017 6
  • concat operation for the Cell in model_search.py

    Hi, thanks for your great work! I am a little confused by line 58 of your model_search.py file, which reads: return torch.cat(states[-self._multiplier:], dim=1). Following the paper, the hidden states used for the concat operation should be the last self._steps states, so I think it should read: return torch.cat(states[-self._steps:], dim=1). However, I am not sure whether my understanding is right. Looking forward to your response.

    opened by tengteng95 5
  • Train arch_parameters with validation data?

    Thank you for sharing such a beautiful implementation.

    I'm sorry, but I have a question about why arch_parameters are trained with validation data. In ./cnn/architect.py, the _backward_step function computes the loss for the arch_parameter update with validation data (i.e. input_valid, target_valid). Shouldn't we train arch_parameters with training data and search for the proper architecture on validation data?

    opened by Harry-Up 4
  • Some issues about the paper

    Hi Hanxiao,

    Table 1 in the paper confuses me. There are three issues:

    1. AmoebaNet-A with 3.34 ± 0.06 test error and 3.2M params in the original paper is trained without cutout.
    2. The search cost for NASNet-A differs between the first and the second line (1800 vs 3150). According to the latest version of the original paper, it is 2000 GPU days (4 days with 500 GPUs).
    3. In my view, using GPU days (number of GPUs x days) as a metric is not fair, because the speed on two GPUs is less than twice the speed on one GPU. In other words, running on two GPUs for 1 day is not the same as running on one GPU for 2 days, even though both count as 2 GPU days.

    Best Yukang

    opened by yukang2017 4
  • Trying to visualize the genotype but got an error

    I'd like to know the meaning of this code at line 42 of visualize.py:

    if len(sys.argv) != 2:

    Visualizing the genotype always produces the following output:

    usage: python darts-master/darts-master/cnn/visualize.py ARCH_NAME

    Process finished with exit code 1

    opened by LinzheCAI 0
  • RuntimeError: cannot pin 'torch.cuda.DoubleTensor' on GPU on version 0.10.0

    I'm trying to fit the RNN model on GPU and this is the error that I get:

    RuntimeError: cannot pin 'torch.cuda.DoubleTensor' only dense CPU tensors can be pinned

    I also ran the example from here and got the same error. The fit method runs without problems on the CPU but not on the GPU.

    Here's the error's backtrace: https://paste.ubuntu.com/p/KkpthPvPKp/

    It seems that this problem was introduced in the latest version (0.10.0); it is not an issue in version 0.9.1, against which my code was previously written.

    opened by novinsh 0
  • About alpha

    Hi, I have a quick question. Why do all 8 cells share the same alpha?

    def _initialize_alphas(self):
         k = sum(1 for i in range(self._steps) for n in range(2+i))
         num_ops = len(PRIMITIVES)
    
         self.alphas_normal = Variable(1e-3*torch.randn(k, num_ops).cuda(), requires_grad=True)
         self.alphas_reduce = Variable(1e-3*torch.randn(k, num_ops).cuda(), requires_grad=True)
         self._arch_parameters = [
           self.alphas_normal,
           self.alphas_reduce,
         ]
    

    The code here looks like each cell shares the same alpha. Shouldn't each cell have an independent alpha?

    opened by andyzgj 1
  • Training time on colab

    When running train_search in Colab Pro with a GPU, every epoch takes around 3100 seconds, so 50 epochs take around 1.8 days. For anybody trying Colab: how much training time do you see? Can you reproduce the 1-day search time reported in the paper?

    opened by noilreed 2
  • When running train.py, why can't I get the same high accuracy as in train_search.py?

    Thanks for your work! One question I hope can be answered: with the same dataset and parameter settings, I train train.py on the best cell obtained by train_search.py. Why can't I get the same high accuracy as in train_search.py?

    Look forward to your reply.

    opened by liqier 0

Owner
Hanxiao Liu
Research Scientist @ Google Brain