CondenseNet: Lightweight CNN for mobile devices

Overview

CondenseNets

This repository contains the code (in PyTorch) for the paper "CondenseNet: An Efficient DenseNet using Learned Group Convolutions" by Gao Huang*, Shichen Liu*, Laurens van der Maaten and Kilian Weinberger (* authors contributed equally).

Citation

If you find our project useful in your research, please consider citing:

@inproceedings{huang2018condensenet,
  title={Condensenet: An efficient densenet using learned group convolutions},
  author={Huang, Gao and Liu, Shichen and Van der Maaten, Laurens and Weinberger, Kilian Q},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={2752--2761},
  year={2018}
}

Contents

  1. Introduction
  2. Usage
  3. Results
  4. Discussions
  5. Contact

Introduction

CondenseNet is a novel, computationally efficient convolutional network architecture. It combines dense connectivity between layers with a mechanism to remove unused connections. The dense connectivity facilitates feature re-use in the network, whereas learned group convolutions remove connections between layers for which this feature re-use is superfluous. At test time, our model can be implemented using standard grouped convolutions, allowing for efficient computation in practice. Our experiments demonstrate that CondenseNets are much more efficient than other compact convolutional networks such as MobileNets and ShuffleNets.
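
To make the mechanism concrete, here is a minimal PyTorch sketch of a 1x1 convolution whose connections are removed with a binary mask during training. It is modeled on the forward pass quoted in the comments below, not on the repository's exact layers.LearnedGroupConv:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedConv1x1(nn.Module):
        """Sketch: a full 1x1 conv whose connections are gradually
        zeroed out via a binary mask during training."""
        def __init__(self, in_channels, out_channels):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, 1, bias=False)
            # 1 = connection kept, 0 = connection pruned (condensed)
            self.register_buffer('mask', torch.ones_like(self.conv.weight))

        def forward(self, x):
            # pruned connections contribute nothing to the output
            return F.conv2d(x, self.conv.weight * self.mask)

Once training ends, the surviving connections form a regular group structure, so the masked convolution can be replaced by a standard group convolution (the conversion step under Usage below).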

Figure 1: Learned Group Convolution with G=C=3.

Figure 2: CondenseNets with Fully Dense Connectivity and Increasing Growth Rate.

Usage

Dependencies
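
The code is implemented in PyTorch. Note that it was written against pre-1.0 PyTorch releases; as the issues quoted below suggest, newer versions (e.g., 1.0.1) may require adjustments.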

Train

As an example, use the following command to train a CondenseNet on ImageNet:

python main.py --model condensenet -b 256 -j 20 /PATH/TO/IMAGENET \
--stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0,1,2,3,4,5,6,7 --resume

As another example, use the following command to train a CondenseNet on CIFAR-10:

python main.py --model condensenet -b 64 -j 12 cifar10 \
--stages 14-14-14 --growth 8-16-32 --gpu 0 --resume

Evaluation

We take the ImageNet model trained above as an example.

To evaluate the trained model from the default checkpoint directory, use the --evaluate option:

python main.py --model condensenet -b 64 -j 20 /PATH/TO/IMAGENET \
--stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0 --resume \
--evaluate

or use --evaluate-from to evaluate from an arbitrary path:

python main.py --model condensenet -b 64 -j 20 /PATH/TO/IMAGENET \
--stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0 --resume \
--evaluate-from /PATH/TO/BEST/MODEL

Note that these models are still the large (unconverted) models. To convert a model to the group-convolution version described in the paper, use the --convert-from option:

python main.py --model condensenet -b 64 -j 20 /PATH/TO/IMAGENET \
--stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0 --resume \
--convert-from /PATH/TO/BEST/MODEL

Finally, to directly load a converted model (that is, a CondenseNet), use the converted model file in combination with the --evaluate-from option:

python main.py --model condensenet_converted -b 64 -j 20 /PATH/TO/IMAGENET \
--stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0 --resume \
--evaluate-from /PATH/TO/CONVERTED/MODEL
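
For intuition, a converted layer replaces the learned mask with a fixed channel selection followed by a standard group convolution, roughly as in this sketch (the names and constructor here are illustrative assumptions, not the repository's converted layer):

    import torch
    import torch.nn as nn

    class CondensedConv1x1(nn.Module):
        """Sketch of an inference-time (converted) layer: an index of the
        surviving input channels plus a standard grouped 1x1 conv.
        Assumes num_selected is divisible by groups and index is a
        LongTensor of surviving channel positions."""
        def __init__(self, num_selected, out_channels, groups, index):
            super().__init__()
            self.register_buffer('index', index)
            self.conv = nn.Conv2d(num_selected, out_channels, 1,
                                  groups=groups, bias=False)

        def forward(self, x):
            x = torch.index_select(x, 1, self.index)  # gather kept channels
            return self.conv(x)                       # efficient group conv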

Other Options

We also include a DenseNet implementation in this repository.
For more usage examples, please refer to script.sh.
For detailed options, run python main.py --help

Results

Results on ImageNet

Model FLOPs Params Top-1 Err. Top-5 Err. PyTorch Model
CondenseNet-74 (C=G=4) 529M 4.8M 26.2 8.3 Download (18.69M)
CondenseNet-74 (C=G=8) 274M 2.9M 29.0 10.0 Download (11.68M)

Results on CIFAR

Model FLOPs Params CIFAR-10 CIFAR-100
CondenseNet-50 28.6M 0.22M 6.22 -
CondenseNet-74 51.9M 0.41M 5.28 -
CondenseNet-86 65.8M 0.52M 5.06 23.64
CondenseNet-98 81.3M 0.65M 4.83 -
CondenseNet-110 98.2M 0.79M 4.63 -
CondenseNet-122 116.7M 0.95M 4.48 -
CondenseNet-182* 513M 4.2M 3.76 18.47

(* trained 600 epochs)

Inference time on ARM platform

Model FLOPs Top-1 Err. Time (s)
VGG-16 15,300M 28.5 354
ResNet-18 1,818M 30.2 8.14
1.0 MobileNet-224 569M 29.4 1.96
CondenseNet-74 (C=G=4) 529M 26.2 1.89
CondenseNet-74 (C=G=8) 274M 29.0 0.99

Contact

[email protected]
[email protected]

We are working on implementations for other frameworks.
Discussions and concerns are welcome!

Comments
  • Architecture of CondenseNet{light-160*, 182*, light-94, 84}

    Hi, The paper mentions CondenseNet{light-160*, 182*, light-94, 84} for CIFAR, though is not clear about the details of the architecture. Could you share the architectures and how those results can be reproduced?

    opened by geevi 8
  • Out of memory issue when training a new dataset

    Hi,

    I am attempting to use your CondenseNet code to train on a dataset of 7 classes, with approximately 100K-150K training images split (non-equally) across those classes. My images consist of bounding boxes of different sizes. First, I am using a setting similar to the one you use to train on ImageNet, pointing to my dataset and preparing the class folders so the paths are found properly. I resized all images to 256x256 as you did in your paper. This is the command line I use for training the new dataset:

    python main.py --model condensenet -b 256 -j 28 lima_train --stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0 --resume

    where lima_train is a link file pointing to the folder containing all training data, split into class subfolders as required.

    I'm using a datacenter whose GPU nodes have NVIDIA Tesla P100s with 16 GB each, and CUDA 8 with cuDNN, so I presume training should not be a problem. I understand that a 16 GB, or even 8 GB, GPU should be enough to train this network, shouldn't it? However, I'm getting the out-of-memory error shown below. I reduced the batch size to 64 and set the number of workers according to the machine. Probably I am missing some step, or I should modify the command line according to the settings of my data.

    I would appreciate any feedback.

    Thanks in advance, and congratulations on this work.

    THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
    Traceback (most recent call last):
      File "main.py", line 479, in <module>
        main()
      File "main.py", line 239, in main
        train(train_loader, model, criterion, optimizer, epoch)
      File "main.py", line 303, in train
        output = model(input_var, progress)
      ... (torch/nn/modules/module.py, data_parallel.py and container.py wrapper frames elided) ...
      File "/mnt/storage/home/vp17941/CondenseNet/models/condensenet.py", line 127, in forward
        features = self.features(x)
      File "/mnt/storage/home/vp17941/CondenseNet/models/condensenet.py", line 33, in forward
        x = self.conv_1(x)
      File "/mnt/storage/home/vp17941/CondenseNet/layers.py", line 42, in forward
        x = self.norm(x)
      File "/mnt/storage/home/vp17941/.conda/envs/condensenet/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 37, in forward
        self.training, self.momentum, self.eps)
      File "/mnt/storage/home/vp17941/.conda/envs/condensenet/lib/python3.6/site-packages/torch/nn/functional.py", line 639, in batch_norm
        return f(input, weight, bias)
    RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/generic/THCStorage.cu:66
    srun: error: gpu09: task 0: Exited with exit code 1

    opened by vponcelo 8
  • CondenseNet-182* on CIFAR-100: validation error rate is 19.73%, whereas the paper reports 18.47%

    Hi, I ran CondenseNet-182* on CIFAR-100 using the command provided in issue 11:

    python main.py --model condensenet -b 64 -j 2 cifar100 --epochs 600 --stages 30-30-30 --growth 12-24-48
    

    The result of the first run is 19.73%; the result of the second run is 19.86%. The result in the paper is 18.47% (from the table above). I just used all the default arguments in the provided code; do we need to make other changes?

    Thanks in advance

    opened by lizhenstat 6
  • Dropping issue with pytorch v0.4

    See: https://github.com/ShichenLiu/CondenseNet/blob/3b4398ed1987f6f7c891d81a470578dcc5c5562c/layers.py#L88

    Weird stuff in the PyTorch API:

    self._mask[i::self.groups, d, :, :].fill_(0)
    

    ... does not fill in place. So you must do:

    self._mask[i::self.groups, d, :, :] = self._mask[i::self.groups, d, :, :].fill_(0)
    

    https://github.com/pytorch/pytorch/issues/2599#issuecomment-326775742
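
    For reference, in more recent PyTorch versions a direct slice assignment also works and avoids the intermediate copy (a sketch, untested on 0.4):

        # Basic slice indexing returns a view, so assigning through it
        # writes into the underlying self._mask tensor in place.
        self._mask[i::self.groups, d, :, :] = 0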

    opened by Cadene 4
  • No shuffle layer when training condensenet?

    Dear @ShichenLiu, I did not find any shuffle-layer-related code in models.condensenet, which uses layers.LearnedGroupConv as LGC. However, the paper clearly says we should use one. Is this a mismatch? Thanks

    opened by hiyijian 4
  • Question on dropping function

    Hi, I have a question about the dropping function in layers.py: I don't understand why the learned group convolution still needs the shuffling operation.

        # reshape so output channels are grouped: (d_out, groups, in_channels)
        weight = weight.view(d_out, self.groups, self.in_channels)
        # swapping the two leading axes and flattening back permutes
        # (shuffles) the output channels across the groups
        weight = weight.transpose(0, 1).contiguous()
        weight = weight.view(self.out_channels, self.in_channels)
    

    https://github.com/ShichenLiu/CondenseNet/blob/master/layers.py#L78

    I notice there is a shuffle operation mentioned in the first paragraph of Section 4.1: "we permute the output channels of the first 1x1_conv learned group convolution layer, such that the features generated by each of its groups are evenly used by all the groups of the subsequent 3x3 group convolutional layer". However, that operation shuffles feature maps, not convolutional kernels.

    Can you explain a little bit? Thanks in advance
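
    One way to see why permuting the kernels implements the feature shuffle: for a convolution, permuting the output channels of the weight tensor permutes the output feature maps identically, so shuffling kernels and shuffling feature maps are equivalent here. A quick illustrative check (not code from the repository):

        import torch
        import torch.nn.functional as F

        x = torch.randn(1, 8, 4, 4)     # input feature maps
        w = torch.randn(6, 8, 1, 1)     # 1x1 conv kernels
        perm = torch.randperm(6)

        y1 = F.conv2d(x, w[perm])       # permute the kernels...
        y2 = F.conv2d(x, w)[:, perm]    # ...or permute the output maps
        assert torch.allclose(y1, y2)   # identical results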

    opened by lizhenstat 3
  • Training Time issue when training Condensenet-light on cifar100

    Hi, I am reproducing your work in TensorFlow, but I found that dropping during training takes a lot of time. Have you encountered this problem? What do you think might be the reason?

    opened by infrontofme 3
  • condensenet-86 parameters number different from torchsummary

    Hi, I noticed that the reported parameter count for condensenet-86 on CIFAR-10 is 0.52M. However, using the torchsummary package I get a different total. The parameters are calculated as:

        from torchsummary import summary
        summary(model, (3, 32, 32), device="cpu")
        exit(0)
    

    Do you know why there is this difference? Thanks in advance
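
    One plausible cause (an assumption, not confirmed by the authors): before conversion, each learned group convolution stores its full, unpruned weight tensor plus a _mask buffer, so generic counters such as torchsummary see all weights, while the reported 0.52M refers to the surviving connections (equivalently, the converted model). A sketch that counts only unmasked weights, assuming the layer layout of layers.py (a conv.weight parameter and a _mask buffer):

        def effective_params(model):
            """Count unpruned weights of masked convs plus all other params."""
            total, counted = 0, set()
            for m in model.modules():
                if hasattr(m, '_mask'):             # learned group convolution
                    total += int(m._mask.sum().item())
                    counted.add(id(m.conv.weight))  # skip its full weight below
            for p in model.parameters():
                if id(p) not in counted:
                    total += p.numel()
            return total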

    opened by lizhenstat 2
  • dropout before convolution layer

    Hi, I noticed that the dropout is placed before the convolution layer. In the original densenet-torch implementation, the order in each block is BN --> ReLU --> conv --> dropout. Is there a particular reason for doing so?

        def forward(self, x):
            self._check_drop()           # update the connection-dropping schedule
            x = self.norm(x)             # BN
            x = self.relu(x)             # ReLU
            if self.dropout_rate > 0:    # dropout applied *before* the conv
                x = self.drop(x)
            ### Masked output: pruned connections contribute nothing
            weight = self.conv.weight * self.mask
            return F.conv2d(x, weight, None, self.conv.stride,
                            self.conv.padding, self.conv.dilation, 1)
    
    opened by lizhenstat 2
  • Group lasso regularization effect for ImageNet

    Hi, the paper states that the group lasso term is added to the total cost function with a coefficient of 1e-5 on the ImageNet dataset. Have you compared against a model trained without the group lasso term on the same dataset? In other words, by what percentage does the group lasso term improve the final validation accuracy?

    Thanks in advance
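
    For context, a group lasso term of the kind the paper describes can be sketched as follows (an illustrative grouping, not the repository's exact code; compare the weight.sum(0).clamp(min=1e-6).sqrt() expression discussed in the next issue):

        import torch

        def group_lasso_penalty(weight, groups):
            """weight: (d_out, d_in, 1, 1) tensor of a 1x1 learned group conv,
            with d_out divisible by groups. Penalizes the L2 norm of each
            (output group, input channel) block so that whole incoming
            connections are encouraged to vanish together."""
            d_out, d_in = weight.size(0), weight.size(1)
            w = weight.view(groups, d_out // groups, d_in).pow(2)
            # clamp before sqrt keeps the gradient finite at zero
            return w.sum(1).clamp(min=1e-6).sqrt().sum()

        # e.g. loss = criterion(output, target) + 1e-5 * penalty summed over layers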

    opened by lizhenstat 2
  • Question on clamp

    Hi, I have a question about the clamping of weights at https://github.com/ShichenLiu/CondenseNet/blob/master/layers.py#L125

    weight = weight.sum(0).clamp(min=1e-6).sqrt()
    

    I don't understand the clamp function here. I trained condensenet-86 on CIFAR-10 with and without the clamp: with clamp, accuracy = 95.06; without clamp, accuracy = 94.96.

    Thanks in advance
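
    A likely reason (an assumption, not an authors' statement): the clamp guards the sqrt in the group lasso computation, whose gradient is unbounded at zero; without it, a group whose squared weights sum to exactly zero would produce inf/NaN gradients. A tiny demonstration:

        import torch

        x = torch.zeros(1, requires_grad=True)
        x.sum().sqrt().backward()
        print(x.grad)   # inf: d/dx sqrt(x) = 1/(2*sqrt(x)) blows up at 0

        x2 = torch.zeros(1, requires_grad=True)
        x2.sum().clamp(min=1e-6).sqrt().backward()
        print(x2.grad)  # 0: clamp makes the gradient well-defined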

    opened by lizhenstat 2
  • RuntimeError: INDICES element is out of DATA bounds

    Whenever I run the following command:

    python main.py --model condensenet_converted -b 64 -j 4 C:\CondenseNet-master --stages 4-6-8-10-8 --growth 8-16-32-64-128 --group-1x1 4 --group-3x3 4 --condense-factor 4 --evaluate-from C:\CondenseNet-master/converted_condensenet_4.pth.tar --gpu 0

    I get the following error:

    RuntimeError: INDICES element is out of DATA bounds, id=53888868763566084 axis_dim=2064

    Any idea how to solve this issue?

    Thank you in advance.

    opened by MohBarbary 0
  • Questions on implementation of dropping

    Hi,

    I have some questions about consistency between the implementation of dropping and the paper.

    1. When you take the sum, you did not use absolute values as written in the paper. https://github.com/ShichenLiu/CondenseNet/blob/833a91d5f859df25579f70a2439dfd62f7fefb29/layers.py#L86

    2. You drop during the stage, not when the stage finishes, as written in the paper. https://github.com/ShichenLiu/CondenseNet/blob/833a91d5f859df25579f70a2439dfd62f7fefb29/layers.py#L62

    Am I wrong, or could you explain this? Thank you.

    opened by shuuchen 0
  • What are the Training Arguments for ImageNet Pre-Trained Model?

    I get the following error message when running the trained ImageNet model for image classification on my machine; I downloaded the model from the author's Dropbox link posted in this repo's README:

    model.load_state_dict(torch.load(PATH, map_location=torch.device("cpu"))['state_dict'])
      File "C:\Program Files\Python36\lib\site-packages\torch\nn\modules\module.py", line 1052, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for DataParallel:
        Missing key(s) in state_dict: "module.features.denseblock_1.denselayer_1.conv_1._count", "module.features.denseblock_1.denselayer_1.conv_1._stage", "module.features.denseblock_1.denselayer_1.conv_1._mask", ... (the corresponding _count/_stage/_mask keys for every denselayer in denseblocks 1-5) ..., "module.classifier.weight", "module.classifier.bias".
        Unexpected key(s) in state_dict: "module.features.denseblock_1.denselayer_1.conv_1.index", ... (a conv_1.index key for every denselayer in denseblocks 1-5) ..., "module.classifier.index", "module.classifier.linear.weight", "module.classifier.linear.bias".
        size mismatch for module.features.denseblock_1.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([32, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
        size mismatch for module.features.denseblock_1.denselayer_1.conv_2.conv.weight: copying a param with shape torch.Size([8, 4, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).
        ... (analogous size mismatches for every conv_1 and conv_2 weight through denseblock_5.denselayer_8) ...

    These are the training arguments used in my image classification prediction script: args = parser.parse_args(["--model", "condensenet_converted", "-b", "64", "-j", "20", "imagenet", "--stages", "4-6-8-10-8", "--growth", "8-16-32-64-128", "--gpu", "0"]). I have tried both the (C=G=4) and (C=G=8) pre-trained models from this repo. Thank you.

    opened by LeighDavis 5
  • the version of pytorch

    There are several problems with the train loader from torchvision under my torch version (1.0.1). Could you provide a requirements.txt or an updated version of the code that works with torch 1.0.1?

    opened by cambridgeinch 1
  • Error message: variables needed for gradient computation has been modified by an inplace operation

    Hello, I am running condensenet and densenet_LGC in the PyCharm IDE with Python 3.6.5 :: Anaconda, Inc., with the following configuration: dataset = cifar10, epochs = 200, bottleneck = 3, growth = 12-12-12, C = G = 4, batch size = 64.

    I get this error at exactly the 34th epoch every time. I have tried removing in-place addition operations and inplace=True from ReLU, but nothing has worked. Can you please help me figure out how it can be fixed? Regards

    Epoch - 31 * Accuracy@1 89.150 Accuracy@5 99.520
    Epoch - 32 * Accuracy@1 88.120 Accuracy@5 99.610
    Epoch - 33 * Accuracy@1 88.640 Accuracy@5 99.630
    Epoch - 34
    Traceback (most recent call last):
      File "/home/supernet/PycharmProjects/untitled/nets/CondenseNet-Nauman/main.py", line 497, in <module>
        main()
      File "/home/supernet/PycharmProjects/untitled/nets/CondenseNet-Nauman/main.py", line 257, in main
        train(train_loader, model, criterion, optimizer, epoch)
      File "/home/supernet/PycharmProjects/untitled/nets/CondenseNet-Nauman/main.py", line 340, in train
        loss.backward()
      File "/home/supernet/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/home/supernet/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 89, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

    opened by nauman46055 1
Owner
Shichen Liu
PhD student at USC