95.47% on CIFAR10 with PyTorch


Train CIFAR10 with PyTorch

I'm playing with PyTorch on the CIFAR10 dataset.


  • Python 3.6+
  • PyTorch 1.0+


# Start training with: 
python main.py

# You can manually resume the training with: 
python main.py --resume --lr=0.01
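
The two flags above suggest a small command-line interface. The sketch below shows one way such flags could be parsed with `argparse`; the default learning rate of 0.1, the help strings, and the `build_parser` name are assumptions made for illustration, not details taken from the text above.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the --resume and --lr flags shown above.
    parser = argparse.ArgumentParser(description="CIFAR10 training (sketch)")
    # Assumed default; the commands above only show --lr=0.01 when resuming.
    parser.add_argument("--lr", default=0.1, type=float, help="learning rate")
    # A boolean switch: present means True, absent means False.
    parser.add_argument("--resume", action="store_true",
                        help="resume from a saved checkpoint")
    return parser


# Parse the same arguments as the resume command above.
args = build_parser().parse_args(["--resume", "--lr=0.01"])
print(args.resume, args.lr)
```

Passing `--lr` on resume overrides whatever default the script uses, which is presumably why the resume command lowers it explicitly.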


Model               Acc.
VGG16               92.64%
ResNet18            93.02%
ResNet50            93.62%
ResNet101           93.75%
RegNetX_200MF       94.24%
RegNetY_400MF       94.29%
MobileNetV2         94.43%
ResNeXt29(32x4d)    94.73%
ResNeXt29(2x64d)    94.82%
SimpleDLA           94.89%
DenseNet121         95.04%
PreActResNet18      95.11%
DPN92               95.16%
DLA                 95.47%
  Progress_bar value unpack error

        from utils import progress_bar
      File "C:\Users\Debadri\pytorch-cifar\utils.py", line 45, in <module>
        _, term_width = os.popen('stty size', 'r').read().split()
    ValueError: not enough values to unpack (expected 2, got 0)

    I'm getting this error when I run the command "python main.py".

    Is there any argument I need to pass?

    opened by debadridtt 10
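
The message "expected 2, got 0" indicates that `stty size` printed nothing, which is what one would expect on Windows, where `stty` is normally not available. A minimal sketch of a portable alternative, assuming the only goal is to obtain the terminal width, is to use the standard library's `shutil.get_terminal_size`. The `get_term_width` helper and its default of 80 columns are hypothetical.

```python
import shutil


def get_term_width(default=80):
    # shutil.get_terminal_size is part of the standard library and does not
    # shell out to `stty`; it uses the fallback when no terminal is attached.
    columns, _rows = shutil.get_terminal_size(fallback=(default, 24))
    # Guard against environments that report a width of zero.
    return columns if columns > 0 else default


term_width = get_term_width()
print(term_width)
```

No extra command-line argument is involved here; the failure comes from how the terminal size is queried, not from how the script is invoked.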
  VGG16 Model and Performance on Cifar-10

    I ran the VGG16 model on CIFAR-10, but the accuracy on the test images is only 89%. I set epoch = 350 and adjusted the learning rate as you described. Also, why do you set classifier=nn.Linear(512,10)? How is the number 512 determined? Thanks!

    opened by muzi-8 8
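
On the question of where 512 comes from, the arithmetic can be sketched by walking a VGG16-style layer list. The list below is the standard VGG16 layout and is assumed here rather than quoted from the repository; the sketch also assumes 3x3 convolutions with padding 1 (which keep the spatial size) and 2x2 max pools with stride 2.

```python
# Assumed VGG16 layout: integers are conv output channels, "M" is a 2x2 max pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def flattened_features(cfg, side):
    # Track channel count and spatial side length through the layer stack.
    channels = 3
    for item in cfg:
        if item == "M":
            side //= 2          # each stride-2 pool halves height and width
        else:
            channels = item     # a padded 3x3 conv only changes the channels
    # Number of values per image once the feature map is flattened.
    return channels * side * side


print(flattened_features(VGG16_CFG, 32))
```

With a 32x32 input, five poolings shrink the map to 1x1, so the flattened size is 512 x 1 x 1 = 512, which matches the input size of the linear classifier.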
  preactivation resnet

    Hi --

    I was looking at the implementation of the preactivation ResNet and had a question -- in the paper, they show the preactivation block below (on the right): [screenshot from the paper]

    which AFAICT should look like:

        def forward(self, x):
            shortcut = self.shortcut(x)
            out = self.bn1(x)
            out = F.relu(out)
            out = self.conv1(out)
            out = self.bn2(out)
            out = F.relu(out)
            out = self.conv2(out)
            out += shortcut
            return out

    which is slightly different from your implementation:

        def forward(self, x):
            out = self.bn1(x)
            out = F.relu(out)
            shortcut = self.shortcut(out)
            out = self.conv1(out)
            out = self.bn2(out)
            out = F.relu(out)
            out = self.conv2(out)
            out += shortcut
            return out

    In the picture, I think your implementation would look like shifting the first bn and relu into the grey band, then splitting into two branches.

    Any thoughts? Am I misinterpreting, or is there a reason for the difference?

    opened by bkj 6
  'ResNet' object has no attribute 'to'

    Hi, I'm a complete newbie in machine learning and I am trying out the transfer learning tutorial on the PyTorch website https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html. When I run the file, this error pops up: AttributeError: 'ResNet' object has no attribute 'to'. Can anyone help me?

    opened by cuttrex 5
  What are the right hyperparameters for MobileNetV2 to get 94.47% acc on CIFAR10?

    The first time I ran the MobileNetV2 experiment, it gave me an error in the forward function, so I changed the kernel size of the avg pooling layer from 2 to 4. I trained the model with batch size 128 and lr 0.1 (x0.1 when epoch in [150, 250]). It gives me around 91.0%. What could be the problem? Thanks!

    opened by birdyLinch 5
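
The schedule described in this post can be written down as a small function. This is a sketch of one reading of "x0.1 when epoch in [150, 250]", namely that the learning rate is multiplied by 0.1 once at epoch 150 and again at epoch 250; the `step_lr` name and its parameters are hypothetical.

```python
def step_lr(epoch, base_lr=0.1, milestones=(150, 250), gamma=0.1):
    # Count how many milestones have been reached by this epoch.
    drops = sum(1 for m in milestones if epoch >= m)
    # Apply one multiplication by gamma per milestone reached.
    return base_lr * gamma ** drops


for epoch in (0, 149, 150, 249, 250):
    print(epoch, step_lr(epoch))
```

In PyTorch, `torch.optim.lr_scheduler.MultiStepLR` implements the same idea on top of an optimizer. Whether this schedule is the one behind the reported accuracy is not stated in the post.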
  Standard deviation for transforms.Normalize

    How did you calculate the standard deviation values for transforms.Normalize? I am getting the same means, but different standard deviations:

    import numpy as np
    import torch
    from torchvision import datasets
    from torchvision import transforms

    # Load the training images without augmentation, scaled to [0, 1].
    transform_train = transforms.Compose([
    #     transforms.RandomCrop(32, padding=4),
    #     transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ])
    trainset = datasets.CIFAR10(root='data', train=True, download=True, transform=transform_train)
    # A single batch that holds all 50,000 training images.
    train_loader = torch.utils.data.DataLoader(trainset, batch_size=50_000, shuffle=True)
    train = next(iter(train_loader))[0]
    # Per-channel statistics over the image, height and width axes.
    print('Mean: {}'.format(np.mean(train.numpy(), axis=(0, 2, 3))))
    # Mean: [ 0.49139765  0.48215759  0.44653141]
    print('STD: {}'.format(np.std(train.numpy(), axis=(0, 2, 3))))
    # STD: [ 0.24703199  0.24348481  0.26158789]
    opened by codyaustun 5
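
One possible source of such a gap, offered as a guess rather than a confirmed answer, is the order of operations: taking one standard deviation over all pixels of all images is not the same as taking it inside each image and averaging the results. The toy example below uses two made-up single-channel images to show that the two procedures can disagree while the mean stays the same.

```python
from statistics import mean, pstdev

# Two made-up single-channel "images" of four pixels each.
images = [
    [0.0, 0.0, 0.0, 0.0],   # uniformly dark image
    [1.0, 1.0, 1.0, 1.0],   # uniformly bright image
]

# Method A: pool every pixel of every image, then take one population std.
pooled_std = pstdev([p for img in images for p in img])

# Method B: take the std inside each image, then average across images.
mean_of_stds = mean(pstdev(img) for img in images)

print(pooled_std, mean_of_stds)
```

Here the pooled value is 0.5 and the per-image average is 0.0, because each image is constant. Method B can never exceed Method A, which is at least consistent with the smaller values in `transforms.Normalize`, but the post itself does not say how those values were obtained.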
  VGG16 and Resnet18 on other dataset (STL10) issue

    These networks work well on CIFAR10 and CIFAR100. Now I want to use VGG16 with another dataset, STL10, whose images are 96 x 96 x 3, but it gives me the error below. How can I fix it?

      File "/home/kumar/kumar/ResNet18 CIFAR10/models/vgg.py", line 48, in test
        y = net(x)
      File "/home/kumar/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/kumar/kumar/ResNet18 CIFAR10/models/vgg.py", line 26, in forward
        out = self.classifier(out)
      File "/home/kumar/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/kumar/.local/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 87, in forward
        return F.linear(input, self.weight, self.bias)
      File "/home/kumar/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1369, in linear
        ret = torch.addmm(bias, input, weight.t())
    RuntimeError: size mismatch, m1: [2 x 4608], m2: [512 x 10] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:752

    opened by TeerathChandani 4
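
The numbers in this traceback are consistent with a classifier sized for 32x32 inputs receiving a larger feature map. A minimal sketch of the arithmetic, assuming five stride-2 poolings and 512 channels in the last convolutional stage (both assumptions about the network, not quoted from it):

```python
def classifier_in_features(image_side, n_pools=5, out_channels=512):
    # Each stride-2 pool halves the spatial side (integer division).
    side = image_side // (2 ** n_pools)
    # Flattened size of the final feature map for one image.
    return out_channels * side * side


print(classifier_in_features(32))   # CIFAR-sized input
print(classifier_in_features(96))   # STL10-sized input
```

A 96x96 image leaves a 3x3 map, so each image flattens to 512 x 3 x 3 = 4608 values, matching the `[2 x 4608]` in the error, while the linear layer expects 512. Two common ways out are to size the linear layer for 4608 inputs or to pool the feature map down to 1x1 before the classifier (for example with `nn.AdaptiveAvgPool2d(1)`); which one suits STL10 better is not addressed in the post.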
  running error

    ubuntu@ip-172-31-35-111:~/pytorch-cifar$ python3 main.py
    Traceback (most recent call last):
      File "main.py", line 16, in <module>
        from models import *
      File "/home/ubuntu/pytorch-cifar/models/__init__.py", line 1, in <module>
        from vgg import *
    ModuleNotFoundError: No module named 'vgg'

    opened by GuanhuaWang 4
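
In Python 3, a module inside a package has to import its siblings with an explicit relative import, so a bare `from vgg import *` inside the package looks for a top-level module named `vgg` and fails. The sketch below builds a throwaway package to show the relative form working; the `demo_models` package and the stub `VGG` function are hypothetical stand-ins for the layout in the traceback.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: demo_models/__init__.py and demo_models/vgg.py.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_models"
pkg.mkdir()
(pkg / "vgg.py").write_text("def VGG():\n    return 'vgg'\n")

# The leading dot makes the import relative to the package itself.
(pkg / "__init__.py").write_text("from .vgg import *\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
demo_models = importlib.import_module("demo_models")
print(demo_models.VGG())
```

Applied to the traceback above, this would mean changing line 1 of `models/__init__.py` to `from .vgg import *`, assuming the file still uses the Python 2 style import shown there.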
  AttributeError: 'ResNet' object has no attribute 'to'

      File "main.py", line 69, in <module>
        net = net.to(device)
    AttributeError: 'ResNet' object has no attribute 'to'

    Is there anybody who got the same error?

    opened by ggeh 4
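
This error typically means the installed PyTorch predates version 0.4, the release that introduced `Module.to`. Upgrading is the simplest fix, since the README lists PyTorch 1.0+ as a prerequisite. For illustration only, here is a sketch of a helper that falls back to the older `.cuda()` and `.cpu()` methods; `move_to_device` and the `LegacyNet` stand-in are hypothetical.

```python
def move_to_device(module, device):
    # Prefer the modern API when the installed PyTorch provides it.
    if hasattr(module, "to"):
        return module.to(device)
    # Fallback for old releases that only offer .cuda() and .cpu().
    return module.cuda() if str(device).startswith("cuda") else module.cpu()


class LegacyNet:
    # Hypothetical stand-in for a model from an old PyTorch: no .to method.
    def cuda(self):
        return "moved with .cuda()"

    def cpu(self):
        return "moved with .cpu()"


print(move_to_device(LegacyNet(), "cpu"))
```

The stand-in class only mimics the missing method so that the fallback path can be exercised without an old installation.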
  How come transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))?

    Forgive my ignorance, but I am wondering about the following code: transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)). Why are these the numbers for (mean, std)? Were they computed over a mini-batch (and if so, how large a batch) or over the whole dataset?


    opened by xwuaustin 3
  What does the 1x1 average pooling layer do in VGG models?

    In this line: https://github.com/kuangliu/pytorch-cifar/blob/master/models/vgg.py#L37 There is a 1x1 average pooling layer with stride 1. According to my knowledge of pooling, this layer basically does nothing?

    opened by JC-S 2
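
The claim that a 1x1 average pool with stride 1 does nothing can be checked with a small pure-Python pooling routine. This is an illustrative sketch, not the PyTorch implementation; `avg_pool2d` here is a hypothetical helper that works on nested lists without padding.

```python
def avg_pool2d(grid, kernel, stride):
    # Plain-Python average pooling over a 2D list, no padding.
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - kernel + 1, stride):
        row = []
        for c in range(0, cols - kernel + 1, stride):
            # Collect the kernel x kernel window anchored at (r, c).
            window = [grid[r + i][c + j]
                      for i in range(kernel) for j in range(kernel)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


feature_map = [[1.0, 2.0], [3.0, 4.0]]
# With kernel 1 and stride 1, every window holds exactly one value.
print(avg_pool2d(feature_map, kernel=1, stride=1) == feature_map)
```

Each window contains a single element, so its average is that element and the output equals the input. By the time the feature map is already 1x1, such a layer is an identity operation.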
  Errors when testing on CPU

    layer_name: <class 'torch.nn.modules.conv.Conv2d'>, total_params: 15121584, total_traina_params: 15121584, n_layers: 39
    device:  cpu
    Traceback (most recent call last):
      File "main.py", line 208, in <module>
      File "main.py", line 189, in test
        outputs = net(inputs)
      File "/home/brcao/Apps/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/brcao/Repos/pytorch-cifar/models/dla_simple.py", line 106, in forward
        out = self.base(x)
      File "/home/brcao/Apps/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/brcao/Apps/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
        input = module(input)
      File "/home/brcao/Apps/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/brcao/Apps/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
        return self._conv_forward(input, self.weight)
      File "/home/brcao/Apps/anaconda3/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
        return F.conv2d(input, weight, self.bias, self.stride,
    RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _thnn_conv2d_forward
    opened by bryanbocao 9
  ResNet18 performs much better than expected!

    I ran this repo with ResNet18 on 4 GPUs with the latest PyTorch version and got 95.67% instead of the 93.02% reported in the README table. Has anything improved in PyTorch that could explain this major improvement?

    My requirements list: numpy==1.22.3 torch==1.11.0+cu115 torchvision==0.12.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html

    opened by arashash 5
