Model summary in PyTorch similar to `model.summary()` in Keras

Overview

Keras style model.summary() in PyTorch

Keras has a neat API for viewing a summary of a model, which is very helpful when debugging a network. Here is some barebones code that tries to mimic it in PyTorch. The aim is to provide information complementary to what print(your_model) shows in PyTorch.

Usage

  • pip install torchsummary or
  • git clone https://github.com/sksq96/pytorch-summary
from torchsummary import summary
summary(your_model, input_size=(channels, H, W))
  • Note that the input_size is required to make a forward pass through the network.
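The output shapes reported in a summary depend on the input size, which is why a concrete size has to be supplied. As a rough illustration only (this is the standard convolution and pooling size arithmetic, not torchsummary's own code, and `conv_out` is a hypothetical helper), the spatial size after a layer can be worked out by hand:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

# A 28x28 input through a 5x5 convolution, then 2x2 max pooling with stride 2.
after_conv = conv_out(28, kernel=5)
after_pool = conv_out(after_conv, kernel=2, stride=2)
print(after_conv, after_pool)  # 24 12
```

Changing the input from 28 to any other size changes every downstream shape, so the shapes cannot be derived from the layer definitions alone.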

Examples

CNN for MNIST

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # PyTorch v0.4.0
model = Net().to(device)

summary(model, (1, 28, 28))
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 10, 24, 24]             260
            Conv2d-2             [-1, 20, 8, 8]           5,020
         Dropout2d-3             [-1, 20, 8, 8]               0
            Linear-4                   [-1, 50]          16,050
            Linear-5                   [-1, 10]             510
================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.06
Params size (MB): 0.08
Estimated Total Size (MB): 0.15
----------------------------------------------------------------
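The Param # column above can be checked by hand. The sketch below uses the usual parameter formulas for convolution and linear layers with a bias term; the helper names are hypothetical and are not part of torchsummary.

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + out_ch

def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output feature."""
    return in_features * out_features + out_features

counts = {
    "Conv2d-1": conv2d_params(1, 10, 5),
    "Conv2d-2": conv2d_params(10, 20, 5),
    "Linear-4": linear_params(320, 50),
    "Linear-5": linear_params(50, 10),
}
total = sum(counts.values())
print(counts, total)
```

The four values come out as 260, 5,020, 16,050 and 510, which add up to the 21,840 total shown in the table.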

VGG16

import torch
from torchvision import models
from torchsummary import summary

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
vgg = models.vgg16().to(device)

summary(vgg, (3, 224, 224))
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
             ReLU-16          [-1, 256, 56, 56]               0
        MaxPool2d-17          [-1, 256, 28, 28]               0
           Conv2d-18          [-1, 512, 28, 28]       1,180,160
             ReLU-19          [-1, 512, 28, 28]               0
           Conv2d-20          [-1, 512, 28, 28]       2,359,808
             ReLU-21          [-1, 512, 28, 28]               0
           Conv2d-22          [-1, 512, 28, 28]       2,359,808
             ReLU-23          [-1, 512, 28, 28]               0
        MaxPool2d-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
             ReLU-28          [-1, 512, 14, 14]               0
           Conv2d-29          [-1, 512, 14, 14]       2,359,808
             ReLU-30          [-1, 512, 14, 14]               0
        MaxPool2d-31            [-1, 512, 7, 7]               0
           Linear-32                 [-1, 4096]     102,764,544
             ReLU-33                 [-1, 4096]               0
          Dropout-34                 [-1, 4096]               0
           Linear-35                 [-1, 4096]      16,781,312
             ReLU-36                 [-1, 4096]               0
          Dropout-37                 [-1, 4096]               0
           Linear-38                 [-1, 1000]       4,097,000
================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 218.59
Params size (MB): 527.79
Estimated Total Size (MB): 746.96
----------------------------------------------------------------
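The size estimates above appear to be consistent with simple arithmetic: 4 bytes per value, and activations counted twice (once for the forward pass, once for the backward pass). The sketch below reproduces the VGG16 figures under those assumptions; it is an illustration, not the library's code, and the layer list is transcribed from the table.

```python
MB = 1024 ** 2
BYTES_PER_VALUE = 4  # assumption: float32 everywhere

def to_mb(num_values):
    return num_values * BYTES_PER_VALUE / MB

# (channels, spatial size, number of rows with that output shape), read off the table.
feature_maps = [(64, 224, 4), (64, 112, 1), (128, 112, 4), (128, 56, 1),
                (256, 56, 6), (256, 28, 1), (512, 28, 6), (512, 14, 1),
                (512, 14, 6), (512, 7, 1)]
classifier = [4096] * 6 + [1000]
activations = sum(c * s * s * n for c, s, n in feature_maps) + sum(classifier)

input_mb = to_mb(3 * 224 * 224)
fwd_bwd_mb = 2 * to_mb(activations)   # assumption: forward + backward doubles it
params_mb = to_mb(138_357_544)
total_mb = input_mb + fwd_bwd_mb + params_mb
print(f"{input_mb:.2f} {fwd_bwd_mb:.2f} {params_mb:.2f} {total_mb:.2f}")
```

The total is summed before rounding, which is why it reads 746.96 rather than the 746.95 obtained by adding the three rounded lines.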

Multiple Inputs

import torch
import torch.nn as nn
from torchsummary import summary

class SimpleConv(nn.Module):
    def __init__(self):
        super(SimpleConv, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
        )

    def forward(self, x, y):
        x1 = self.features(x)
        x2 = self.features(y)
        return x1, x2
    
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleConv().to(device)

summary(model, [(1, 16, 16), (1, 28, 28)])
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1            [-1, 1, 16, 16]              10
              ReLU-2            [-1, 1, 16, 16]               0
            Conv2d-3            [-1, 1, 28, 28]              10
              ReLU-4            [-1, 1, 28, 28]               0
================================================================
Total params: 20
Trainable params: 20
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.77
Forward/backward pass size (MB): 0.02
Params size (MB): 0.00
Estimated Total Size (MB): 0.78
----------------------------------------------------------------
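Note that `self.features` is a single module applied to both inputs, so its one Conv2d (1*1*3*3 weights plus 1 bias, 10 parameters) appears twice in the table and the reported total is 20. The sketch below illustrates the difference between counting per call and counting per unique module; the row tuples and module identifiers are made-up stand-ins, not torchsummary data structures.

```python
# Each entry mimics one row of the table above: (row name, stand-in module id, params).
# Rows 1 and 3 refer to the same Conv2d object, rows 2 and 4 to the same ReLU.
rows = [("Conv2d-1", "conv", 10), ("ReLU-2", "relu", 0),
        ("Conv2d-3", "conv", 10), ("ReLU-4", "relu", 0)]

per_call_total = sum(params for _, _, params in rows)

unique = {}
for _, module_id, params in rows:
    unique[module_id] = params  # a module seen twice is stored once
unique_total = sum(unique.values())
print(per_call_total, unique_total)  # 20 10
```

In PyTorch itself, `sum(p.numel() for p in model.parameters())` counts each parameter once, so it can serve as a cross-check when layers are reused.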

License

pytorch-summary is MIT-licensed.

Comments
  • Multi-input error

    I tried to summarize my multi-input network as shown in README.md. The network runs a forward pass correctly, but torchsummary reports a mismatch error.

    The module is defined as:

    class SiameseNets(nn.Module):
        def __init__(self):
            super(SiameseNets, self).__init__()
            self.conv1 = nn.Conv2d(1, 64, 10)
            self.conv2 = nn.Conv2d(64, 128, 7)
            self.conv3 = nn.Conv2d(128, 128, 4)
            self.conv4 = nn.Conv2d(128, 256, 4)
    
            self.pooling = nn.MaxPool2d(2, 2)
            self.fc1 = nn.Linear(256, 4096)
            self.fc2 = nn.Linear(4096, 1)
            self.dropout = nn.Dropout(0.5)
    
        def forward(self, x1, x2):
            x1 = self.pooling(F.relu(self.conv1(x1)))
            x1 = self.pooling(F.relu(self.conv2(x1)))
            x1 = self.pooling(F.relu(self.conv3(x1)))
            x1 = self.pooling(F.relu(self.conv4(x1)))
    
            x2 = self.pooling(F.relu(self.conv1(x2)))
            x2 = self.pooling(F.relu(self.conv2(x2)))
            x2 = self.pooling(F.relu(self.conv3(x2)))
            x2 = self.pooling(F.relu(self.conv4(x2)))
    
            x1 = x1.view(-1)
            x2 = x2.view(-1)
    
            x1 = self.fc1(x1)
            x2 = self.fc1(x2)
    
            metric = torch.abs(x1 - x2)
            similarity = F.sigmoid(self.fc2(self.dropout(metric)))
            return similarity
    

    Call function as: summary(net, [(1, 88, 88), (1, 88, 88)])

    Error:

    Traceback (most recent call last):
      File "/Users/***/opt/anaconda3/envs/pytorch/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-52-3e3d0d779b7b>", line 1, in <module>
        summary(net,[(1,88,88),(1,88,88)])
      File "/Users/***/opt/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchsummary/torchsummary.py", line 72, in summary
        model(*x)
      File "/Users/***/opt/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/Users/***/PycharmProjects/ML/models.py", line 61, in forward
        x1 = self.fc1(x1)
      File "/Users/***/opt/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
    

    So, did I do anything wrong? I will check my code again while waiting for your answer. Thanks! XD

    opened by TangChiaHsin 7
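One possible cause of the mismatch reported above, offered as a guess rather than a confirmed diagnosis: the torchsummary snippet quoted in the dtype issue carries the comment "batch_size of 2 for batchnorm", so the generated input likely has a batch of 2, and `x1.view(-1)` flattens the batch dimension together with the features. The shape arithmetic for an 88x88 input can be traced in plain Python (helper names are hypothetical):

```python
def conv_out(size, kernel):
    return size - kernel + 1          # stride 1, no padding

def pool_out(size):
    return size // 2                  # MaxPool2d(2, 2)

size = 88
for kernel in (10, 7, 4, 4):          # conv1..conv4 kernel sizes
    size = pool_out(conv_out(size, kernel))

per_sample = 256 * size * size        # conv4 has 256 output channels
batch = 2                             # assumption: batch size of the generated test input
flattened = batch * per_sample        # length of x.view(-1) for the whole batch
print(size, per_sample, flattened)    # 1 256 512
```

If that is what happens, `fc1` (which expects 256 features) would receive a vector of 512. Flattening per sample, for example with `x1.view(x1.size(0), -1)`, would keep the batch dimension and give `fc1` 256 features per row.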
  • Multiple inputs with different dtype error

    When using multiple inputs with different types, torchsummary generates random inputs that all have the same type, torch.FloatTensor.

    You can delete the dtype assignment and instead pass the dtype as a parameter to get random inputs of different types:

        ## del
        # if device == "cuda" and torch.cuda.is_available():
        #     dtype = torch.cuda.FloatTensor
        # else:
        #     dtype = torch.FloatTensor
    
        # multiple inputs to the network
        if isinstance(input_size, tuple):
            input_size = [input_size]
    
        # batch_size of 2 for batchnorm
        # modified
        x = [torch.rand(*in_size).type(dtype) for in_size, dtype in input_size]
        # print(type(x[0]))
    
    

    Then get the summary of your model with: summary(model, [((size1), dtype1), ((size2), dtype2)])

    opened by trajepl 5
  • whats wrong with it

    File "main.py", line 301, in <module>
      summary(model.to(hyperparams['device']), input.size()[1:], device=hyperparams['device'])
    File "/home/anaconda3/lib/python3.6/site-packages/torchsummary/torchsummary.py", line 44, in summary
      device = device.lower()

    AttributeError: 'torch.device' object has no attribute 'lower'

    opened by JoJo-ops 5
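The traceback above shows `device.lower()` being called on a `torch.device` object, so this version evidently expects a plain string such as "cuda" or "cpu". A minimal sketch of one way to accept both forms follows; `device_name` is a hypothetical helper, and `FakeDevice` is a stand-in used only so the example runs without PyTorch (the string form of a `torch.device` is its name, e.g. "cuda" or "cuda:0").

```python
def device_name(device):
    """Return a lowercase device string for either a str or a device-like object."""
    return str(device).lower()

class FakeDevice:
    """Stand-in for torch.device, used only so this sketch runs without PyTorch."""
    def __init__(self, kind):
        self.kind = kind

    def __str__(self):
        return self.kind

print(device_name("CUDA"), device_name(FakeDevice("cpu")))  # cuda cpu
```

On the caller's side, passing the string itself (for example `device="cuda"`) rather than a `torch.device` avoids the error without touching the library.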
  • Syntax error during importing

    I got a syntax error just by importing

    from torchsummary import summary

     File "main.py", line 21, in <module>
        from torchsummary import summary
      File "/home/ivan/vox/torchsummary/__init__.py", line 1, in <module>
        from .torchsummary import summary
      File "/home/ivan/vox/torchsummary/torchsummary.py", line 9
        def summary(model, *input_size, batch_size=-1, device="cuda"):
                                                 ^
    SyntaxError: invalid syntax
    

    Could you suggest where to look for the catch? Thanks

    opened by IvanEz 5
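In the traceback above, the caret points at the `batch_size` keyword argument that follows `*input_size`. Keyword-only arguments after a starred parameter are valid in Python 3 but a syntax error in Python 2, so the interpreter version is the first thing worth checking. A small sketch with the same signature shape (the function name and body are illustrative only):

```python
import sys

def describe(model, *input_size, batch_size=-1, device="cuda"):
    """Same signature shape as in the traceback; the body is illustrative only."""
    return model, input_size, batch_size, device

print(sys.version_info.major)
print(describe("net", (1, 28, 28), device="cpu"))
```

If this snippet itself fails with the same SyntaxError, the script is being run by a Python 2 interpreter.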
  • When running DenseNet and MobileNet I got the following errors

    MobileNet:

    Traceback (most recent call last):
      File "MobileNet.py", line 365, in <module>
        summary(model, (3,32,32))
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torchsummary/torchsummary.py", line 56, in summary
        model(x)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
        result = self.forward(*input, **kwargs)
      File "MobileNet.py", line 114, in forward
        x = self.model(x)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
        input = module(input)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
        input = module(input)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in __call__
        hook_result = hook(self, input, result)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torchsummary/torchsummary.py", line 28, in hook
        params += th.prod(th.LongTensor(list(module.bias.size())))
    AttributeError: 'NoneType' object has no attribute 'size'

    DenseNet:

    Traceback (most recent call last):
      File "DenseNet.py", line 237, in <module>
        summary(model, (3,32,32))
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torchsummary/torchsummary.py", line 56, in summary
        model(x)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
        result = self.forward(*input, **kwargs)
      File "DenseNet.py", line 226, in forward
        features = self.features(x)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
        input = module(input)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in __call__
        hook_result = hook(self, input, result)
      File "/home/sriharsha/miniconda3/lib/python3.6/site-packages/torchsummary/torchsummary.py", line 28, in hook
        params += th.prod(th.LongTensor(list(module.bias.size())))
    AttributeError: 'bool' object has no attribute 'size'

    opened by sriharsha0806 5
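Both tracebacks above fail on `module.bias.size()`: in one case `bias` is `None` (a layer created without a bias), in the other it is a plain `bool` attribute that happens to be named `bias`. A guarded count along the following lines would sidestep both. This is a sketch with a hypothetical helper name, and `SimpleNamespace` objects stand in for real modules so it runs without PyTorch.

```python
from math import prod
from types import SimpleNamespace

def bias_param_count(module):
    """Count bias elements only when `bias` looks like a tensor (has a callable `size`)."""
    bias = getattr(module, "bias", None)
    if bias is None or isinstance(bias, bool) or not callable(getattr(bias, "size", None)):
        return 0
    return prod(bias.size())

# Stand-ins for modules; real code would receive nn.Module instances.
with_bias = SimpleNamespace(bias=SimpleNamespace(size=lambda: (32,)))
no_bias = SimpleNamespace(bias=None)
flag_bias = SimpleNamespace(bias=False)
print([bias_param_count(m) for m in (with_bias, no_bias, flag_bias)])  # [32, 0, 0]
```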
  • RuntimeError: Failed to run torchsummary. See above stack traces for more details. Executed layers up to: []

    I am trying to run the following code but keep getting an error. Can anyone help me out?

    This is my code:

    from gc_layer import GatedConv2d
    import torch
    import torch.nn as nn
    from base_model import BaseModel

    class Discriminator(BaseModel):
        def __init__(self, channels = 64):
            super(Discriminator, self).__init__()
            self.channels = channels
            input_dim = 3
            self.init_weights()
            self.gt_conv1 = GatedConv2d(input_dim+1, channels, kernel_size=3, dilation=1, pad_type='zero', padding=1, activation='lrelu')
            # self.lk1 = nn.LeakyReLU()
            self.gt_conv2 = GatedConv2d(channels, channels*2, kernel_size=3, dilation=1, pad_type='zero',padding=1, activation='lrelu')
            # self.lk2 = nn.LeakyReLU()
            self.gt_conv3 = GatedConv2d(channels*2, channels*4, kernel_size=3, dilation=1, pad_type='zero', padding=1, activation='lrelu')
            # self.lk3 = nn.LeakyReLU()
            self.gt_conv4 = GatedConv2d(channels*4, channels*8, kernel_size=3, dilation=1, pad_type='zero', padding=1, activation='lrelu')
            # self.lk4 = nn.LeakyReLU()
            self.gt_conv5 = GatedConv2d(channels*8, channels*8, kernel_size=3, dilation=1, pad_type='zero', padding=1, activation='lrelu')
            # self.lk5 = nn.LeakyReLU()
            self.gt_conv6 = GatedConv2d(channels*8, channels*8, kernel_size=3, dilation=1, pad_type='zero', padding=1, activation='lrelu')
            # self.lk6 = nn.LeakyReLU()
            self.gt_conv7 = GatedConv2d(channels*8, channels*8, kernel_size=3, dilation=1, pad_type='zero', padding=1, activation='lrelu')
            # self.lk7 = nn.LeakyReLU()

        def forward(self, inputs):
            # x_in = torch.cat([inputs, mask], dim=1)
            output = self.gt_conv1(inputs)
            # output = self.lk1(output)
            output = self.gt_conv2(output)
            # output = self.lk2(output)
            output = self.gt_conv3(output)
            # output = self.lk3(output)
            output = self.gt_conv4(output)
            # output = self.lk4(output)
            output = self.gt_conv5(output)
            # output = self.lk5(output)
            output = self.gt_conv6(output)
            # output = self.lk6(output)
            output = self.gt_conv7(output)
            # output = self.lk7(output)
            return output

    if __name__ == "__main__":
        model = Discriminator()
        print(model)
        from torchsummary import summary
        print(summary(model, (3,256,256), 1))
    
    

    And this is the error:

    Traceback (most recent call last):
      File "/home/huynth/miniconda3/envs/inpainting/lib/python3.8/site-packages/torchsummary/torchsummary.py", line 140, in summary
        _ = model.to(device)(*x, *args, **kwargs) # type: ignore[misc]
      File "/home/huynth/miniconda3/envs/inpainting/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
    TypeError: forward() takes 2 positional arguments but 3 were given

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/home/huynth/Hypergraph-Inpainting/models/discriminator.py", line 57, in <module>
        print(summary(model, (3,256,256), 1))
      File "/home/huynth/miniconda3/envs/inpainting/lib/python3.8/site-packages/torchsummary/torchsummary.py", line 143, in summary
        raise RuntimeError(
    RuntimeError: Failed to run torchsummary. See above stack traces for more details. Executed layers up to: []
    

    Thank you!

    opened by huynth1801 4
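The first traceback above shows the model being called as `model.to(device)(*x, *args, **kwargs)`, which suggests that any extra positional argument given to `summary` is forwarded to `forward()`. Under that assumption, the trailing `1` in `summary(model, (3,256,256), 1)` would explain "takes 2 positional arguments but 3 were given". The sketch below reproduces the mechanism with hypothetical stand-ins (`TinyModel`, `summary_like`); it is not the library's implementation.

```python
class TinyModel:
    def __call__(self, inputs):            # plays the role of forward(self, inputs)
        return inputs

def summary_like(model, input_size, *args, **kwargs):
    """Hypothetical stand-in: extra positional arguments are passed on to the model."""
    x = [0] * len(input_size)              # placeholder for a generated input
    return model(x, *args, **kwargs)

try:
    summary_like(TinyModel(), (3, 256, 256), 1)   # the trailing 1 reaches the model call
    failed = False
except TypeError as exc:
    failed = True
    print(exc)

ok = summary_like(TinyModel(), (3, 256, 256))     # no extra positional argument
```

Dropping the third positional argument, or passing the intended option by keyword if the installed version supports it, would be the first thing to try.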
  • Summary assumes Cuda

    Summary assumes that the model is on the GPU if CUDA is available. This is not always the case; it should be left to the user to specify whether to use the GPU or the CPU.

    opened by system123 4
  • Add interface for returning summary string

    Add interface for returning concatenated string of summary, instead of directly printing it.

    This feature is useful for logging.

    • Issue: https://github.com/sksq96/pytorch-summary/issues/99
    opened by greenmonn 3
  • Multiple inputs error

    When using multiple inputs I get the error "TypeError: can't multiply sequence by non-int of type 'tuple'" from np.prod:

    https://github.com/sksq96/pytorch-summary/blob/b50f213f38544ac337beeeda93b03c7e48e69c78/torchsummary/torchsummary.py#L100

    Suggested fix - add the sum function: np.prod(sum(input_size,()))

    opened by mfinean 3
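To make the suggested fix concrete: `sum(input_size, ())` concatenates the per-input tuples into one flat tuple, so the product is taken over every dimension of every input. The sketch below uses `math.prod` from the standard library in place of `np.prod`, and also shows a per-input alternative for comparison; which of the two is the intended quantity is left open here.

```python
from math import prod

input_size = [(1, 16, 16), (1, 28, 28)]

flat = sum(input_size, ())                          # concatenates the tuples
product_of_all = prod(flat)                         # the suggested fix, with math.prod
sum_per_input = sum(prod(s) for s in input_size)    # alternative: elements per input, added up
print(flat, product_of_all, sum_per_input)
```

With these sizes the flattened product is 200,704 values, about 0.77 MB at 4 bytes each, which is the "Input size (MB)" figure shown in the Multiple Inputs example; adding up the elements of each input separately gives 1,040 values instead.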
  • Can the function run when there are sequential containers in the model?

    Hello,

    I have tried your summary function and found that it raises errors when it encounters sequential containers in the model. Does this function work with sequential containers, or is there some other problem in my model?

    Thank you!

    opened by acnokegoo 3
  • RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

    After running the example code from the doc

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchsummary import summary
    
    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
            self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
            self.conv2_drop = nn.Dropout2d()
            self.fc1 = nn.Linear(320, 50)
            self.fc2 = nn.Linear(50, 10)
    
        def forward(self, x):
            x = F.relu(F.max_pool2d(self.conv1(x), 2))
            x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
            x = x.view(-1, 320)
            x = F.relu(self.fc1(x))
            x = F.dropout(x, training=self.training)
            x = self.fc2(x)
            return F.log_softmax(x, dim=1)
    
    # device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # PyTorch v0.4.0
    device = "cpu"
    model = Net().to(device)
    
    summary(model, (1, 28, 28))
    

    Getting

    RuntimeError                              Traceback (most recent call last)
    /tmp/ipykernel_108660/4081865440.py in <module>
         26 model = Net().to(device)
         27 
    ---> 28 summary(model, (1, 28, 28))
    
    ~/.local/lib/python3.9/site-packages/torchsummary/torchsummary.py in summary(model, input_size, batch_size, device)
         70     # make a forward pass
         71     # print(x.shape)
    ---> 72     model(*x)
         73 
         74     # remove these hooks
    
    ~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
       1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
       1101                 or _global_forward_hooks or _global_forward_pre_hooks):
    -> 1102             return forward_call(*input, **kwargs)
       1103         # Do not call functions when jit is used
       1104         full_backward_hooks, non_full_backward_hooks = [], []
    
    /tmp/ipykernel_108660/4081865440.py in forward(self, x)
         14 
         15     def forward(self, x):
    ---> 16         x = F.relu(F.max_pool2d(self.conv1(x), 2))
         17         x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
         18         x = x.view(-1, 320)
    
    ~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
       1118             input = bw_hook.setup_input_hook(input)
       1119 
    -> 1120         result = forward_call(*input, **kwargs)
       1121         if _global_forward_hooks or self._forward_hooks:
       1122             for hook in (*_global_forward_hooks.values(), *self._forward_hooks.values()):
    
    ~/.local/lib/python3.9/site-packages/torch/nn/modules/conv.py in forward(self, input)
        444 
        445     def forward(self, input: Tensor) -> Tensor:
    --> 446         return self._conv_forward(input, self.weight, self.bias)
        447 
        448 class Conv3d(_ConvNd):
    
    ~/.local/lib/python3.9/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
        440                             weight, bias, self.stride,
        441                             _pair(0), self.dilation, self.groups)
    --> 442         return F.conv2d(input, weight, bias, self.stride,
        443                         self.padding, self.dilation, self.groups)
        444 
    
    RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
    
    opened by rohan-paul 2
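In the report above the model was moved to the CPU, but `summary` was called without a `device` argument, and the signature quoted in the import-error issue shows `device="cuda"` as the default. Assuming this version picks the input type the way the snippet quoted in the dtype issue does, the generated input ends up on the GPU while the weights stay on the CPU. The selection logic can be sketched with plain strings (the function name is hypothetical):

```python
def input_tensor_type(device="cuda", cuda_available=True):
    """Mirror of the dtype selection quoted in the dtype issue, using plain strings."""
    if device == "cuda" and cuda_available:
        return "torch.cuda.FloatTensor"
    return "torch.FloatTensor"

print(input_tensor_type())               # default device on a machine with CUDA
print(input_tensor_type(device="cpu"))   # explicit CPU
```

Under that assumption, calling `summary(model, (1, 28, 28), device="cpu")` would keep the input and the weights on the same device.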
  • Failing in the input stage Using Encoder and Decoder Model Architecture (DONUT model)

    I am trying to get a model summary of the donut model but am unable to define the input for the torch summary.

    import argparse
    import gradio as gr
    import torch
    from PIL import Image
    from donut.donut.model import DonutModel
    from torchvision import models
    from torchsummary import summary

    def demo_process_vqa(input_img, question):
        global pretrained_model, task_prompt, task_name
        # pretrained_model = './donut/result/train_docvqa/20220912_103244'
        # task_name = "docvqa"
        # task_prompt = "<s_pdf-donut>"
        input_img = Image.fromarray(input_img)
        user_prompt = task_prompt.replace("{user_input}", question)
        print(user_prompt)
        output = pretrained_model.inference(input_img, prompt=user_prompt)["predictions"][0]
        print('inf_out',output)
        return output

    def demo_process(input_img):
        global pretrained_model, task_prompt, task_name
        input_img = Image.fromarray(input_img)
        output = pretrained_model.inference(image=input_img, prompt=task_prompt)["predictions"][0]
        return output

    parser = argparse.ArgumentParser()
    parser.add_argument("--task", type=str, default="docvqa")
    parser.add_argument("--pretrained_path", type=str, default="train_docvqa_for_all_atts/donut/result/train_docvqa/20220915_125713")
    args, left_argv = parser.parse_known_args()

    task_name = args.task
    if "docvqa" == task_name:
        task_prompt = "<s_taco_eiko_pdf_donut>{user_input}</s_question><s_answer>"
    else:  # rvlcdip, cord, ...
        task_prompt = f"<s_{task_name}>"

    pretrained_model = DonutModel.from_pretrained(args.pretrained_path)

    if torch.cuda.is_available():
        # pretrained_model.half()
        device = torch.device("cuda")
        pretrained_model.to(device)
    else:
        pretrained_model.encoder.to(torch.bfloat16)

    summary(pretrained_model, [(1, 3, 1280 , 960), (1, 21),(1, 21)])

    The input shapes of the encoder and decoder are as follows. Encoder: torch.Size([1, 3, 1280, 960]). Decoder: torch.Size([1, 21]).

    The model's forward pass looks like this:

        encoder_outputs = self.encoder(image_tensors)
        decoder_outputs = self.decoder(
            input_ids=decoder_input_ids,
            encoder_hidden_states=encoder_outputs,
            labels=decoder_labels,
        )
        return decoder_outputs
    

    Can you please advise how to pass the model inputs to summary?

    opened by swapnil-lader 0
  • Fix None Type error while using MultiHeadAttention

    This PR is based on the previous PR #165 by @cainmagi.

    Main changes:

    • Automatically detect and filter out elements that are not array-like in a forward output list/tuple/dict. For example, the MultiHeadAttention module returns a tuple that contains a NoneType value as a placeholder for the attention weights.
    • If the filtered output contains no elements, raise a ValueError to notify the user instead of the original NoneType AttributeError.
    • Replace -1 with batch_size in dict/list/tuple output shapes, because I believe this is more appropriate.
    opened by GCS-ZHN 0
  • AttributeError: ‘NoneType’ object has no attribute ‘size’

    Hello. When I use torchsummary, it reports the following error:

    File "F:\Anaconda3\lib\site-packages\torchsummary\torchsummary.py", line 23, in <listcomp>
      [-1] + list(o.size())[1:] for o in output
    AttributeError: 'NoneType' object has no attribute 'size'

    Here is my model structure:

    CommonsenseGRUModel(
      (linear_in): Linear(in_features=1024, out_features=300, bias=True)
      (norm1a): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (norm1b): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (norm1c): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (norm1d): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (norm3a): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (norm3b): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (norm3c): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (norm3d): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (dropout): Dropout(p=0.5, inplace=False)
      (dropout_rec): Dropout(p=False, inplace=False)
      (cs_rnn_f): CommonsenseRNN(
        (dropout): Dropout(p=False, inplace=False)
        (dialogue_cell): CommonsenseRNNCell(
          (g_cell): GRUCell(600, 150)
          (p_cell): GRUCell(918, 150)
          (r_cell): GRUCell(1218, 150)
          (i_cell): GRUCell(918, 150)
          (e_cell): GRUCell(750, 450)
          (dropout): Dropout(p=False, inplace=False)
          (dropout1): Dropout(p=False, inplace=False)
          (dropout2): Dropout(p=False, inplace=False)
          (dropout3): Dropout(p=False, inplace=False)
          (dropout4): Dropout(p=False, inplace=False)
          (dropout5): Dropout(p=False, inplace=False)
          (attention): SimpleAttention(
            (scalar): Linear(in_features=150, out_features=1, bias=False)
          )
        )
      )
      (cs_rnn_r): CommonsenseRNN(
        (dropout): Dropout(p=False, inplace=False)
        (dialogue_cell): CommonsenseRNNCell(
          (g_cell): GRUCell(600, 150)
          (p_cell): GRUCell(918, 150)
          (r_cell): GRUCell(1218, 150)
          (i_cell): GRUCell(918, 150)
          (e_cell): GRUCell(750, 450)
          (dropout): Dropout(p=False, inplace=False)
          (dropout1): Dropout(p=False, inplace=False)
          (dropout2): Dropout(p=False, inplace=False)
          (dropout3): Dropout(p=False, inplace=False)
          (dropout4): Dropout(p=False, inplace=False)
          (dropout5): Dropout(p=False, inplace=False)
          (attention): SimpleAttention(
            (scalar): Linear(in_features=150, out_features=1, bias=False)
          )
        )
      )
      (sense_gru): GRU(768, 384, bidirectional=True)
      (matchatt): MatchingAttention(
        (transform): Linear(in_features=900, out_features=900, bias=True)
      )
      (linear): Linear(in_features=900, out_features=300, bias=True)
      (smax_fc): Linear(in_features=300, out_features=7, bias=True)
    )

    Here is how I call summary:

    input_size = [(14, 1, 1024), (14, 1, 1024), (14, 1, 1024), (14, 1, 1024), (14, 1, 768), (14, 1, 768), (14, 1, 768), (14, 1, 768), (14, 1, 768), (14, 1, 9), (1, 14)]
    from torchsummary import summary
    model_summary = summary(model, input_size=input_size)
    print(model_summary)

    Could you please give me some detailed suggestions on how to fix it? Thanks, best wishes.

    opened by Aidenfaustine 3
  • Added OrderedDict as an output datatype

    Some pre-trained models, such as deeplabv3_resnet50, return an OrderedDict containing the semantic mask and the auxiliary loss as two separate Tensors under the keys 'out' and 'aux'.

    torchsummary was not compatible with such models because OrderedDict had not been considered as a potential output type.

    Error thrown: 'collections.OrderedDict' object has no attribute 'size'

    This issue has now been fixed.

    opened by SoumyadeepB 0
Owner
Shubham Chandel
Applied Scientist at @Microsoft working on natural language and code. Previously NYU, @IBM research, @amzn.