Pytorch implementation of Hinton's Dynamic Routing Between Capsules

Overview
Comments
  • softmax dimension seems not right

    Notice that in capsule_layer.py, c_ij is generated by the following equation:

    c_ij = F.softmax(b_ij)

    This applies the softmax over dim 1, which is the input-channel dimension, i.e. it normalizes b_ij over the "i" dimension. According to the original paper we should take the softmax over the "j" dimension instead, so I think it should be the following (minimal change):

    c_ij = F.softmax(b_ij.transpose(1, 2)).transpose(1, 2)
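    For reference, a minimal sketch showing that this transpose workaround matches passing an explicit dim argument, which newer versions of F.softmax accept. The shape below is an assumption about b_ij (roughly 1152 primary capsules routing to 10 digit capsules), not taken from the repo:

        import torch
        import torch.nn.functional as F

        # Assumed routing-logit shape: (1, in_channels, num_units, 1).
        b_ij = torch.randn(1, 1152, 10, 1)

        # Transpose workaround; dim=1 reproduces the old implicit default.
        c_via_transpose = F.softmax(b_ij.transpose(1, 2), dim=1).transpose(1, 2)
        # Equivalent: softmax directly over the output-capsule ("j") dimension.
        c_via_dim = F.softmax(b_ij, dim=2)

        print(torch.allclose(c_via_transpose, c_via_dim))  # True
        print(c_via_dim.sum(dim=2).squeeze()[:3])           # each input capsule's c_ij sums to 1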

    opened by Primus-zhao 6
  • Type Invalid Error

    Hi, I tried to run your code and got an error. Do you know how to fix this? Thanks! :)

    Here is the traceback information:

        yellowtown@Yellowtown:/media/yellowtown/Work/Github/pytorch-capsule$ python3.5 main.py
        Traceback (most recent call last):
          File "main.py", line 61, in <module>
            output_unit_size=output_unit_size).cuda()
          File "/media/yellowtown/Work/Github/pytorch-capsule/capsule_network.py", line 53, in __init__
            self.reconstruct0 = nn.Linear(num_output_units*output_unit_size, (reconstruction_size * 2) / 3)
          File "/home/yellowtown/.local/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 39, in __init__
            self.weight = Parameter(torch.Tensor(out_features, in_features))
        TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (float, int), but expected one of:

    • no arguments
    • (int ...) didn't match because some of the arguments have invalid types: (float, int)
    • (torch.FloatTensor viewed_tensor)
    • (torch.Size size)
    • (torch.FloatStorage data)
    • (Sequence data)
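    This looks like Python 3's true division: (reconstruction_size * 2) / 3 is a float, and nn.Linear needs integer sizes. A minimal sketch of a possible fix, using hypothetical MNIST-like sizes rather than the repo's exact code:

        import torch.nn as nn

        # Hypothetical sizes for illustration only (10 capsules x 16 dims, 28*28 images).
        num_output_units, output_unit_size, reconstruction_size = 10, 16, 784

        # Floor division (or wrapping in int(...)) keeps both arguments integers.
        reconstruct0 = nn.Linear(num_output_units * output_unit_size,
                                 (reconstruction_size * 2) // 3)
        print(reconstruct0)  # Linear (160 -> 522)
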
    opened by yellowtownhz 3
  • TypeError: sum() got an unexpected keyword argument 'keepdim'

    I get an error:

        File "/home/ai/pytorch-capsule-master/capsule_layer.py", line 53, in squash
          mag_sq = torch.sum(s**2, dim=2, keepdim=True)
        TypeError: sum() got an unexpected keyword argument 'keepdim'

    How can I fix this? Should I pip install a specific torch version?

    thanks!
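    The keepdim keyword was only added in a later PyTorch release (around 0.2.0), so the simplest fix is probably to upgrade torch. In case upgrading is not an option, a hedged, version-tolerant sketch of the squash function:

        import torch

        def squash(s, dim=2):
            # Squared length of each capsule vector along `dim`.
            try:
                mag_sq = torch.sum(s ** 2, dim=dim, keepdim=True)
            except TypeError:
                # Older PyTorch has no `keepdim` kwarg; restore the reduced
                # dimension manually if the reduction dropped it.
                mag_sq = torch.sum(s ** 2, dim=dim)
                if mag_sq.dim() < s.dim():
                    mag_sq = mag_sq.unsqueeze(dim)
            mag = torch.sqrt(mag_sq)
            return (mag_sq / (1.0 + mag_sq)) * (s / mag)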

    opened by AI-liu 3
  • Is there something wrong in the reconstruction loss code?

            masked = Variable(torch.zeros(input.size())).cuda()
            masked[:,v_max_index] = input[:,v_max_index]
    

    The version of Python I use is 3.5. In my opinion, v_max_index of shape (batchsize, 1) means that for each sample in the batch there is one max-length vector among the 16 vectors. So for sample 0 there should be one active vector among the 16 vectors, and the other 15 vectors should all be 0. But for

    masked[:,v_max_index] = input[:,v_max_index]

    it assigns, for every sample in the batch, all of the indices contained in v_max_index, which leads to duplicate assignments.

    And the paper says

    During training, we mask out all but the activity vector of the correct digit capsule.

    Figure 2 in the paper shows that, for each sample, 15 of the 16 vectors are masked.
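
    For illustration, a minimal sketch of per-sample masking that pairs each row with its own winning index. The shapes and names are assumptions (10 digit capsules of 16 dimensions, following the paper rather than this exact snippet), and it assumes a recent PyTorch:

        import torch

        # Toy shapes: 4 samples, 10 digit capsules of 16 dims each.
        batch_size, num_capsules = 4, 10
        input = torch.rand(batch_size, num_capsules, 16)

        # Index of the longest capsule for each sample, shape (batch_size,).
        v_max_index = input.norm(dim=2).argmax(dim=1)

        # Pair each row with its own winning index, so every sample keeps only
        # its own active capsule instead of all indices in v_max_index.
        rows = torch.arange(batch_size)
        masked = torch.zeros_like(input)
        masked[rows, v_max_index] = input[rows, v_max_index]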

    opened by AlexHex7 2
  • Code on SVHN dataset

    I am trying to use your code on the SVHN dataset, but I got the following error. Can you please help sort out the problem? Thank you in advance.

    ####Data Loader###

        train_dataset = datasets.SVHN('/media/user/DATA/New_CODE/pytorch-capsule/SVHN', download=True, transform=transform, split='train')
        test_dataset = datasets.SVHN('/media/user/DATA/New_CODE/pytorch-capsule/SVHN', download=True, transform=transform, split='test')

    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=test_batch_size, shuffle=True)

        CapsuleNetwork (
          (conv1): CapsuleConvLayer (
            (conv0): Conv2d(1, 256, kernel_size=(9, 9), stride=(1, 1))
            (relu): ReLU (inplace)
          )
          (primary): CapsuleLayer (
            (unit_0): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_1): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_2): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_3): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_4): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_5): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_6): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_7): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
          )
          (digits): CapsuleLayer ( )
          (reconstruct0): Linear (160 -> 522)
          (reconstruct1): Linear (522 -> 1176)
          (reconstruct2): Linear (1176 -> 784)
          (relu): ReLU (inplace)
          (sigmoid): Sigmoid ()
        )

        Traceback (most recent call last):
          File "main.py", line 195, in <module>
            last_loss = train(epoch)
          File "main.py", line 171, in train
            output = network(data)
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
            result = self.forward(*input, **kwargs)
          File "/media/mahfuj/DATA/New_CODE/pytorch-capsule/capsule_network.py", line 61, in forward
            return self.digits(self.primary(self.conv1(x)))
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
            result = self.forward(*input, **kwargs)
          File "/media/mahfuj/DATA/New_CODE/pytorch-capsule/capsule_conv_layer.py", line 27, in forward
            return self.relu(self.conv0(x))
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
            result = self.forward(*input, **kwargs)
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 254, in forward
            self.padding, self.dilation, self.groups)
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/functional.py", line 52, in conv2d
            return f(input, weight, bias)
        RuntimeError: Need input.size[1] == 1 but got 3 instead.
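
    The RuntimeError is because the first Conv2d was built for single-channel MNIST input, while SVHN images are 3-channel 32x32. A hedged sketch of one workaround (assuming a torchvision version that provides Grayscale and Resize); the alternative is to change the first Conv2d, and the reconstruction sizes, to accept 3x32x32 inputs:

        import torchvision.transforms as transforms
        from torchvision import datasets

        # Convert SVHN's 3-channel 32x32 images to single-channel 28x28 so they
        # match a network built for MNIST-shaped input.
        transform = transforms.Compose([
            transforms.Grayscale(num_output_channels=1),
            transforms.Resize((28, 28)),
            transforms.ToTensor(),
        ])

        train_dataset = datasets.SVHN('./SVHN', split='train', download=True, transform=transform)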

    opened by redhat12345 1
  • can you realize the reconstruction?

    I tried this code and got a test loss of 0.002. Accuracy was 99%, but the reconstructed images were wrong. I guess there may be something wrong with the reconstruction code.

    Originally posted by @jjprincess in https://github.com/timomernick/pytorch-capsule/issues/10#issuecomment-383916175

    opened by JiaQiao111 0
  • Softmax in routing algorithm incorrect?

    Hi, I think the softmax in the routing algorithm is being calculated over the wrong dimension.

    Currently the code has:

            # Initialize routing logits to zero.
            b_ij = Variable(torch.zeros(1, self.in_channels, self.num_units, 1)).cuda()
    
            # Iterative routing.
            num_iterations = 3
            for iteration in range(num_iterations):
                # Convert routing logits to softmax.
                # (batch, features, num_units, 1, 1)
                c_ij = F.softmax(b_ij)
    

    Since the dim parameter is not passed to the F.softmax call, it will default to dim=1 and compute the softmax over the self.in_channels dimension (1152 here). Instead, the softmax should be computed so that the c_ij between each input capsule and all the capsules in the next layer sum to 1.

    Thus the correct call should be:

               c_ij = F.softmax(b_ij, dim=2)
    
    opened by geefer 3
  • something wrong in your code

    First of all, thank you for your code.

    But as I read your source code, I think there may be errors in your squash function. Problem 1: from your README file, I read the TensorFlow source code:

    Squashing function corresponding to Eq. 1
        Args:
            vector: A tensor with shape [batch_size, 1, num_caps, vec_len, 1] or [batch_size, num_caps, vec_len, 1].
        Returns:
            A tensor with the same shape as vector but squashed in 'vec_len' dimension.
    

    According to that comment, we squash along the vec_len dimension. But in your code:

    def squash(s):
        # This is equation 1 from the paper.
        mag_sq = torch.sum(s**2, dim=2, keepdim=True)
        mag = torch.sqrt(mag_sq)
        s = (mag_sq / (1.0 + mag_sq)) * (s / mag)
        return s
    

    Because you have not written a comment there, we can only look at the call site:

    # Flatten to (batch, unit, output).
    u = u.view(x.size(0), self.num_units, -1)
    # Return squashed outputs.
    return CapsuleLayer.squash(u)
    

    It is easy to see that we should do the squashing along dim=1, not dim=2.
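
    To make that concrete, a minimal sketch of a squash that takes the dimension as an argument, based on the shapes quoted above rather than the repo's exact code:

        import torch

        def squash(s, dim):
            # Equation 1 from the paper, applied along `dim`, the capsule-vector axis.
            mag_sq = torch.sum(s ** 2, dim=dim, keepdim=True)
            mag = torch.sqrt(mag_sq)
            return (mag_sq / (1.0 + mag_sq)) * (s / mag)

        # With u of shape (batch, 8, 1152) as described above, the 8-D capsule
        # vectors live along dim=1, so that is the dimension to squash.
        u = torch.rand(4, 8, 1152)
        v = squash(u, dim=1)
        print(v.norm(dim=1).max())  # every squashed vector has length < 1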

    Problem2:

    # (batch, features, in_units) -> (batch, features, num_units, in_units, 1)
    x = torch.stack([x] * self.num_units, dim=2).unsqueeze(4)
    
    # (batch, features, in_units, unit_size, num_units)
    W = torch.cat([self.W] * batch_size, dim=0)
    
    # Transform inputs by weight matrix.
    # (batch_size, features, num_units, unit_size, 1)
    u_hat = torch.matmul(W, x)
    

    How can x with shape (batch, features, num_units, in_units, 1) and W with shape (batch, features, in_units, unit_size, num_units) be multiplied with matmul?
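
    For context, torch.matmul treats the last two dimensions as the matrix dimensions and broadcasts the leading ones, so the shapes in those comments would not line up as written; presumably W's trailing dimensions are really (unit_size, in_units). A small sketch of the shape rule (the sizes are illustrative, not the repo's):

        import torch

        # torch.matmul multiplies the last two dims and broadcasts the rest:
        # (..., n, m) @ (..., m, p) -> (..., n, p)
        x = torch.rand(4, 1152, 10, 8, 1)    # (batch, features, num_units, in_units, 1)
        W = torch.rand(1, 1152, 10, 16, 8)   # (1, features, num_units, unit_size, in_units)
        u_hat = torch.matmul(W, x)
        print(u_hat.shape)                   # torch.Size([4, 1152, 10, 16, 1])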

    I have not been able to run your code successfully (because of the data), so I do not know whether this is right.

    Best Wishes!!

    opened by selous123 3
  • The mask operation is different between training and testing?

    In the paper, there is a sentence

    During training, we mask out all but the activity vector of the correct digit capsule.

    So I think it should mask all but the capsule (a 1x16 vector) that matches the ground truth during training. The current code implements the testing-time behaviour: it masks all but the longest capsule (1x16 vector).
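
    For illustration, a hedged sketch of that distinction (digit_caps and target are assumed names, not the repo's variables): pick the ground-truth capsule during training and the longest capsule at test time, then mask out the rest.

        import torch

        def mask_capsules(digit_caps, target=None):
            # digit_caps: (batch, num_classes, capsule_dim); target: (batch,) integer labels.
            if target is not None:
                keep = target                                # training: ground-truth capsule
            else:
                keep = digit_caps.norm(dim=2).argmax(dim=1)  # testing: longest capsule
            rows = torch.arange(digit_caps.size(0))
            masked = torch.zeros_like(digit_caps)
            masked[rows, keep] = digit_caps[rows, keep]
            return masked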

    opened by AlexHex7 3
  • Bugfix for the reconstruction part, the squash function, and the margin & reconstruction loss functions

    #2 Gets 98% accuracy in the 1st epoch now. The original code is great and neat, thanks a lot :) But I found some bugs.

    • fixbug1: Use the correct (ground-truth) capsule to reconstruct the input image rather than the longest capsule.
    • fixbug2: Make the squash function squash the right dimension (unit_size 8) in the primary capsule layer, i.e. CapsuleLayer.squash(u, dim=1); it has a great impact on model accuracy. (The digit-layer squash dimension also seems wrong, and the weird thing is that if I squash the right dim there, which is the capsule size 16, the model can't be trained correctly.)
    • fixbug3: The margin loss function lacks a square term (see the sketch after this list).
    • fixbug4: The reconstruction loss should minimize the sum of squared differences instead of the mean squared difference (as the capsule paper says).
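
    For reference, a hedged sketch of the two loss terms as described in the paper (squared margin terms with m+ = 0.9, m- = 0.1, lambda = 0.5, and a scaled sum-of-squares reconstruction loss); the function and argument names are assumptions, not this PR's code:

        import torch

        def margin_loss(v, target_one_hot, m_pos=0.9, m_neg=0.1, lam=0.5):
            # v: (batch, num_classes, capsule_dim); ||v_k|| acts as the class probability.
            v_len = v.norm(dim=2)
            pos = target_one_hot * torch.clamp(m_pos - v_len, min=0.0) ** 2
            neg = lam * (1.0 - target_one_hot) * torch.clamp(v_len - m_neg, min=0.0) ** 2
            return (pos + neg).sum(dim=1).mean()

        def reconstruction_loss(reconstruction, image, scale=0.0005):
            # Sum of squared differences per image, scaled down as in the paper.
            diff = reconstruction - image.view(image.size(0), -1)
            return scale * (diff ** 2).sum(dim=1).mean()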

    opened by JaveyWang 2
Owner
Tim Omernick
UAV-Networks-Routing is a Python simulator for experimenting routing algorithms and mac protocols on unmanned aerial vehicle networks.

UAV-Networks Simulator - Autonomous Networking - A.A. 20/21 UAV-Networks-Routing is a Python simulator for experimenting routing algorithms and mac pr

null 0 Nov 13, 2021
Re-implementation of the vector capsule with dynamic routing

VectorCapsule Re-implementation of the vector capsule with dynamic routing We implement the vector capsule and dynamic routing via graph neural networ

ZhenchaoTang 10 Feb 10, 2022
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"

Topographic Variational Autoencoder Paper: https://arxiv.org/abs/2109.01394 Getting Started Install requirements with Anaconda: conda env create -f en

T. Andy Keller 69 Dec 12, 2022
Pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering".

TRAnsformer Routing Networks (TRAR) This is an official implementation for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visu

Ren Tianhe 49 Nov 10, 2022
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

ERASOR (RA-L'21 with ICRA Option) Official page of "ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point C

Hyungtae Lim 225 Dec 29, 2022
Reimplementation of the paper "Attention, Learn to Solve Routing Problems!" in jax/flax.

JAX + Attention Learn To Solve Routing Problems Reimplementation of the paper Attention, Learn to Solve Routing Problems! using Jax and Flax. Fully su

Gabriela Surita 7 Dec 1, 2022
Dynamic View Synthesis from Dynamic Monocular Video

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a

Intelligent Systems Lab Org 2.3k Jan 1, 2023
Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

Chen Gao 139 Dec 28, 2022
Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data

Dynamic VAE frame Automatic feature extraction can be achieved by probability di

null 10 Oct 7, 2022
Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yu

UT-Austin Robot Perception and Learning Lab 63 Jan 3, 2023
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Transparency-by-Design networks (TbD-nets) This repository contains code for replicating the experiments and visualizations from the paper Transparenc

David Mascharka 351 Nov 18, 2022
Pytorch re-implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition (CVPR 2022)

SwinTextSpotter This is the pytorch implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text R

mxin262 183 Jan 3, 2023
PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Neural Scene Flow Fields PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 20

Zhengqi Li 585 Jan 4, 2023
This repo contains the pytorch implementation for Dynamic Concept Learner (accepted by ICLR 2021).

DCL-PyTorch Pytorch implementation for the Dynamic Concept Learner (DCL). More details can be found at the project page. Framework Grounding Physical

Zhenfang Chen 31 Jan 6, 2023
[TOG 2021] PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling.

This repository contains the official PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling. We propose a SofGAN image generator to decouple the latent space of portraits into two subspaces: a geometry space and a texture space. Experiments on SofGAN show that our system can generate high quality portrait images with independently controllable geometry and texture attributes.

Anpei Chen 694 Dec 23, 2022
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S

Ken Lin 17 Oct 12, 2022
Unofficial pytorch implementation of the paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution"

DFSA Unofficial pytorch implementation of the ICCV 2021 paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution" (p

null 2 Nov 15, 2021
Pytorch implementation of various High Dynamic Range (HDR) Imaging algorithms

Deep High Dynamic Range Imaging Benchmark This repository is the pytorch impleme

Tianhong Dai 5 Nov 16, 2022
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Microsoft 5.7k Jan 9, 2023