Pytorch implementation of Hinton's Dynamic Routing Between Capsules

Overview
Comments
  • softmax dimension seems not right

    Notice that in capsule_layer.py, c_ij is generated by the following equation:

    c_ij = F.softmax(b_ij)

    This applies the softmax over dim 1, which is the input-channel dimension, i.e. it normalizes b_ij over the "i" dimension. According to the original paper we should take the softmax over the "j" dimension instead, so I think it should be the following (minimal change):

    c_ij = F.softmax(b_ij.transpose(1, 2)).transpose(1, 2)
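    For reference, a minimal sketch showing that this transpose workaround matches passing an explicit dim argument, which newer versions of F.softmax accept. The shape below is an assumption about b_ij (roughly 1152 primary capsules routing to 10 digit capsules), not taken from the repo:

        import torch
        import torch.nn.functional as F

        # Assumed routing-logit shape: (1, in_channels, num_units, 1).
        b_ij = torch.randn(1, 1152, 10, 1)

        # Transpose workaround; dim=1 reproduces the old implicit default.
        c_via_transpose = F.softmax(b_ij.transpose(1, 2), dim=1).transpose(1, 2)
        # Equivalent: softmax directly over the output-capsule ("j") dimension.
        c_via_dim = F.softmax(b_ij, dim=2)

        print(torch.allclose(c_via_transpose, c_via_dim))  # True
        print(c_via_dim.sum(dim=2).squeeze()[:3])           # each input capsule's c_ij sums to 1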

    opened by Primus-zhao 6
  • Type Invalid Error

    Hi, I tried to run your code and got an error. Do you know how to fix this? Thanks! :)

    Here is the traceback information:

        yellowtown@Yellowtown:/media/yellowtown/Work/Github/pytorch-capsule$ python3.5 main.py
        Traceback (most recent call last):
          File "main.py", line 61, in <module>
            output_unit_size=output_unit_size).cuda()
          File "/media/yellowtown/Work/Github/pytorch-capsule/capsule_network.py", line 53, in __init__
            self.reconstruct0 = nn.Linear(num_output_units*output_unit_size, (reconstruction_size * 2) / 3)
          File "/home/yellowtown/.local/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 39, in __init__
            self.weight = Parameter(torch.Tensor(out_features, in_features))
        TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (float, int), but expected one of:

    • no arguments
    • (int ...) didn't match because some of the arguments have invalid types: (float, int)
    • (torch.FloatTensor viewed_tensor)
    • (torch.Size size)
    • (torch.FloatStorage data)
    • (Sequence data)
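    This looks like Python 3's true division: (reconstruction_size * 2) / 3 is a float, and nn.Linear needs integer sizes. A minimal sketch of a possible fix, using hypothetical MNIST-like sizes rather than the repo's exact code:

        import torch.nn as nn

        # Hypothetical sizes for illustration only (10 capsules x 16 dims, 28*28 images).
        num_output_units, output_unit_size, reconstruction_size = 10, 16, 784

        # Floor division (or wrapping in int(...)) keeps both arguments integers.
        reconstruct0 = nn.Linear(num_output_units * output_unit_size,
                                 (reconstruction_size * 2) // 3)
        print(reconstruct0)  # Linear (160 -> 522)
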
    opened by yellowtownhz 3
  • TypeError: sum() got an unexpected keyword argument 'keepdim'

    I get an error:

        File "/home/ai/pytorch-capsule-master/capsule_layer.py", line 53, in squash
          mag_sq = torch.sum(s**2, dim=2, keepdim=True)
        TypeError: sum() got an unexpected keyword argument 'keepdim'

    How can I fix this? Should I pip install a specific torch version?

    thanks!
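    The keepdim keyword was only added in a later PyTorch release (around 0.2.0), so the simplest fix is probably to upgrade torch. In case upgrading is not an option, a hedged, version-tolerant sketch of the squash function:

        import torch

        def squash(s, dim=2):
            # Squared length of each capsule vector along `dim`.
            try:
                mag_sq = torch.sum(s ** 2, dim=dim, keepdim=True)
            except TypeError:
                # Older PyTorch has no `keepdim` kwarg; restore the reduced
                # dimension manually if the reduction dropped it.
                mag_sq = torch.sum(s ** 2, dim=dim)
                if mag_sq.dim() < s.dim():
                    mag_sq = mag_sq.unsqueeze(dim)
            mag = torch.sqrt(mag_sq)
            return (mag_sq / (1.0 + mag_sq)) * (s / mag)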

    opened by AI-liu 3
  • Is there something wrong in the reconstruction loss code?

            masked = Variable(torch.zeros(input.size())).cuda()
            masked[:,v_max_index] = input[:,v_max_index]
    

    The version of Python I use is 3.5. In my opinion, v_max_index of shape (batchsize, 1) means that for each sample in the batch there is one max-length vector among the 16 vectors. So for sample 0 there should be one active vector among the 16 vectors, and the other 15 vectors should all be 0. But for

    masked[:,v_max_index] = input[:,v_max_index]

    it assigns, for every sample in the batch, all of the indices contained in v_max_index, which leads to duplicate assignments.

    And the paper says

    During training, we mask out all but the activity vector of the correct digit capsule.

    Figure 2 in the paper shows that, for each sample, 15 of the 16 vectors are masked.
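
    For illustration, a minimal sketch of per-sample masking that pairs each row with its own winning index. The shapes and names are assumptions (10 digit capsules of 16 dimensions, following the paper rather than this exact snippet), and it assumes a recent PyTorch:

        import torch

        # Toy shapes: 4 samples, 10 digit capsules of 16 dims each.
        batch_size, num_capsules = 4, 10
        input = torch.rand(batch_size, num_capsules, 16)

        # Index of the longest capsule for each sample, shape (batch_size,).
        v_max_index = input.norm(dim=2).argmax(dim=1)

        # Pair each row with its own winning index, so every sample keeps only
        # its own active capsule instead of all indices in v_max_index.
        rows = torch.arange(batch_size)
        masked = torch.zeros_like(input)
        masked[rows, v_max_index] = input[rows, v_max_index]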

    opened by AlexHex7 2
  • Code on SVHN dataset

    I am trying to use your code on the SVHN dataset, but I got the following error. Can you please help sort out the problem? Thank you in advance.

    ####Data Loader###

        train_dataset = datasets.SVHN('/media/user/DATA/New_CODE/pytorch-capsule/SVHN', download=True, transform=transform, split='train')
        test_dataset = datasets.SVHN('/media/user/DATA/New_CODE/pytorch-capsule/SVHN', download=True, transform=transform, split='test')

    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=test_batch_size, shuffle=True)

        CapsuleNetwork (
          (conv1): CapsuleConvLayer (
            (conv0): Conv2d(1, 256, kernel_size=(9, 9), stride=(1, 1))
            (relu): ReLU (inplace)
          )
          (primary): CapsuleLayer (
            (unit_0): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_1): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_2): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_3): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_4): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_5): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_6): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
            (unit_7): ConvUnit ( (conv0): Conv2d(256, 32, kernel_size=(9, 9), stride=(2, 2)) )
          )
          (digits): CapsuleLayer ( )
          (reconstruct0): Linear (160 -> 522)
          (reconstruct1): Linear (522 -> 1176)
          (reconstruct2): Linear (1176 -> 784)
          (relu): ReLU (inplace)
          (sigmoid): Sigmoid ()
        )

        Traceback (most recent call last):
          File "main.py", line 195, in <module>
            last_loss = train(epoch)
          File "main.py", line 171, in train
            output = network(data)
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
            result = self.forward(*input, **kwargs)
          File "/media/mahfuj/DATA/New_CODE/pytorch-capsule/capsule_network.py", line 61, in forward
            return self.digits(self.primary(self.conv1(x)))
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
            result = self.forward(*input, **kwargs)
          File "/media/mahfuj/DATA/New_CODE/pytorch-capsule/capsule_conv_layer.py", line 27, in forward
            return self.relu(self.conv0(x))
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
            result = self.forward(*input, **kwargs)
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 254, in forward
            self.padding, self.dilation, self.groups)
          File "/home/mahfuj/pytorch_python3/lib/python3.5/site-packages/torch/nn/functional.py", line 52, in conv2d
            return f(input, weight, bias)
        RuntimeError: Need input.size[1] == 1 but got 3 instead.
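
    The RuntimeError is because the first Conv2d was built for single-channel MNIST input, while SVHN images are 3-channel 32x32. A hedged sketch of one workaround (assuming a torchvision version that provides Grayscale and Resize); the alternative is to change the first Conv2d, and the reconstruction sizes, to accept 3x32x32 inputs:

        import torchvision.transforms as transforms
        from torchvision import datasets

        # Convert SVHN's 3-channel 32x32 images to single-channel 28x28 so they
        # match a network built for MNIST-shaped input.
        transform = transforms.Compose([
            transforms.Grayscale(num_output_channels=1),
            transforms.Resize((28, 28)),
            transforms.ToTensor(),
        ])

        train_dataset = datasets.SVHN('./SVHN', split='train', download=True, transform=transform)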

    opened by redhat12345 1
  • can you realize the reconstruction?

    I tried this code and got a test loss of 0.002. Accuracy was 99%, but the reconstructed images were wrong. I guess there may be something wrong with the reconstruction code.

    Originally posted by @jjprincess in https://github.com/timomernick/pytorch-capsule/issues/10#issuecomment-383916175

    opened by JiaQiao111 0
  • Softmax in routing algorithm incorrect?

    Hi, I think the softmax in the routing algorithm is being calculated over the wrong dimension.

    Currently the code has:

            # Initialize routing logits to zero.
            b_ij = Variable(torch.zeros(1, self.in_channels, self.num_units, 1)).cuda()
    
            # Iterative routing.
            num_iterations = 3
            for iteration in range(num_iterations):
                # Convert routing logits to softmax.
                # (batch, features, num_units, 1, 1)
                c_ij = F.softmax(b_ij)
    

    Since the dim parameter is not passed to the F.softmax call, it will default to dim=1 and compute the softmax over the self.in_channels dimension (1152 here). Instead, the softmax should be computed so that the c_ij between each input capsule and all the capsules in the next layer sum to 1.

    Thus the correct call should be:

               c_ij = F.softmax(b_ij, dim=2)
    
    opened by geefer 3
  • something wrong in your code

    First of all, thank you for your code.

    But as I read your source code, I think there may be errors in your squash function. Problem 1: from your README file, I read the TensorFlow source code:

    Squashing function corresponding to Eq. 1
        Args:
            vector: A tensor with shape [batch_size, 1, num_caps, vec_len, 1] or [batch_size, num_caps, vec_len, 1].
        Returns:
            A tensor with the same shape as vector but squashed in 'vec_len' dimension.
    

    According to that comment, we squash along the vec_len dimension. But in your code:

    def squash(s):
        # This is equation 1 from the paper.
        mag_sq = torch.sum(s**2, dim=2, keepdim=True)
        mag = torch.sqrt(mag_sq)
        s = (mag_sq / (1.0 + mag_sq)) * (s / mag)
        return s
    

    Because you have not written a comment there, we can only look at the call site:

    # Flatten to (batch, unit, output).
    u = u.view(x.size(0), self.num_units, -1)
    # Return squashed outputs.
    return CapsuleLayer.squash(u)
    

    It is easy to see that we should do the squashing along dim=1, not dim=2.
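
    To make that concrete, a minimal sketch of a squash that takes the dimension as an argument, based on the shapes quoted above rather than the repo's exact code:

        import torch

        def squash(s, dim):
            # Equation 1 from the paper, applied along `dim`, the capsule-vector axis.
            mag_sq = torch.sum(s ** 2, dim=dim, keepdim=True)
            mag = torch.sqrt(mag_sq)
            return (mag_sq / (1.0 + mag_sq)) * (s / mag)

        # With u of shape (batch, 8, 1152) as described above, the 8-D capsule
        # vectors live along dim=1, so that is the dimension to squash.
        u = torch.rand(4, 8, 1152)
        v = squash(u, dim=1)
        print(v.norm(dim=1).max())  # every squashed vector has length < 1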

    Problem2:

    # (batch, features, in_units) -> (batch, features, num_units, in_units, 1)
    x = torch.stack([x] * self.num_units, dim=2).unsqueeze(4)
    
    # (batch, features, in_units, unit_size, num_units)
    W = torch.cat([self.W] * batch_size, dim=0)
    
    # Transform inputs by weight matrix.
    # (batch_size, features, num_units, unit_size, 1)
    u_hat = torch.matmul(W, x)
    

    How can x with shape (batch, features, num_units, in_units, 1) and W with shape (batch, features, in_units, unit_size, num_units) be multiplied with matmul?
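
    For context, torch.matmul treats the last two dimensions as the matrix dimensions and broadcasts the leading ones, so the shapes in those comments would not line up as written; presumably W's trailing dimensions are really (unit_size, in_units). A small sketch of the shape rule (the sizes are illustrative, not the repo's):

        import torch

        # torch.matmul multiplies the last two dims and broadcasts the rest:
        # (..., n, m) @ (..., m, p) -> (..., n, p)
        x = torch.rand(4, 1152, 10, 8, 1)    # (batch, features, num_units, in_units, 1)
        W = torch.rand(1, 1152, 10, 16, 8)   # (1, features, num_units, unit_size, in_units)
        u_hat = torch.matmul(W, x)
        print(u_hat.shape)                   # torch.Size([4, 1152, 10, 16, 1])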

    I have not been able to run your code successfully (because of the data), so I do not know whether this is right.

    Best Wishes!!

    opened by selous123 3
  • The mask operation is different between training and testing?

    In the paper, there is a sentence

    During training, we mask out all but the activity vector of the correct digit capsule.

    So I think it should mask all but the capsule (a 1x16 vector) that matches the ground truth during training. The current code implements the testing-time behaviour: it masks all but the longest capsule (1x16 vector).
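
    For illustration, a hedged sketch of that distinction (digit_caps and target are assumed names, not the repo's variables): pick the ground-truth capsule during training and the longest capsule at test time, then mask out the rest.

        import torch

        def mask_capsules(digit_caps, target=None):
            # digit_caps: (batch, num_classes, capsule_dim); target: (batch,) integer labels.
            if target is not None:
                keep = target                                # training: ground-truth capsule
            else:
                keep = digit_caps.norm(dim=2).argmax(dim=1)  # testing: longest capsule
            rows = torch.arange(digit_caps.size(0))
            masked = torch.zeros_like(digit_caps)
            masked[rows, keep] = digit_caps[rows, keep]
            return masked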

    opened by AlexHex7 3
  • Bugfix for the reconstruction part, the squash function, and the margin & reconstruction loss functions

    #2 Gets 98% accuracy in the 1st epoch now. The original code is great and neat, thanks a lot :) But I found some bugs.

    • fixbug1: Use the correct (ground-truth) capsule to reconstruct the input image rather than the longest capsule.
    • fixbug2: Make the squash function squash the right dimension (unit_size 8) in the primary capsule layer, i.e. CapsuleLayer.squash(u, dim=1); it has a great impact on model accuracy. (The digit-layer squash dimension also seems wrong, and the weird thing is that if I squash the right dim there, which is the capsule size 16, the model can't be trained correctly.)
    • fixbug3: The margin loss function lacks a square term (see the sketch after this list).
    • fixbug4: The reconstruction loss should minimize the sum of squared differences instead of the mean squared difference (as the capsule paper says).
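
    For reference, a hedged sketch of the two loss terms as described in the paper (squared margin terms with m+ = 0.9, m- = 0.1, lambda = 0.5, and a scaled sum-of-squares reconstruction loss); the function and argument names are assumptions, not this PR's code:

        import torch

        def margin_loss(v, target_one_hot, m_pos=0.9, m_neg=0.1, lam=0.5):
            # v: (batch, num_classes, capsule_dim); ||v_k|| acts as the class probability.
            v_len = v.norm(dim=2)
            pos = target_one_hot * torch.clamp(m_pos - v_len, min=0.0) ** 2
            neg = lam * (1.0 - target_one_hot) * torch.clamp(v_len - m_neg, min=0.0) ** 2
            return (pos + neg).sum(dim=1).mean()

        def reconstruction_loss(reconstruction, image, scale=0.0005):
            # Sum of squared differences per image, scaled down as in the paper.
            diff = reconstruction - image.view(image.size(0), -1)
            return scale * (diff ** 2).sum(dim=1).mean()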

    opened by JaveyWang 2
Owner
Tim Omernick
UAV-Networks-Routing is a Python simulator for experimenting routing algorithms and mac protocols on unmanned aerial vehicle networks.

UAV-Networks Simulator - Autonomous Networking - A.A. 20/21 UAV-Networks-Routing is a Python simulator for experimenting routing algorithms and mac pr

null 0 Nov 13, 2021
Re-implementation of the vector capsule with dynamic routing

VectorCapsule Re-implementation of the vector capsule with dynamic routing We implement the vector capsule and dynamic routing via graph neural networ

ZhenchaoTang 10 Feb 10, 2022
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"

Topographic Variational Autoencoder Paper: https://arxiv.org/abs/2109.01394 Getting Started Install requirements with Anaconda: conda env create -f en

T. Andy Keller 69 Dec 12, 2022
Pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering".

TRAnsformer Routing Networks (TRAR) This is an official implementation for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visu

Ren Tianhe 49 Nov 10, 2022
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

ERASOR (RA-L'21 with ICRA Option) Official page of "ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point C

Hyungtae Lim 225 Dec 29, 2022
Reimplementation of the paper "Attention, Learn to Solve Routing Problems!" in jax/flax.

JAX + Attention Learn To Solve Routing Problems Reimplementation of the paper Attention, Learn to Solve Routing Problems! using Jax and Flax. Fully su

Gabriela Surita 7 Dec 1, 2022
Dynamic View Synthesis from Dynamic Monocular Video

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a

Intelligent Systems Lab Org 2.3k Jan 1, 2023
Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

Chen Gao 139 Dec 28, 2022
Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data

Dynamic VAE frame Automatic feature extraction can be achieved by probability di

null 10 Oct 7, 2022
Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yu

UT-Austin Robot Perception and Learning Lab 63 Jan 3, 2023
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Transparency-by-Design networks (TbD-nets) This repository contains code for replicating the experiments and visualizations from the paper Transparenc

David Mascharka 351 Nov 18, 2022
Pytorch re-implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition (CVPR 2022)

SwinTextSpotter This is the pytorch implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text R

mxin262 183 Jan 3, 2023
PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Neural Scene Flow Fields PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 20

Zhengqi Li 585 Jan 4, 2023
This repo contains the pytorch implementation for Dynamic Concept Learner (accepted by ICLR 2021).

DCL-PyTorch Pytorch implementation for the Dynamic Concept Learner (DCL). More details can be found at the project page. Framework Grounding Physical

Zhenfang Chen 31 Jan 6, 2023
[TOG 2021] PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling.

This repository contains the official PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling. We propose a SofGAN image generator to decouple the latent space of portraits into two subspaces: a geometry space and a texture space. Experiments on SofGAN show that our system can generate high quality portrait images with independently controllable geometry and texture attributes.

Anpei Chen 694 Dec 23, 2022
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S

Ken Lin 17 Oct 12, 2022
Unofficial pytorch implementation of the paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution"

DFSA Unofficial pytorch implementation of the ICCV 2021 paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution" (p

null 2 Nov 15, 2021
Pytorch implementation of various High Dynamic Range (HDR) Imaging algorithms

Deep High Dynamic Range Imaging Benchmark This repository is the pytorch impleme

Tianhong Dai 5 Nov 16, 2022
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Microsoft 5.7k Jan 9, 2023