PyTorch implementation of convolutional neural network visualization techniques

Overview

Convolutional Neural Network Visualizations

This repository contains a number of convolutional neural network visualization techniques implemented in PyTorch.

Note: I removed the cv2 dependencies and moved the repository to PIL. A few things might be broken (although I tested all methods); I would appreciate it if you could create an issue when something does not work.

Note: The code in this repository was tested with torch version 0.4.1 and some of the functions may not work as intended in later versions. Although it shouldn't be too much of an effort to make it work, I have no plans at the moment to make the code in this repository compatible with the latest version because I'm still using 0.4.1.

Implemented Techniques

General Information

Depending on the technique, the code uses a pretrained AlexNet or VGG from the model zoo. Some of the code also assumes that the layers in the model are separated into two sections: features, which contains the convolutional layers, and classifier, which contains the fully connected layers (applied after flattening the convolutional output). If you want to port this code to a model that does not have this separation, you need to edit the parts that call model.features and model.classifier.
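For a model without the features/classifier split, one possible approach is to capture the output of the target layer with a forward hook instead of looping over model.features. The sketch below is only an illustration and not the code used in this repository: TinyNet and forward_with_activation are hypothetical names, and the model is a small random network rather than a pretrained one.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical model with no `features`/`classifier` split."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(4, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 5)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = self.pool(x).flatten(1)
        return self.fc(x)


def forward_with_activation(model, layer, x):
    """Run a full forward pass and return (layer output, model output)."""
    captured = {}

    def hook(module, inputs, output):
        # Store the output of the hooked layer for later use.
        captured["activation"] = output

    handle = layer.register_forward_hook(hook)
    try:
        model_output = model(x)
    finally:
        handle.remove()  # always detach the hook so it does not leak
    return captured["activation"], model_output


model = TinyNet().eval()
image = torch.randn(1, 3, 16, 16)
conv_output, model_output = forward_with_activation(model, model.conv2, image)
print(conv_output.shape, model_output.shape)
```

Because the model runs its own forward function, this works even when flattening or concatenation happens only inside forward.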

Every technique has its own Python file (e.g. gradcam.py), which I hope will make things easier to understand. misc_functions.py contains functions for image processing and image recreation that are shared by the implemented techniques.

All images are pre-processed with the mean and std of the ImageNet dataset before being fed to the model. None of the code uses the GPU, as these operations are quite fast for a single image (except for deep dream, because the example image used for it is huge). You can make use of a GPU with very little effort. The example pictures below include numbers in brackets after the description, like Mastiff (243); this number is the class id in the ImageNet dataset.
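As a rough illustration of this pre-processing step, the sketch below normalizes an RGB image with per-channel statistics and then inverts the operation. The mean and std values are the ones commonly used with torchvision models and are an assumption here, as are the function names preprocess and recreate; the actual helpers live in misc_functions.py.

```python
import numpy as np

# Per-channel ImageNet statistics (RGB order); assumed values, commonly used with torchvision models.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_hwc_uint8):
    """Convert an (H, W, 3) uint8 RGB array to a normalized (1, 3, H, W) float array."""
    scaled = image_hwc_uint8.astype(np.float32) / 255.0
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    return normalized.transpose(2, 0, 1)[np.newaxis]


def recreate(batch_chw):
    """Invert `preprocess`, clipping to [0, 1] before converting back to uint8."""
    image = batch_chw[0].transpose(1, 2, 0) * IMAGENET_STD + IMAGENET_MEAN
    return np.uint8(np.round(np.clip(image, 0.0, 1.0) * 255))


rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)  # random stand-in image
batch = preprocess(original)
restored = recreate(batch)
print(batch.shape, restored.shape)
```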

I tried to comment the code as much as possible. If you have any issues understanding it or porting it, don't hesitate to send an email or create an issue.

Below are some sample results for each operation.

Gradient Visualization

Target class: King Snake (56) | Target class: Mastiff (243) | Target class: Spider (72)
Original Image
Colored Vanilla Backpropagation
Vanilla Backpropagation Saliency
Colored Guided Backpropagation (GB)
Guided Backpropagation Saliency (GB)
Guided Backpropagation Negative Saliency (GB)
Guided Backpropagation Positive Saliency (GB)
Gradient-weighted Class Activation Map (Grad-CAM)
Gradient-weighted Class Activation Heatmap (Grad-CAM)
Gradient-weighted Class Activation Heatmap on Image (Grad-CAM)
Score-weighted Class Activation Map (Score-CAM)
Score-weighted Class Activation Heatmap (Score-CAM)
Score-weighted Class Activation Heatmap on Image (Score-CAM)
Colored Guided Gradient-weighted Class Activation Map (Guided-Grad-CAM)
Guided Gradient-weighted Class Activation Map Saliency (Guided-Grad-CAM)
Integrated Gradients (without image multiplication)

Hierarchical Gradient Visualization

LayerCAM [16] is a simple modification of Grad-CAM [3], which can generate reliable class activation maps from different layers. For the examples provided below, a pre-trained VGG16 was used.

Class Activation Map | Class Activation HeatMap | Class Activation HeatMap on Image
LayerCAM (Layer 9)
LayerCAM (Layer 16)
LayerCAM (Layer 23)
LayerCAM (Layer 30)
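To make the difference between the two methods concrete, here is a minimal sketch based on my reading of [3] and [16], not on the code in this repository. Grad-CAM uses one weight per channel (the spatially averaged gradient), while LayerCAM uses one weight per location (the positive part of the gradient). The arrays are random stand-ins for real activations and gradients, and the function names are hypothetical.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Grad-CAM: one weight per channel (spatially averaged gradient)."""
    weights = gradients.mean(axis=(1, 2), keepdims=True)       # (C, 1, 1)
    return np.maximum((weights * activations).sum(axis=0), 0)  # (H, W)


def layer_cam(activations, gradients):
    """LayerCAM: one weight per location (positive part of the gradient)."""
    weights = np.maximum(gradients, 0)                         # (C, H, W)
    return np.maximum((weights * activations).sum(axis=0), 0)  # (H, W)


rng = np.random.default_rng(0)
activations = np.maximum(rng.normal(size=(8, 7, 7)), 0)  # stand-in for post-ReLU feature maps
gradients = rng.normal(size=(8, 7, 7))                   # stand-in for d(class score)/d(activations)

cam_a = grad_cam(activations, gradients)
cam_b = layer_cam(activations, gradients)
print(cam_a.shape, cam_b.shape)
```

In practice the resulting map is usually rescaled to [0, 1] and resized to the input resolution before it is drawn as a heatmap.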

Grad Times Image

Another technique that is proposed is simply multiplying the gradients with the image itself. Results obtained with the usage of multiple gradient techniques are below.

Vanilla Grad X Image
Guided Grad X Image
Integrated Grad X Image
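A minimal sketch of the vanilla variant, assuming a tiny random classifier in place of the pretrained models used for the figures: the gradient of the target class score with respect to the input is computed first and then multiplied element-wise with the input. The model, image size, and target class are placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in classifier; the repository uses pretrained AlexNet/VGG instead.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 10),
).eval()

image = torch.randn(1, 3, 8, 8, requires_grad=True)
target_class = 3  # placeholder class index

# Vanilla backpropagation: gradient of the target class score w.r.t. the input.
score = model(image)[0, target_class]
score.backward()
vanilla_grad = image.grad.detach()

# Grad times image: element-wise product of the gradient and the input.
grad_times_image = vanilla_grad * image.detach()
print(grad_times_image.shape)
```

The same product can be formed with guided or integrated gradients by swapping the way vanilla_grad is obtained.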

Smooth Grad

SmoothGrad adds Gaussian noise to the original image, calculates the gradients multiple times, and averages the results [8]. The two examples below use vanilla and guided backpropagation to calculate the gradients. The number of images (n) to average over is set to 50. σ is shown at the bottom of the images.

Vanilla Backprop
Guided Backprop
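The averaging procedure can be sketched as follows, using vanilla gradients and a tiny random model. This is an illustration under assumptions, not the repository's implementation: here sigma is used directly as the standard deviation of the noise, whereas [8] expresses the noise level relative to the value range of the image.

```python
import torch
import torch.nn as nn


def input_gradient(model, image, target_class):
    """Gradient of the target class score with respect to the input image."""
    image = image.clone().requires_grad_(True)
    model(image)[0, target_class].backward()
    return image.grad.detach()


def smooth_grad(model, image, target_class, n=50, sigma=0.1):
    """Average input gradients over n noisy copies of the image."""
    total = torch.zeros_like(image)
    for _ in range(n):
        noisy = image + sigma * torch.randn_like(image)
        total += input_gradient(model, noisy, target_class)
    return total / n


torch.manual_seed(0)
# Tiny stand-in classifier instead of a pretrained network.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 10),
).eval()
image = torch.randn(1, 3, 8, 8)
smoothed = smooth_grad(model, image, target_class=3, n=50, sigma=0.1)
print(smoothed.shape)
```

Replacing input_gradient with a guided backpropagation routine gives the guided variant.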

Convolutional Neural Network Filter Visualization

CNN filters can be visualized by optimizing the input image with respect to the output of a specific convolution operation. For this example I used a pre-trained VGG16. Visualizations of layers start with basic color and direction filters at the lower levels. As we approach the final layer, the complexity of the filters also increases. If you employ additional techniques like blurring, gradient clipping, etc., you will probably produce better images.

Layer 2 (Conv 1-2)
Layer 10 (Conv 2-1)
Layer 17 (Conv 3-1)
Layer 24 (Conv 4-1)
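The optimization can be sketched as gradient ascent on the mean activation of one filter, written as minimizing the negative mean. The example below is a simplified illustration, not the repository's code: it uses a small random convolutional stack instead of VGG16, and the optimizer settings and function name visualize_filter are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small random convolutional stack standing in for the convolutional part of a pretrained VGG16.
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
).eval()


def visualize_filter(features, layer_index, filter_index, steps=30, lr=0.1):
    """Optimize a random image so the chosen filter's mean activation grows."""
    image = torch.rand(1, 3, 32, 32, requires_grad=True)
    optimizer = torch.optim.Adam([image], lr=lr)
    history = []
    for _ in range(steps):
        optimizer.zero_grad()
        x = image
        # Forward only up to the selected layer.
        for index, layer in enumerate(features):
            x = layer(x)
            if index == layer_index:
                break
        activation = x[0, filter_index].mean()
        loss = -activation  # minimizing the negative mean maximizes the mean
        loss.backward()
        optimizer.step()
        history.append(activation.item())
    return image.detach(), history


optimized_image, history = visualize_filter(features, layer_index=2, filter_index=5)
print(f"mean activation: {history[0]:.3f} -> {history[-1]:.3f}")
```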

Another way to visualize CNN layers is to visualize the activations for a specific input on a specific layer and filter. This was done in [1], Figure 3. The example below is obtained from layers/filters of VGG16 for the first image using guided backpropagation. The code for this operation is in layer_activation_with_guided_backprop.py. The method is quite similar to guided backpropagation, but instead of guiding the signal from the last layer and a specific target class, it guides the signal from a specific layer and filter.

Input Image | Layer Vis. (Filter=0) | Filter Vis. (Layer=29)

Inverted Image Representations

I think this is the most complex technique in this repository in terms of understanding what the code does, mainly because of the complex regularization. If you truly want to understand how it is implemented, I suggest you read the second and third pages of the paper [5], specifically the regularization part. Here, the aim is to reconstruct the original image from the output of the nth layer. The further we go into the model, the harder this becomes. The results in the paper are incredibly good (see Figure 6), but here the result quickly becomes messy as we iterate through the layers. This is because the authors of the paper tuned the parameters for each layer individually. You can tune the parameters like the ones given in the paper to optimize the results for each layer. The inverted examples from several layers of AlexNet with the previous Snake picture are below.

Layer 0: Conv2d | Layer 2: MaxPool2d | Layer 4: ReLU
Layer 7: ReLU | Layer 9: ReLU | Layer 12: MaxPool2d

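The objective described in [5] can be sketched roughly as a normalized feature-matching loss plus two regularizers, an alpha-norm on the pixel values and a total variation term. The code below is a simplified illustration under assumptions and not the repository's implementation: the network is a small random stack, and the optimizer, learning rate, and regularization weights are placeholder choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small random stack standing in for the first layers of a pretrained AlexNet.
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
).eval()


def alpha_norm(image, alpha=6):
    """Alpha-norm regularizer: penalizes large pixel values."""
    return image.abs().pow(alpha).sum()


def total_variation(image, beta=2):
    """Total variation regularizer: penalizes differences between neighboring pixels."""
    dh = image[:, :, 1:, :-1] - image[:, :, :-1, :-1]
    dw = image[:, :, :-1, 1:] - image[:, :, :-1, :-1]
    return (dh.pow(2) + dw.pow(2)).pow(beta / 2).sum()


def invert(features, target_image, steps=100, lr=0.05, alpha_weight=1e-6, tv_weight=1e-6):
    """Optimize a random image until its features match those of target_image."""
    with torch.no_grad():
        target = features(target_image)
    image = (0.1 * torch.randn_like(target_image)).requires_grad_(True)
    optimizer = torch.optim.Adam([image], lr=lr)
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        # Feature distance, normalized by the energy of the target features.
        feature_loss = (features(image) - target).pow(2).sum() / target.pow(2).sum()
        loss = feature_loss + alpha_weight * alpha_norm(image) + tv_weight * total_variation(image)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return image.detach(), losses


target_image = torch.rand(1, 3, 16, 16)  # random stand-in for a real picture
reconstruction, losses = invert(features, target_image)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The regularization weights are the parameters that would need per-layer tuning to approach the quality shown in the paper.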
Deep Dream

Deep dream is technically the same operation as layer visualization; the only difference is that you don't start with a random image but use a real picture. The samples below were created with VGG19. The produced result is entirely up to the filter, so it is kind of hit or miss. More complex models produce more high-level features. If you replace VGG19 with an Inception variant, you will get more noticeable shapes when you target higher conv layers. As with layer visualization, if you employ additional techniques like gradient clipping, blurring, etc., you might get better visualizations.

Original Image
VGG19, Layer: 34 (Final Conv. Layer), Filter: 94
VGG19, Layer: 34 (Final Conv. Layer), Filter: 103

Class Specific Image Generation

This operation produces different outputs depending on the model and the applied regularization method. Below are some samples produced with VGG19, with Gaussian blur applied every other iteration (see [14] for details). The quality of the generated images also depends on the model: AlexNet generally has green(ish) artifacts, but VGGs produce (kind of) better images. Note that these images are generated with regular CNNs by optimizing the input, not with GANs.

Target class: Worm Snake (52) - (VGG19) | Target class: Spider (72) - (VGG19)

The samples below show the produced image with no regularization, L1 regularization, and L2 regularization on the target class flamingo (130), to show the differences between the regularization methods. These images are generated with a pretrained AlexNet.

No Regularization | L1 Regularization | L2 Regularization

Produced samples can be optimized further to resemble the desired target class. Some of the operations you can incorporate to improve quality are: blurring, clipping gradients that are below a certain threshold, random color swaps on some parts, random cropping of the image, and forcing the generated image to follow a path to enforce continuity.
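As a rough sketch of how a regularization term enters this kind of optimization, the example below maximizes a class score while adding an optional L1 or L2 penalty on the pixel values. It is an illustration under assumptions, not the code that produced the samples: the classifier is a tiny random network, and the penalty weight, optimizer, and function names are placeholders. Blurring and the other operations listed above are not included.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny random classifier standing in for a pretrained AlexNet/VGG19.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 10),
).eval()


def penalty(image, kind):
    """Regularization term added to the loss: 'l1', 'l2', or anything else for none."""
    if kind == "l1":
        return image.abs().sum()
    if kind == "l2":
        return image.pow(2).sum()
    return image.new_zeros(())


def generate_class_image(model, target_class, kind="none", weight=1e-4, steps=50, lr=0.05):
    """Optimize a random image to raise the target class score."""
    image = torch.rand(1, 3, 16, 16, requires_grad=True)
    optimizer = torch.optim.Adam([image], lr=lr)
    scores = []
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(image)[0, target_class]
        # Maximize the class score while keeping the pixel values small.
        loss = -score + weight * penalty(image, kind)
        loss.backward()
        optimizer.step()
        scores.append(score.item())
    return image.detach(), scores


for kind in ("none", "l1", "l2"):
    generated, scores = generate_class_image(model, target_class=7, kind=kind)
    print(kind, f"score {scores[0]:.3f} -> {scores[-1]:.3f}", f"pixel norm {generated.norm().item():.2f}")
```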

Some of these techniques are implemented in generate_regularized_class_specific_samples.py (courtesy of alexstoken).

Requirements:

torch == 0.4.1
torchvision >= 0.1.9
numpy >= 1.13.0
matplotlib >= 1.5
PIL >= 1.1.7

Citation

If you find the code in this repository useful for your research consider citing it.

@misc{uozbulak_pytorch_vis_2021,
  author = {Utku Ozbulak},
  title = {PyTorch CNN Visualizations},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/utkuozbulak/pytorch-cnn-visualizations}},
  commit = {53561b601c895f7d7d5bcf5fbc935a87ff08979a}
}

References:

[1] J. T. Springenberg, A. Dosovitskiy, T. Brox, M. Riedmiller. Striving for Simplicity: The All Convolutional Net, https://arxiv.org/abs/1412.6806

[2] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba. Learning Deep Features for Discriminative Localization, https://arxiv.org/abs/1512.04150

[3] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, D. Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, https://arxiv.org/abs/1610.02391

[4] K. Simonyan, A. Vedaldi, A. Zisserman. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, https://arxiv.org/abs/1312.6034

[5] A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them, https://arxiv.org/abs/1412.0035

[6] H. Noh, S. Hong, B. Han. Learning Deconvolution Network for Semantic Segmentation, https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Noh_Learning_Deconvolution_Network_ICCV_2015_paper.pdf

[7] A. Nguyen, J. Yosinski, J. Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, https://arxiv.org/abs/1412.1897

[8] D. Smilkov, N. Thorat, N. Kim, F. Viégas, M. Wattenberg. SmoothGrad: removing noise by adding noise, https://arxiv.org/abs/1706.03825

[9] D. Erhan, Y. Bengio, A. Courville, P. Vincent. Visualizing Higher-Layer Features of a Deep Network, https://www.researchgate.net/publication/265022827_Visualizing_Higher-Layer_Features_of_a_Deep_Network

[10] A. Mordvintsev, C. Olah, M. Tyka. Inceptionism: Going Deeper into Neural Networks, https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

[11] I. J. Goodfellow, J. Shlens, C. Szegedy. Explaining and Harnessing Adversarial Examples, https://arxiv.org/abs/1412.6572

[12] A. Shrikumar, P. Greenside, A. Shcherbina, A. Kundaje. Not Just a Black Box: Learning Important Features Through Propagating Activation Differences, https://arxiv.org/abs/1605.01713

[13] M. Sundararajan, A. Taly, Q. Yan. Axiomatic Attribution for Deep Networks, https://arxiv.org/abs/1703.01365

[14] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, H. Lipson. Understanding Neural Networks Through Deep Visualization, https://arxiv.org/abs/1506.06579

[15] H. Wang, Z. Wang, M. Du, F. Yang, Z. Zhang, S. Ding, P. Mardziel, X. Hu. Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks, https://arxiv.org/abs/1910.01279

[16] P. Jiang, C. Zhang, Q. Hou, M. Cheng, Y. Wei. LayerCAM: Exploring Hierarchical Class Activation Maps for Localization, http://mmcheng.net/mftp/Papers/21TIP_LayerCAM.pdf

Comments
  • Class Specific Image Generation (L1 and L2 Regularization)

    Thank you so much for making this work open source! I really appreciate it.

    I am trying to reproduce the results you've displayed. The code that is provided for this does very well for the no regularization case, but I cannot replicate the L1 and L2 regularization examples. How were these generated? Could the code for this also be kindly provided?

    Once again, thank you so much for your work!

    opened by arjung128 11
  • Sign error in CNN Layer Visualization

    There is only a very minor sign error in the code. In the hook version you correctly minimize the negative mean, hence, maximizing the mean.

    In the unhooked version, however, the minus sign is missing, so the code computes the input that activates the feature the least (line 102):

    loss = torch.mean(self.conv_output)

    Additionally, the code has to be modified to run with the latest PyTorch 0.4, where a zero-dimensional tensor can no longer be indexed (lines 101 and 63).

    opened by McLawrence 11
  • Add blur, gradient clipping, and regularization to ClassSpecificImageGenerator

    Thanks for a great visualization suite. I was having trouble getting a quality image from the class specific image generator so I implemented some of the techniques you suggested in the README and that are originally from Understanding Neural Networks Through Deep Visualization. These techniques greatly improved my visualizations.

    Tested with torchvision 0.5.0 with model and image on cpu. To use GPU, will likely have to add logic to transfer image to cpu for PIL and then back to GPU for optimization, but I haven't looked in to this much.

    opened by alexstoken 9
  • Class correctness in gradcam.py

    There is a line in gradcam.py which says that if target_class is None, then target_class takes the argmax of the output. Is it possible that the actual class might be different from the expected class?

    opened by AND2797 8
  • too many indices for array

    Traceback (most recent call last):
      File "X:/pytorch-cnn-visualizations-master/src/cnn_layer_visualization.py", line 130, in <module>
        layer_vis.visualise_layer_without_hooks()
      File "X:/pytorch-cnn-visualizations-master/src/cnn_layer_visualization.py", line 105, in visualise_layer_without_hooks
        print('Iteration:', str(i), 'Loss:', "{0:.2f}".format(loss.data.numpy()[0]))
    IndexError: too many indices for array

    The same error occurs in deep_dream.py on the equivalent line (line 62), in inverted_representation.py (line 106), and in generate_class_specific_samples.py (line 43).

    opened by shashanka300 8
  • Model does not have 'feature' attribute

    Hi thanks for the code, I'm trying to visualise a trained ResNet18 model with the GradCam function. But ResNet models do not have the feature attribute as in VGGs. Should we use model.named_children() instead?

    from gradcam import GradCam
    grad_cam = GradCam(model, target_layer=7)
    

    Error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-16-0a85f7f4c865> in <module>
          8 grad_cam = GradCam(model, target_layer=0)
          9 # Generate cam mask
    ---> 10 cam = grad_cam.generate_cam(x, target_class=None)
         11 # Save mask
         12 save_class_activation_images(original_image, cam, file_name_to_export)
    
    /mnt/sdh/adam/visualisation/gradcam.py in generate_cam(self, input_image, target_class)
         56         # conv_output is the output of convolutions at specified layer
         57         # model_output is the final output of the model (1, 1000)
    ---> 58         conv_output, model_output = self.extractor.forward_pass(input_image)
         59         if target_class is None:
         60             target_class = np.argmax(model_output.data.numpy())
    
    /mnt/sdh/adam/visualisation/gradcam.py in forward_pass(self, x)
         35         """
         36         # Forward pass on the convolutions
    ---> 37         conv_output, x = self.forward_pass_on_convolutions(x)
         38         x = x.view(x.size(0), -1)  # Flatten
         39         # Forward pass on the classifier
    
    /mnt/sdh/adam/visualisation/gradcam.py in forward_pass_on_convolutions(self, x)
         23         """
         24         conv_output = None
    ---> 25         for module_pos, module in self.model.features._modules.items():
         26             x = module(x)  # Forward
         27             if int(module_pos) == self.target_layer:
    
    /mnt/sdh/adam/adam_env/lib/python3.6/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
        589                 return modules[name]
        590         raise AttributeError("'{}' object has no attribute '{}'".format(
    --> 591             type(self).__name__, name))
        592 
        593     def __setattr__(self, name, value):
    
    AttributeError: 'ResNet' object has no attribute 'features'
    
    opened by adamxyang 7
  • visualize model that not in the model zoo

    HI! as you said:

    If you want to port this code to use it on your model that does not have such separation, you just need to do some editing on parts where it calls model.features and model.classifier.

    I tried to modify the model.features and model.classifier parts, but I got confused about how to do it. Below is part of my script; I hope you can give some details about how to visualize my own trained model.

    model =torch.load(model_save_path)
    for index,(layer,_) in enumerate(model.items()):
          # model.items() return the weight of this layer, eg, model.items()[0]=model.0.1.weight
          x = layer(x)
            ......
    

    But I get a TypeError saying that the object is not callable. I wonder how to read the layers from a trained model that doesn't have the features attribute. Thanks~

    opened by visonpon 6
  • maximize instead of minimize output for filter visualization?

    opened by zym1010 6
  • Reference for image clipping in image recreation step?

    Hi, first of all, thanks for providing this repository! I really like it and am comparing it to my own implementation, trying to understand the algorithms. I am very new to this, so sorry if this is a stupid question, but could you/anyone point me to a reference for why you clip instead of normalizing the images during the recreation phase? The specific code lines are the following in misc_functions.py:

    recreated_im[recreated_im > 1] = 1
    recreated_im[recreated_im < 0] = 0
    

    Why shouldn't we rescale them to a range between 0 and 1?

    Thanks :)

    opened by kai-tub 5
  • question about guided backprop

    Hi, thank you for this great repo. I'm confused when I'm reading guided_backprop.py. On line 70-73, the gradient of output is [0, 0, ..., 1, 0].

    one_hot_output = torch.FloatTensor(1, model_output.size()[-1]).zero_()
    one_hot_output[0][target_class] = 1
    # Backward pass
    model_output.backward(gradient=one_hot_output)
    

    I changed the target class while drawing the cat_dog_Guided_BP_color map. But I find that no matter what class is set, it always looks the same.

    For example: cat_dog_Guided_BP_color_target_class_10.jpg cat_dog_Guided_BP_color_target_class_10

    cat_dog_Guided_BP_color_target_class_100.jpg cat_dog_Guided_BP_color_target_class_100

    cat_dog_Guided_BP_color_target_class_243.jpg(ground truth) cat_dog_Guided_BP_color_target_class_243

    cat_dog_Guided_BP_color_target_class_500.jpg cat_dog_Guided_BP_color_target_class_500

    cat_dog_Guided_BP_color_target_class_890.jpg cat_dog_Guided_BP_color_target_class_890

    opened by narrowsnap 5
  • guided_backprop register backward/forward hook seems to consume a lot of GPU memory

    Hi, really a great job! I was trying to get a guided-gradcam visualization of a video. But I found as more images have been read, very soon my server shows CUDA out of memory error. Then I discovered that it is because register_backward/forward_hook consumes a lot of GPU memory. Is there any way I could fix this issue? Thanks in advance!

    opened by Terahezi 5
  • Adding capability to choose device other than cpu and fixing/generalize

    Reference Issues/PRs

    None

    What does this implementation fix?

    These changes mainly focus on augmenting the code to make it more usable and generalize.

    1. Run on GPU: Modern architectures are so deep and computation extensive, that running it only on CPU may always result in out of memory error. In this implementation, I have changed grad-cam, guided-backprop, integrated-gradient, layer-activation-with-guided-backprop, score-cam, and vanilla-backprop to be able to initialize with desired device ids as per our will. And then the model forward function and backward gradient calculations can be done on GPU devices. If nothing is provided at the time of initialization, the code simply follows the existing.

    2. Introduction of forward hook: When we make models in Pytorch it is not essential that model class and the forward function contain the same layers. For example, often designers don't keep Flatten and Concat functions out of the class, And only perform them in forward section. So in Cam extractors, forward_pass_on_convolutions function looping over the layer of the model until the desired layer is reached may not depict the correct functionality of the forward function of the model. Hence we just added forward hook of the desired conv layer and let the model complete its forward pass without intervention. This way we can get the conv layer output as well as the correct output of the entire model in way more general way without the dependency of the forward function.

    Other comments

    No

    enhancement 
    opened by arnabdas2019ovgu 1
  • image size problem

    Hi, I found a bug in gradcam.py. The Image package uses (w, h) order. However, the preprocessed image is a tensor with shape (1, c, h, w). So the order of w and h needs to be switched when using Image.resize.

    Line 88,change cam = np.uint8(Image.fromarray(cam).resize((input_image.shape[2], input_image.shape[3]), Image.ANTIALIAS)) to cam = np.uint8(Image.fromarray(cam).resize((input_image.shape[3], input_image.shape[2]), Image.ANTIALIAS))

    opened by zhaoxin111 6
Owner
Utku Ozbulak
Fourth-year doctoral student at Ghent University. Located in Ghent University Global Campus, South Korea.
Utku Ozbulak
null 131 Jun 25, 2021
Neural network visualization toolkit for tf.keras

Neural network visualization toolkit for tf.keras

Yasuhiro Kubota 259 Nov 23, 2022
Visualization toolkit for neural networks in PyTorch! Demo -->

FlashTorch A Python visualization toolkit, built with PyTorch, for neural networks in PyTorch. Neural networks are often described as "black box". The

Misa Ogura 689 Nov 2, 2022
An Empirical Review of Optimization Techniques for Quantum Variational Circuits

QVC Optimizer Review Code for the paper "An Empirical Review of Optimization Techniques for Quantum Variational Circuits". Each of the python files ca

Owen Lockwood 5 Jun 28, 2022
GNNLens2 is an interactive visualization tool for graph neural networks (GNN).

GNNLens2 is an interactive visualization tool for graph neural networks (GNN).

Distributed (Deep) Machine Learning Community 138 Nov 2, 2022
pytorch implementation of "Distilling a Neural Network Into a Soft Decision Tree"

Soft-Decision-Tree Soft-Decision-Tree is the pytorch implementation of Distilling a Neural Network Into a Soft Decision Tree, paper recently published

Kim Heecheol 259 Nov 6, 2022
Visual Computing Group (Ulm University) 96 Nov 16, 2022
🎆 A visualization of the CapsNet layers to better understand how it works

CapsNet-Visualization For more information on capsule networks check out my Medium articles here and here. Setup Use pip to install the required pytho

Nick Bourdakos 388 Oct 20, 2022
Logging MXNet data for visualization in TensorBoard.

Logging MXNet Data for Visualization in TensorBoard Overview MXBoard provides a set of APIs for logging MXNet data for visualization in TensorBoard. T

Amazon Web Services - Labs 327 Sep 23, 2022
Visualization Toolbox for Long Short Term Memory networks (LSTMs)

Visualization Toolbox for Long Short Term Memory networks (LSTMs)

Hendrik Strobelt 1.1k Nov 23, 2022
Interactive convnet features visualization for Keras

Quiver Interactive convnet features visualization for Keras The quiver workflow Video Demo Build your model in keras model = Model(...) Launch the vis

Keplr 1.7k Nov 16, 2022
A collection of infrastructure and tools for research in neural network interpretability.

Lucid Lucid is a collection of infrastructure and tools for research in neural network interpretability. We're not currently supporting tensorflow 2!

null 4.5k Nov 22, 2022
Visualizer for neural network, deep learning, and machine learning models

Netron is a viewer for neural network, deep learning and machine learning models. Netron supports ONNX (.onnx, .pb, .pbtxt), Keras (.h5, .keras), Tens

Lutz Roeder 20.6k Nov 23, 2022
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)

Hierarchical neural-net interpretations (ACD) 🧠 Produces hierarchical interpretations for a single prediction made by a pytorch neural network. Offic

Chandan Singh 110 Nov 10, 2022
Visualizer for neural network, deep learning, and machine learning models

Netron is a viewer for neural network, deep learning and machine learning models. Netron supports ONNX, TensorFlow Lite, Keras, Caffe, Darknet, ncnn,

Lutz Roeder 16.3k Sep 27, 2021
A ultra-lightweight 3D renderer of the Tensorflow/Keras neural network architectures

A ultra-lightweight 3D renderer of the Tensorflow/Keras neural network architectures

Souvik Pratiher 16 Nov 17, 2021
PyTorch implementation of DeepDream algorithm

neural-dream This is a PyTorch implementation of DeepDream. The code is based on neural-style-pt. Here we DeepDream a photograph of the Golden Gate Br

null 121 Nov 5, 2022
A Practical Debugging Tool for Training Deep Neural Networks

Cockpit is a visual and statistical debugger specifically designed for deep learning!

null 31 Aug 14, 2022
Portal is the fastest way to load and visualize your deep neural networks on images and videos 🔮

Portal is the fastest way to load and visualize your deep neural networks on images and videos 🔮

Datature 242 Nov 16, 2022