[CVPR 2021] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

Overview

Visual-Reasoning-eXplanation

[CVPR 2021 A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts]

Project Page | Video | Paper

Editor

Figure: An example result with the proposed VRX. To explain the prediction (i.e., fire engine and not alternatives like ambulance), VRX provides both visual and structural clues.

A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts
Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu
IEEE/ CVF International Conference on Computer Vision and Pattern Recognition (CVPR), 2021

We considered the challenging problem of interpreting the reasoning logic of a neural network decision. We propose a novel framework to interpret neural networks which extracts relevant class-specific visual concepts and organizes them using structural concepts graphs based on pairwise concept relationships. By means of knowledge distillation, we show VRX can take a step towards mimicking the reasoning process of NNs and provide logical, concept-level explanations for final model decisions. With extensive experiments, we empirically show VRX can meaningfully answer “why” and “why not” questions about the prediction, providing easy-to-understand insights about the reasoning process. We also show that these insights can potentially provide guidance on improving NN’s performance.

Editor

Figure: Examples of representing images as structural concept graph.

Editor

Figure: Pipeline for Visual Reasoning Explanation framework.

Thanks for a re-implementation from sssufmug, we added more features and finish the whole pipeline.

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/gyhandy/Visual-Reasoning-eXplanation.git
cd Visual-Reasoning-eXplanation
  • Dependencies
pip install -r requirements.txt

Datasets

  • We use a subset of ImageNet as our source data. There are intrested classes which want to do reasoning, such as fire angine, ambulance and school bus, and also other random images for discovering concepts. You can download the source data that we used in our paper here: source [http://ilab.usc.edu/andy/dataset/source.zip]

  • Input files for training GNN and doing reasoning. You can get these data by doing discover concepts and match concepts yourself, but we also provide those files to help you doing inference directly. You can download the result data here: result[http://ilab.usc.edu/andy/dataset/result.zip]

Datasets Preprocess

Unzip source.zip as well as result.zip, and then place them in ./source and ./result. If you only want to do inference, you can skip discover concept, match concept and training Structural Concept Graph (SCG).

Discover concept

For more information about discover concept, you can refer to ACE: Towards Automatic Concept Based Explanations. We use the pretrained model provided by tensorflow to discover cencept. With default setting you can simply run

python3 discover_concept.py

If you want to do this step with a custom model, you should write a wrapper for it containing the following methods:

run_examples(images, BOTTLENECK_LAYER): which basically returens the activations of the images in the BOTTLENECK_LAYER. 'images' are original images without preprocessing (float between 0 and 1)
get_image_shape(): returns the shape of the model's input
label_to_id(CLASS_NAME): returns the id of the given class name.
get_gradient(activations, CLASS_ID, BOTTLENECK_LAYER): computes the gradient of the CLASS_ID logit in the logit layer with respect to activations in the BOTTLENECK_LAYER.

If you want to discover concept with GradCam, please also implement a 'gradcam.py' for your model and place it into ./src. Then run:

python3 discover_concept.py --model_to_run YOUR_LOCAL_PRETRAINED_MODEL_NAME --model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --labels_path LABEL_PATH_OF_YOUR_MODEL_LABEL --use_gradcam TRUE/FALSE

Match concept

This step will use the concepts you discovered in last step to match new images. If you want to match your own images, please put them into ./source and create a new folder named IMAGE_CLASS_NAME. Then run:

python3 macth_concept.py --model_to_run YOUR_LOCAL_PRETRAINED_MODEL_NAME --model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --labels_path LABEL_PATH_OF_YOUR_MODEL_LABEL --use_gradcam TRUE/FALSE

Training Structural Concept Graph (SCG)

python3 VR_training_XAI.py

Then you can find the checkpoints of model in ./result/model.

Reasoning a image

For images you want to do reasoning, you should first doing match concept to extract concept knowledge. Once extracted graph knowledge for SCG, you can do the inference. For example, if you want to inference ./source/fire_engine/n03345487_19835.JPEG, the "img_class" is "ambulance" and "img_idx" is 10367, then run:

python3 Xception_WhyNot.py --img_class fire_engine --img_idx 19835

Some visualize results

Editor
Editor
Editor

Contact / Cite

Got Questions? We would love to answer them! Please reach out by email! You may cite us in your research as:

@inproceedings{ge2021peek,
  title={A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts},
  author={Ge, Yunhao and Xiao, Yao and Xu, Zhi and Zheng, Meng and Karanam, Srikrishna and Chen, Terrence and Itti, Laurent and Wu, Ziyan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2195--2204},
  year={2021}
}

We will post other relevant resources, implementations, applications and extensions of this work here. Please stay tuned

You might also like...
PyTorch implementation of
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Transparency-by-Design networks (TbD-nets) This repository contains code for replicating the experiments and visualizations from the paper Transparenc

🌈 PyTorch Implementation for EMNLP'21 Findings
🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

SGLKT-VisDial Pytorch Implementation for the paper: Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer Gi-Cheon Kang, Junseok P

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

CLEVR Dataset Generation This is the code used to generate the CLEVR dataset as described in the paper: CLEVR: A Diagnostic Dataset for Compositional

The code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

The Code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning" Setting up and using the repo Get the dataset. Follow

Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.
Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

PyTorch Implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers 1 Using Colab Please notic

ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

ALFRED A Benchmark for Interpreting Grounded Instructions for Everyday Tasks Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han,

 InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing
InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

Code to reproduce the results in the paper
Code to reproduce the results in the paper "Tensor Component Analysis for Interpreting the Latent Space of GANs".

Tensor Component Analysis for Interpreting the Latent Space of GANs [ paper | project page ] Code to reproduce the results in the paper "Tensor Compon

Code for
Code for "Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks", CVPR 2021

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks This repository contains the code that accompanies our CVPR 20

Comments
  • I got an error when I ran ‘discover_concept.py’

    I got an error when I ran ‘discover_concept.py’

    When I run to ‘calculate Calculating CAVs and TCAV scores of ambulance...’, I got this error:ValueError: Input 0 of layer block1_conv1 is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [224, 224, 3]

    Hope you can help me! Thanks!

    opened by PeterZhang97 4
  • Questions about Xception_XAI_graph_training.dataset

    Questions about Xception_XAI_graph_training.dataset

    hello,I want to know where can i find "result/vec2graph/Xception_XAI_graph_training.dataset",i downloaded "source","result"folder,but i can't find it.I look forward to your reply.

    opened by LiuuCSDN 3
  • Some questions about  'concept_dict' and 'concept_score_dict'

    Some questions about 'concept_dict' and 'concept_score_dict'

    Hi, i don't know the numbers in the 'concept_dict' and the numbers in the 'concept_score_dict',i try to change numbers,but the results is not good.How do you fix these figures?

    opened by LiuuCSDN 1
  • Some issues about functions of the Xception model.

    Some issues about functions of the Xception model.

    Hi, thank you very much for your hard work and kindness sharing. In the provided codes, it seems that the Xception model dose not have the follow function: preprocess_single_img (called in ace.py, line 210), which is important for using self._return_gradcam_superpixels (ace.py, line 185). Could you give us some assistance to handle this issue? Again, thank you very much for your help and hard work.

    opened by ddghjikle 1
Owner
Andy_Ge
Ph.D. Student in USC, interested in Computer Vision, Machine Learning, and AGI
Andy_Ge
Official implementation of GraphMask as presented in our paper Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking.

GraphMask This repository contains an implementation of GraphMask, the interpretability technique for graph neural networks presented in our ICLR 2021

Michael Schlichtkrull 29 Sep 2, 2022
Code for the AAAI-2022 paper: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification (AAAI 2022) Prerequisite PyTorch >= 1.2.0 P

null 16 Dec 14, 2022
ImageNet-CoG is a benchmark for concept generalization. It provides a full evaluation framework for pre-trained visual representations which measure how well they generalize to unseen concepts.

The ImageNet-CoG Benchmark Project Website Paper (arXiv) Code repository for the ImageNet-CoG Benchmark introduced in the paper "Concept Generalizatio

NAVER 23 Oct 9, 2022
[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Dec 29, 2022
git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 236 Dec 22, 2022
Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).

GD-VCR Code for Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning (EMNLP 2021). Research Questions and Aims: How well can a model perform o

Da Yin 24 Oct 13, 2022
git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Self-Attention Attribution This repository contains the implementation for AAAI-2021 paper Self-Attention Attribution: Interpreting Information Intera

null 60 Dec 29, 2022
Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021]

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021] Abstract Analyzing complex scenes with DNN is a challenging ta

Irene Yuan 24 Jun 27, 2022
[NeurIPS 2021] Code for Unsupervised Learning of Compositional Energy Concepts

Unsupervised Learning of Compositional Energy Concepts This is the pytorch code for the paper Unsupervised Learning of Compositional Energy Concepts.

null 45 Nov 30, 2022
Pytorch implementation of "A simple neural network module for relational reasoning" (Relational Networks)

Pytorch implementation of Relational Networks - A simple neural network module for relational reasoning Implemented & tested on Sort-of-CLEVR task. So

Kim Heecheol 800 Dec 5, 2022