This repo contains the pytorch implementation for Dynamic Concept Learner (accepted by ICLR 2021).

Overview

DCL-PyTorch

Pytorch implementation for the Dynamic Concept Learner (DCL). More details can be found at the project page.

Framework

Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen, Jiayuan Mao, Jiajun Wu, Kwan-Yee K. Wong, Joshua B. Tenenbaum, and Chuang Gan

Prerequisites

  • Python 3
  • PyTorch 1.0 or higher, with NVIDIA CUDA Support
  • Other required python packages specified by requirements.txt. See the Installation.

Installation

Install Jacinle: Clone the package, and add the bin path to your global PATH environment variable:

git clone https://github.com/vacancy/Jacinle --recursive
export PATH=<path_to_jacinle>/bin:$PATH

Clone this repository:

git clone https://github.com/zfchenUnique/DCL-Release.git --recursive

Create a conda environment for NS-CL, and install the requirements. This includes the required python packages from both Jacinle NS-CL. Most of the required packages have been included in the built-in anaconda package:

Dataset preparation

  • Download videos, video annotation, questions and answers, and object proposals accordingly from the official website
  • Transform videos into ".png" frames with ffmpeg.
  • Organize the data as shown below.
    clevrer
    ├── annotation_00000-01000
    │   ├── annotation_00000.json
    │   ├── annotation_00001.json
    │   └── ...
    ├── ...
    ├── image_00000-01000
    │   │   ├── 1.png
    │   │   ├── 2.png
    │   │   └── ...
    │   └── ...
    ├── ...
    ├── questions
    │   ├── train.json
    │   ├── validation.json
    │   └── test.json
    ├── proposals
    │   ├── proposal_00000.json
    │   ├── proposal_00001.json
    │   └── ...
    

Fast Evaluation

    git clone https://github.com/zfchenUnique/clevrer_dynamic_propnet.git
    cd clevrer_dynamic_propnet
    sh ./scripts/eval_fast_release_v2.sh 0
   sh scripts/script_test_prp_clevrer_qa.sh 0

Step-by-step Training

  • Step 1: download the proposals from the region proposal network and extract object trajectories for train and val set by
   sh scripts/script_gen_tubes.sh
  • Step 2: train a concept learner with descriptive and explanatory questions for static concepts (i.e. color, shape and material)
   sh scripts/script_train_dcl_stage1.sh 0
  • Step 3: extract static attributes & refine object trajectories extract static attributes
   sh scripts/script_extract_attribute.sh

refine object trajectories

   sh scripts/script_gen_tubes_refine.sh
  • Step 4: extract predictive and counterfactual scenes by
    cd clevrer_dynamic_propnet
    sh ./scripts/train_tube_box_only.sh # train
    sh ./scripts/train_tube.sh # train
    sh ./scripts/eval_fast_release_v2.sh 0 # val
  • Step 5: train DCL with all questions and the refined trajectories
   sh scripts/script_train_dcl_stage2.sh 0

Generalization to CLEVRER-Grounding

    sh ./scripts/script_grounding.sh  0
    jac-crun 0 scripts/script_evaluate_grounding.py

Generalization to CLEVRER-Retrieval

    sh ./scripts/script_retrieval.sh  0
    jac-crun 0 scripts/script_evaluate_retrieval.py

Extension to Tower Blocks

    sh ./scripts/script_train_blocks.sh 0
  • Step 3: download the pretrain model from google drive and evaluate on Tower block QA
    sh ./scripts/script_eval_blocks.sh 0

Others

Citation

If you find this repo useful in your research, please consider citing:

@inproceedings{zfchen2021iclr,
    title={Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning},
    author={Chen, Zhenfang and Mao, Jiayuan and Wu, Jiajun and Wong, Kwan-Yee K and Tenenbaum, Joshua B. and Gan, Chuang},
    booktitle={International Conference on Learning Representations},
    year={2021}
    }
You might also like...
This repo. is an implementation of ACFFNet, which is accepted for in Image and Vision Computing.
This repo. is an implementation of ACFFNet, which is accepted for in Image and Vision Computing.

Attention-Guided-Contextual-Feature-Fusion-Network-for-Salient-Object-Detection This repo. is an implementation of ACFFNet, which is accepted for in I

 Official PyTorch implementation of the preprint paper
Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Official PyTorch implementation of the paper
Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

RecycleD Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN

[ICLR 2022] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
[ICLR 2022] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

DAB-DETR This is the official pytorch implementation of our ICLR 2022 paper DAB-DETR. Authors: Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi

This repo contains the implementation of the algorithm proposed in Off-Belief Learning, ICML 2021.

Off-Belief Learning Introduction This repo contains the implementation of the algorithm proposed in Off-Belief Learning, ICML 2021. Environment Setup

Dynamic View Synthesis from Dynamic Monocular Video

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a

Dynamic View Synthesis from Dynamic Monocular Video
Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data
Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data

Dynamic VAE frame Automatic feature extraction can be achieved by probability di

Python and C++ implementation of
Python and C++ implementation of "MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation". Accepted at LXCV @ CVPR 2021.

MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation This is a PyTorch and LibTorch implementation of MarkerPose: a

Comments
  • Can't find annotation_01362.pk

    Can't find annotation_01362.pk

    Hi,

    Thanks for your amazing work.

    I am trying to run the code, but I got a problem. The program can NOT find the '/tubeProposalsRelease/1.0_1.0_0.6_0.7/annotation_01362.pk' I also checked the code, and I can NOT find the code for generating the '/tubeProposalsRelease/1.0_1.0_0.6_0.7/annotation_01362.pk'.

    Could you please give me any suggestions? Thanks for your help.

    opened by haoranD 7
  • No such file or directory: '../clevrer/tubeProposalsGt'

    No such file or directory: '../clevrer/tubeProposalsGt'

    Hello. Thank you for sharing code!

    I have some trouble when I do the step-by-step training at step1.

    I have already organized the data as shown in the instruction. When I run 'sh scripts/script_gen_tubes.sh', here comes an error:

    Traceback (most recent call last): File "scripts/script_gen_tube_proposals.py", line 1923, in <module> compute_recall_and_precision(opt) File "scripts/script_gen_tube_proposals.py", line 786, in compute_recall_and_precision pk_fn_list = get_sub_file_list(out_gt_path, 'pk') File "scripts/script_gen_tube_proposals.py", line 585, in get_sub_file_list for file_name in os.listdir(folder_name): FileNotFoundError: [Errno 2] No such file or directory: '../clevrer/tubeProposalsGt'

    It seems that there is no folder called 'tubeProposalsGt'.

    I would appreciate it if you can tell me what I should do to fix it.

    Thanks a lot!

    opened by z-yf17 3
  • Missing file when doing fast evaluation

    Missing file when doing fast evaluation

    Hello,

    When trying to do fast evaluation, I run:

    sh ./scripts/eval_fast_release_v2.sh 0

    and get the following error:

    Loading saved ckp from ../data/models/dynamic_models/attrV3_offset4.pth
    spatial model #params: 17495816
    Loading spatail model saved ckp from ../data/models/dynamic_models/box_only_attrV3.pth
    [0/5000]:   0%|                                                                                                                                                                                                                                                                           | 1/5000 [00:00<00:03, 1359.14it/s]
    Traceback (most recent call last):
      File "eval_tube_sep.py", line 158, in <module>
        if args.eval_full_path !='':
      File "/mnt/data/DCL-Release/clevrer_dynamic_propnet/utils_tube.py", line 1064, in pickleload
        f = open(path, 'rb')
    FileNotFoundError: [Errno 2] No such file or directory: '../data/proposals/annotation_15000.pk'
    

    I'm not sure where this annotation .pk file is supposed to come from. Do I have to run training first to generate it? Thanks!

    opened by AdamIshay 2
  • Repository clevrer_dynamic_propnet not available

    Repository clevrer_dynamic_propnet not available

    Hello, I tried to run the code but I couldn't because the following command does not work:

    git clone https://github.com/zfchenUnique/clevrer_dynamic_propnet.git

    Is the repository clevrer_dynamic_propnet available ?

    Thank you in advance.

    opened by gabrielsluz 1
Owner
Zhenfang Chen
Keep it simple.
Zhenfang Chen
This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.

BiPointNet: Binary Neural Network for Point Clouds Created by Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Li

Haotong Qin 59 Dec 17, 2022
PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner [Li et al., 2020].

VGPL-Visual-Prior PyTorch implementation for the visual prior component (i.e. perception module) of the Visually Grounded Physics Learner (VGPL). Give

Toru 8 Dec 29, 2022
Seach Losses of our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.

CSE-Autoloss Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models

Peidong Liu(刘沛东) 54 Dec 17, 2022
A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks A Research-oriented Federated Learning Library and Benchmark Platform

FedML-AI 175 Dec 1, 2022
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Deep Cognition and Language Research (DeCLaRe) Lab 89 Dec 26, 2022
Implementation of "Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner"

Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner This repository is the official implementation of Meta-rPPG: Remote Heart Ra

Eugene Lee 137 Dec 13, 2022
This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT).

Dynamic-Vision-Transformer (Pytorch) This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT). Not All Ima

null 210 Dec 18, 2022
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Sun Yi 201 Nov 21, 2022
Neural Motion Learner With Python

Neural Motion Learner Introduction This work is to extract skeletal structure from volumetric observations and to learn motion dynamics from the detec

Jinseok Bae 14 Nov 28, 2022
The Self-Supervised Learner can be used to train a classifier with fewer labeled examples needed using self-supervised learning.

Published by SpaceML • About SpaceML • Quick Colab Example Self-Supervised Learner The Self-Supervised Learner can be used to train a classifier with

SpaceML 92 Nov 30, 2022