Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

Overview

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos

Introduction

This repo is official PyTorch implementation of IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos (CVPRW 2021).

Directory

Root

The ${ROOT} is described as below.

${ROOT}  
|-- data  
|-- common  
|-- main  
|-- tool
|-- output  
  • data contains data loading codes and soft links to images and annotations directories.
  • common contains kernel codes for IntegralAction.
  • main contains high-level codes for training or testing the network.
  • tool contains a code to merge models of rgb_only and pose_only stages.
  • output contains log, trained models, visualized outputs, and test result.

Data

You need to follow directory structure of the data as below.

${ROOT}  
|-- data  
|   |-- Kinetics
|   |   |-- data
|   |   |   |-- frames 
|   |   |   |-- kinetics-skeleton
|   |   |   |-- Kinetics50_train.json
|   |   |   |-- Kinetics50_val.json
|   |   |   |-- Kinetics400_train.json
|   |   |   |-- Kinetics400_val.json
|   |-- Mimetics
|   |   |-- data  
|   |   |   |-- frames 
|   |   |   |-- pose_results 
|   |   |   |-- Mimetics50.json
|   |   |   |-- Mimetics400.json
|   |-- NTU
|   |   |-- data  
|   |   |   |-- frames 
|   |   |   |-- nturgb+d_skeletons
|   |   |   |-- NTU_train.json
|   |   |   |-- NTU_test.json

To download multiple files from Google drive without compressing them, try this. If you have a problem with 'Download limit' problem when tried to download dataset from google drive link, please try this trick.

* Go the shared folder, which contains files you want to copy to your drive  
* Select all the files you want to copy  
* In the upper right corner click on three vertical dots and select “make a copy”  
* Then, the file is copied to your personal google drive account. You can download it from your personal account.  

Output

You need to follow the directory structure of the output folder as below.

${ROOT}  
|-- output  
|   |-- log  
|   |-- model_dump  
|   |-- result  
|   |-- vis  
  • Creating output folder as soft link form is recommended instead of folder form because it would take large storage capacity.
  • log folder contains training log file.
  • model_dump folder contains saved checkpoints for each epoch.
  • result folder contains final estimation files generated in the testing stage.
  • vis folder contains visualized results.

Running IntegralAction

Start

  • Install PyTorch and Python >= 3.7.3 and run sh requirements.sh.
  • In the main/config.py, you can change settings of the model including dataset to use, network backbone, and input size and so on.
  • There are three stages. 1) rgb_only , 2) pose_only, and 3) rgb+pose. In the rgb_only stage, only RGB stream is trained, and in the pose_only stage, only pose stream is trained. Finally, rgb+pose stage initializes weights from the previous two stages and continue training by the pose-drive integration.

Train

1. rgb_only stage

In the main folder, run

python train.py --gpu 0-3 --mode rgb_only

to train IntegralAction in the rgb_only stage on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3. Then, backup the trained weights by running

mkdir ../output/model_dump/rgb_only
mv ../output/model_dump/snapshot_*.pth.tar ../output/model_dump/rgb_only/.

2. pose_only stage

In the main folder, run

python train.py --gpu 0-3 --mode pose_only

to train IntegralAction in the pose_only stage on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3.
Then, backup the trained weights by running

mkdir ../output/model_dump/pose_only
mv ../output/model_dump/snapshot_*.pth.tar ../output/model_dump/pose_only/.

3. rgb+pose stage

In the tool folder, run

cp ../output/model_dump/rgb_only/snapshot_29.pth.tar snapshot_29_rgb_only.pth.tar
cp ../output/model_dump/pose_only/snapshot_29.pth.tar snapshot_29_pose_only.pth.tar
python merge_rgb_only_pose_only.py
mv snapshot_0.pth.tar ../output/model_dump/.

In the main folder, run

python train.py --gpu 0-3 --mode rgb+pose --continue

to train IntegralAction in the rgb+pose stage on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

Test

Place trained model at the output/model_dump/. Choose the stage you want to test from one of [rgb_only, pose_only, rgb+pose].

In the main folder, run

python test.py --gpu 0-3 --mode $STAGE --test_epoch 29

to test IntegralAction in $STAGE stage (should be one of [rgb_only, pose_only, rgb+pose]) on the GPU 0,1,2,3 with 29th epoch trained model. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

Results

Here I report the performance of the IntegralAction.

Kinetics50

  • Download IntegralAction trained on [Kinetics50].
  • Kinetics50 is a subset of Kinetics400. It mainly contains videos with human motion-related action classes, sampled from Kinetics400.
(base) mks0601:~/workspace/IntegralAction/main$ python test.py --gpu 5-6 --mode rgb+pose --test_epoch 29
>>> Using GPU: 5,6
04-15 11:48:25 Creating dataset...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
04-15 11:48:25 Load checkpoint from ../output/model_dump/snapshot_29.pth.tar
04-15 11:48:25 Creating graph...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 773/773 [03:09<00:00,  5.11it/s]
Evaluation start...
Top-1 accuracy: 72.2087
Top-5 accuracy: 92.2735
Result is saved at: ../output/result/kinetics_result.json

Mimetics

  • Download IntegralAction trained on [Kinetics50].
  • Kinetics50 is a subset of Kinetics400. It mainly contains videos with human motion-related action classes, sampled from Kinetics400.
  • Note that Mimetics is used only for the testing purpose.
(base) mks0601:~/workspace/IntegralAction/main$ python test.py --gpu 5-6 --mode rgb+pose --test_epoch 29
>>> Using GPU: 5,6
04-15 11:52:20 Creating dataset...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
04-15 11:52:20 Load checkpoint from ../output/model_dump/snapshot_29.pth.tar
04-15 11:52:20 Creating graph...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 187/187 [02:14<00:00,  4.93it/s]
Evaluation start...
Top-1 accuracy: 26.5101
Top-5 accuracy: 50.5034
Result is saved at: ../output/result/mimetics_result.json

Reference

@InProceedings{moon2021integralaction,
  title={IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos},
  author={Moon, Gyeongsik and Kwon, Heeseung and Lee, Kyoung Mu and Cho, Minsu},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW)}, 
  year={2021}
}
Comments
  • How to create frames folder?

    How to create frames folder?

    Hi, im trying with mimetics data sets. Arranged data folder as you mentioned.

    1. But what to keep in the frames folder??
    2. I have downloaded the mimetics dataset which has the videos. where to keep them ? Im facing issue like this (Really need your support right now as i have project demonstration on monday) 1
    opened by suryateja0101 1
  • (Urgent Request) How to test single video

    (Urgent Request) How to test single video

    Hi, I have taken your kinetics trained data set and tested mimetics videos on it and got the same result as yours.

    How can I test for a single video to visualize the result for that particular video. (Need to give one video as input and get the action recognized output) Need to demo this in 13th June presentation(Monday # Tomorrow). Can you please explain this and help me in this.

    Awaiting for your reply. Thanks.

    opened by gunadeepak 0
  • Provide NTU DataSet

    Provide NTU DataSet

    Hi, in the given link for NTU Dataset, only json files are present. Have a look in the below screenshot. Can you please provide the Dataset required to test for that particular dataset. ntu

    opened by suryateja0101 0
  • Accuracy Issue

    Accuracy Issue

    Hi i have downloaded IntegralAction trained on [[Kinetics50] and tried rgb_only, pose_only, rgb+pose. However, for rgb+pose im getting the same accuracy as yours, but for the other two modes, the accuracies are nowhere near. Any reason for that? Attached the screenshot. acc

    opened by suryateja0101 0
  • Query regarding testing

    Query regarding testing

    How can I test a single video? Can you please let me know the process of it.

    Thank you for your response. I'm going forward in this project just because you are replying :-)

    opened by suryateja0101 0
  • Performance degradation of rgb+pose

    Performance degradation of rgb+pose

    Dear @mks0601 , thank you for your awesome work. Following your instructions, I just want to reproduce action recognition on the dataset NTU-RGBD. Sampling per action videos with single times, accuracies are listed as below: rgb_only, Top-1: 86.1009, Top-5: 98.3245 ntu_result-29-0 pose_only, Top-1: 78.9649, Top-5: 96.4252 ntu_result-29-0 rgb+pose, Top-1: 44.8107, Top-5: 77.6964 ntu_result-29-0

    As you can see, the accuracy of rgb+pose decrease a lot. I want to figure out what' wrong. A noteworthy detail, the snapshot model sizes of both rgb_only and pose_only are 196.5MB, while rgb+pose is 106.9MB. Please any suggestions, let me know. Thanks agian Best wishes.

    opened by gftww 2
  • Too large to download Kinetics-50

    Too large to download Kinetics-50

    Dear @mks0601, Kinetics-50 in google pan is too large ,which makes it difficult to download . Would you mind dividing the Kinetics-50 dataset into smaller pieces,and then sharing us with it in google pan ? Thank you very much !

    opened by HuangZuShu 0
Owner
Gyeongsik Moon
Postdoc in CVLAB, SNU, Korea
Gyeongsik Moon
PyTorch implementation of the Deep SLDA method from our CVPRW-2020 paper "Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis"

Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis This is a PyTorch implementation of the Deep Streaming Linear Discriminant

Tyler Hayes 41 Dec 25, 2022
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set —— PyTorch implementation This is an unofficial offici

Sicheng Xu 833 Dec 28, 2022
Code for Dual Contrastive Learning for Unsupervised Image-to-Image Translation, NTIRE, CVPRW 2021.

arXiv Dual Contrastive Learning Adversarial Generative Networks (DCLGAN) We provide our PyTorch implementation of DCLGAN, which is a simple yet powerf

null 119 Dec 4, 2022
[CVPRW 2021] Code for Region-Adaptive Deformable Network for Image Quality Assessment

RADN [CVPRW 2021] Code for Region-Adaptive Deformable Network for Image Quality Assessment [Paper on arXiv] Overview Update [2021/5/7] add codes for W

IIGROUP 53 Dec 28, 2022
CVPRW 2021: How to calibrate your event camera

E2Calib: How to Calibrate Your Event Camera This repository contains code that implements video reconstruction from event data for calibration as desc

Robotics and Perception Group 104 Nov 16, 2022
PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution' (CVPRW 2017)

About PyTorch 1.2.0 Now the master branch supports PyTorch 1.2.0 by default. Due to the serious version problem (especially torch.utils.data.dataloade

Sanghyun Son 2.1k Jan 1, 2023
[CVPRW 21] "BNN - BN = ? Training Binary Neural Networks without Batch Normalization", Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang

BNN - BN = ? Training Binary Neural Networks without Batch Normalization Codes for this paper BNN - BN = ? Training Binary Neural Networks without Bat

VITA 40 Dec 30, 2022
[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Attention Helps CNN See Better: Hybrid Image Quality Assessment Network [CVPRW 2022] Code for Hybrid Image Quality Assessment Network [paper] [code] T

IIGROUP 49 Dec 11, 2022
"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (CVPRW 2022) Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Z

Yuanhao Cai 274 Jan 5, 2023
Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

Jie Liu 111 Dec 31, 2022
Official PyTorch implementation of RobustNet (CVPR 2021 Oral)

RobustNet (CVPR 2021 Oral): Official Project Webpage Codes and pretrained models will be released soon. This repository provides the official PyTorch

Sungha Choi 173 Dec 21, 2022
Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, arXiv 2021

Hypercorrelation Squeeze for Few-Shot Segmentation This is the implementation of the paper "Hypercorrelation Squeeze for Few-Shot Segmentation" by Juh

Juhong Min 165 Dec 28, 2022
Official pytorch implementation of Rainbow Memory (CVPR 2021)

Rainbow Memory: Continual Learning with a Memory of Diverse Samples

Clova AI Research 91 Dec 17, 2022
Official Pytorch Implementation of: "ImageNet-21K Pretraining for the Masses"(2021) paper

ImageNet-21K Pretraining for the Masses Paper | Pretrained models Official PyTorch Implementation Tal Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelni

null 574 Jan 2, 2023
Official Pytorch implementation of "Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video", CVPR 2021

TCMR: Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video Qualtitative result Paper teaser video Introduction This r

Hongsuk Choi 215 Jan 6, 2023
Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic

NAVER/LINE Vision 30 Dec 6, 2022
Official PyTorch Implementation of SSMix (Findings of ACL 2021)

SSMix: Saliency-based Span Mixup for Text Classification (Findings of ACL 2021) Official PyTorch Implementation of SSMix | Paper Abstract Data augment

Clova AI Research 52 Dec 27, 2022
Official Pytorch Implementation of: "Semantic Diversity Learning for Zero-Shot Multi-label Classification"(2021) paper

Semantic Diversity Learning for Zero-Shot Multi-label Classification Paper Official PyTorch Implementation Avi Ben-Cohen, Nadav Zamir, Emanuel Ben Bar

null 28 Aug 29, 2022
Official PyTorch Implementation of Embedding Transfer with Label Relaxation for Improved Metric Learning, CVPR 2021

Embedding Transfer with Label Relaxation for Improved Metric Learning Official PyTorch implementation of CVPR 2021 paper Embedding Transfer with Label

Sungyeon Kim 37 Dec 6, 2022