IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos
Introduction
This repo is the official PyTorch implementation of IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos (CVPRW 2021).
Directory
Root
The ${ROOT} is described as below.
${ROOT}
|-- data
|-- common
|-- main
|-- tool
|-- output
- data contains data loading code and soft links to the images and annotations directories.
- common contains core code for IntegralAction.
- main contains high-level code for training and testing the network.
- tool contains a script that merges the models of the rgb_only and pose_only stages.
- output contains logs, trained models, visualized outputs, and test results.
Data
You need to follow the directory structure of the data folder as below.
${ROOT}
|-- data
| |-- Kinetics
| | |-- data
| | | |-- frames
| | | |-- kinetics-skeleton
| | | |-- Kinetics50_train.json
| | | |-- Kinetics50_val.json
| | | |-- Kinetics400_train.json
| | | |-- Kinetics400_val.json
| |-- Mimetics
| | |-- data
| | | |-- frames
| | | |-- pose_results
| | | |-- Mimetics50.json
| | | |-- Mimetics400.json
| |-- NTU
| | |-- data
| | | |-- frames
| | | |-- nturgb+d_skeletons
| | | |-- NTU_train.json
| | | |-- NTU_test.json
- Download Kinetics parsed data [data] [website]
- Download Mimetics parsed data [data] [website]
- Download NTU parsed data [data] [website]
- All annotation files follow MS COCO format.
- If you want to add your own dataset, you have to convert it to MS COCO format (see the sketch after this list).
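Below is a minimal, hypothetical sketch of writing an MS COCO-style annotation file for a custom dataset. The top-level images/annotations/categories layout follows the standard COCO convention; the per-entry fields (e.g. frame_dir, frame_num) are placeholders, so check the provided Kinetics50_train.json for the exact keys this repo expects.
import json

# Standard MS COCO top-level layout: images, annotations, categories.
coco = {'images': [], 'annotations': [], 'categories': []}

# One category per action class.
coco['categories'].append({'id': 0, 'name': 'jumping'})

# One image entry per video clip (field names here are hypothetical placeholders).
coco['images'].append({'id': 0, 'frame_dir': 'frames/my_video_0000', 'frame_num': 300})

# One annotation entry linking a clip to its action label.
coco['annotations'].append({'id': 0, 'image_id': 0, 'category_id': 0})

with open('MyDataset_train.json', 'w') as f:
    json.dump(coco, f)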
To download multiple files from Google Drive without compressing them, try this. If you hit the 'Download limit' problem when downloading the dataset from the Google Drive link, try this trick:
* Go to the shared folder that contains the files you want to copy to your drive.
* Select all the files you want to copy.
* In the upper right corner, click the three vertical dots and select "Make a copy".
* The files are then copied to your personal Google Drive account, and you can download them from there.
Output
You need to follow the directory structure of the output folder as below.
${ROOT}
|-- output
| |-- log
| |-- model_dump
| |-- result
| |-- vis
- Creating the output folder as a soft link instead of a regular folder is recommended, because it can take up a large amount of storage (see the sketch below).
- The log folder contains training log files.
- The model_dump folder contains saved checkpoints for each epoch.
- The result folder contains the final estimation files generated in the testing stage.
- The vis folder contains visualized results.
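As a convenience, here is a minimal sketch of creating the output folder as a soft link to a larger drive. The target path is a placeholder, so point it at wherever you have space.
import os

# Placeholder path on a drive with enough space -- adjust to your setup.
target = '/path/to/large/storage/IntegralAction_output'
for sub in ('log', 'model_dump', 'result', 'vis'):
    os.makedirs(os.path.join(target, sub), exist_ok=True)

# Run this from ${ROOT} so that ./output points at the storage location.
os.symlink(target, 'output')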
Running IntegralAction
Start
- Install PyTorch and Python >= 3.7.3, then run sh requirements.sh.
- In main/config.py, you can change settings of the model, including the dataset to use, the network backbone, the input size, and so on (an illustrative sketch follows this list).
- There are three stages: 1) rgb_only, 2) pose_only, and 3) rgb+pose. In the rgb_only stage, only the RGB stream is trained, and in the pose_only stage, only the pose stream is trained. Finally, the rgb+pose stage initializes its weights from the previous two stages and continues training with the pose-driven integration.
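For illustration only, the kind of settings edited in main/config.py might look like the sketch below. The attribute names here are assumptions, so check main/config.py for the actual variables it defines.
# Hypothetical sketch of config-style settings -- the real names live in main/config.py.
class Config:
    dataset = 'Kinetics50'        # dataset to train/test on
    backbone = 'resnet50'         # network backbone (assumed name)
    input_img_shape = (224, 224)  # spatial input size (assumed name)
    frame_per_seg = 8             # frames sampled per clip (assumed name)

cfg = Config()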
Train
rgb_only stage
1. In the main folder, run
python train.py --gpu 0-3 --mode rgb_only
to train IntegralAction in the rgb_only stage on GPUs 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3. Then, back up the trained weights by running
mkdir ../output/model_dump/rgb_only
mv ../output/model_dump/snapshot_*.pth.tar ../output/model_dump/rgb_only/.
pose_only stage
2. In the main folder, run
python train.py --gpu 0-3 --mode pose_only
to train IntegralAction in the pose_only stage on GPUs 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3. Then, back up the trained weights by running
mkdir ../output/model_dump/pose_only
mv ../output/model_dump/snapshot_*.pth.tar ../output/model_dump/pose_only/.
rgb+pose stage
3. In the tool folder, run the following to merge the weights of the rgb_only and pose_only stages (a conceptual sketch of the merge follows the commands):
cp ../output/model_dump/rgb_only/snapshot_29.pth.tar snapshot_29_rgb_only.pth.tar
cp ../output/model_dump/pose_only/snapshot_29.pth.tar snapshot_29_pose_only.pth.tar
python merge_rgb_only_pose_only.py
mv snapshot_0.pth.tar ../output/model_dump/.
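Conceptually, the merge step combines the two stage checkpoints into a single snapshot_0.pth.tar that the rgb+pose stage resumes from. merge_rgb_only_pose_only.py is the authoritative implementation; the sketch below only illustrates the idea, and the checkpoint key names ('network', 'epoch') are assumptions.
import torch

# Load the two stage checkpoints copied into the tool folder.
rgb = torch.load('snapshot_29_rgb_only.pth.tar', map_location='cpu')
pose = torch.load('snapshot_29_pose_only.pth.tar', map_location='cpu')

# Combine the weights of the RGB and pose streams (key names are assumptions).
merged = {}
merged.update(rgb['network'])
merged.update(pose['network'])

# Save as epoch 0 so the rgb+pose stage can continue training from it.
torch.save({'epoch': 0, 'network': merged}, 'snapshot_0.pth.tar')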
Then, in the main folder, run
python train.py --gpu 0-3 --mode rgb+pose --continue
to train IntegralAction in the rgb+pose stage on GPUs 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3.
Test
Place the trained model at output/model_dump/. Choose the stage you want to test from [rgb_only, pose_only, rgb+pose].
In the main folder, run
python test.py --gpu 0-3 --mode $STAGE --test_epoch 29
to test IntegralAction in the $STAGE stage (one of [rgb_only, pose_only, rgb+pose]) on GPUs 0,1,2,3 with the 29th-epoch trained model. --gpu 0,1,2,3 can be used instead of --gpu 0-3.
Results
Here I report the performance of IntegralAction.
Kinetics50
- Download IntegralAction trained on [Kinetics50].
- Kinetics50 is a subset of Kinetics400. It mainly contains videos with human motion-related action classes, sampled from Kinetics400.
(base) mks0601:~/workspace/IntegralAction/main$ python test.py --gpu 5-6 --mode rgb+pose --test_epoch 29
>>> Using GPU: 5,6
04-15 11:48:25 Creating dataset...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
04-15 11:48:25 Load checkpoint from ../output/model_dump/snapshot_29.pth.tar
04-15 11:48:25 Creating graph...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 773/773 [03:09<00:00, 5.11it/s]
Evaluation start...
Top-1 accuracy: 72.2087
Top-5 accuracy: 92.2735
Result is saved at: ../output/result/kinetics_result.json
Mimetics
- Download IntegralAction trained on [Kinetics50].
- Kinetics50 is a subset of Kinetics400. It mainly contains videos with human motion-related action classes, sampled from Kinetics400.
- Note that Mimetics is used only for the testing purpose.
(base) mks0601:~/workspace/IntegralAction/main$ python test.py --gpu 5-6 --mode rgb+pose --test_epoch 29
>>> Using GPU: 5,6
04-15 11:52:20 Creating dataset...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
04-15 11:52:20 Load checkpoint from ../output/model_dump/snapshot_29.pth.tar
04-15 11:52:20 Creating graph...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 187/187 [02:14<00:00, 4.93it/s]
Evaluation start...
Top-1 accuracy: 26.5101
Top-5 accuracy: 50.5034
Result is saved at: ../output/result/mimetics_result.json
Reference
@InProceedings{moon2021integralaction,
title={IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos},
author={Moon, Gyeongsik and Kwon, Heeseung and Lee, Kyoung Mu and Cho, Minsu},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW)},
year={2021}
}