Group Activity Recognition with Clustered Spatial Temporal Transformer

Overview

GroupFormer

Group Activity Recognition with Clustered Spatial-Temporal Transformer

Backbone         Style     Action Acc   Activity Acc   Config   Download
Inv3+flow+pose   pytorch   0.847        0.957          config   model | test_log

The Volleyball dataset is here.

Keypoints predicted by AlphaPose are here.

The flow data is too large to upload; it can be generated with FlowNet, as described in our paper.

Train Model

./dist_train.sh $GPU_NUM $CONFIG

Test Model

./dist_test.sh $GPU_NUM $CONFIG $CHECKPOINT

A preliminary version has been released, containing the core modules described in the paper.

Any suggestions are welcome. We are glad to optimize our code and provide more details.

Comments
  • RuntimeError: image.is_contiguous()INTERNAL ASSERT FAILED at "C:\\Users\\29006\\Desktop\\RoIAlign\\roi_align\\src\\crop_and_resize_gpu.cpp":27, please report a bug to PyTorch. image must be contiguous

    Hello, I am getting a RuntimeError when using roi-align with torch 1.11.

    Traceback (most recent call last):
      File "c:/Users/29006/Desktop/GAR/scripts/train_volleyball_stage1.py", line 21, in <module>
        train_net(cfg)
      File ".\train_net.py", line 110, in train_net
        return forward_call(*input, **kwargs)
      File ".\base_model.py", line 117, in forward
        boxes_idx_flat)  #B*T*N, D, K, K,
      File "D:\Anaconda\envs\GCN\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "D:\Anaconda\envs\GCN\lib\site-packages\roi_align-0.0.2-py3.7-win-amd64.egg\roi_align\roi_align.py", line 48, in forward
        return CropAndResizeFunction.apply(featuremap, boxes, box_ind, self.crop_height, self.crop_width, self.extrapolation_value)
      File "D:\Anaconda\envs\GCN\lib\site-packages\roi_align-0.0.2-py3.7-win-amd64.egg\roi_align\crop_and_resize.py", line 25, in forward        
        ctx.extrapolation_value, ctx.crop_height, ctx.crop_width, crops)
    RuntimeError: image.is_contiguous()INTERNAL ASSERT FAILED at "C:\\Users\\29006\\Desktop\\RoIAlign\\roi_align\\src\\crop_and_resize_gpu.cpp":27, please report a bug to PyTorch. image must be contiguous
    

    Did you also run into this error when using RoIAlign-pytorch? Which direction should I take to solve it?

    opened by Kev1n3zz 0
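
The assertion message in this report says that the `featuremap` tensor reaching the RoIAlign extension is not contiguous in memory. A minimal sketch of the usual workaround follows. It assumes the tensor became non-contiguous through an operation such as `permute` or slicing before the call. The tensor names and shapes are hypothetical and are not taken from the repository.

```python
import torch

# Hypothetical feature map stored as (B, H, W, C) and permuted to (B, C, H, W).
# permute returns a strided view, so the result is not contiguous.
features = torch.randn(2, 8, 8, 16).permute(0, 3, 1, 2)
print(features.is_contiguous())

# .contiguous() copies the data into a dense row-major buffer.
# The values are unchanged; only the memory layout differs.
features_c = features.contiguous()
print(features_c.is_contiguous())
```

If this is the cause, calling `.contiguous()` on the feature map right before it is passed to the RoIAlign layer should be enough. Whether that is the right place in `base_model.py` is an assumption to verify against the code.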
  • Error while trying to run training


    Hello,

    Thank you very much for the paper and library. I'm trying to reproduce the results using the suggested training script in the readme file.

    I'm getting the following errors:

      File "/home/ec2-user/GroupFormer/main.py", line 56, in <module>
        main()
      File "/home/ec2-user/GroupFormer/main.py", line 53, in main
        group_helper.train()
      File "/home/ec2-user/GroupFormer/group/group.py", line 239, in train
        self.train_epoch()
      File "/home/ec2-user/GroupFormer/group/group.py", line 259, in train_epoch
        actions_loss, activities_loss, aux_loss, loss = self.forward(batch)
      File "/home/ec2-user/GroupFormer/group/group.py", line 200, in forward
        actions, activities, aux_loss = self.model(batch[0], batch[1], batch[4], batch[5])
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ec2-user/GroupFormer/group/utils/distributed_utils.py", line 18, in forward
        return self.module(*inputs, **kwargs)
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ec2-user/GroupFormer/group/models/__init__.py", line 149, in forward
        actions_scores1, activities_scores1, aux_loss1 = self.head(boxes_features, global_token)
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ec2-user/GroupFormer/group/models/head/st_plus_tr_cross_cluster.py", line 208, in forward
        group = self.group_tr(group_query,x).reshape(1,B*T,-1)
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ec2-user/GroupFormer/group/models/transformer.py", line 104, in forward
        output = layer(output, memory, tgt_mask=tgt_mask,
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ec2-user/GroupFormer/group/models/transformer.py", line 243, in forward
        return self.forward_pre(tgt, memory, memory_mask, memory_key_padding_mask, pos)
      File "/home/ec2-user/GroupFormer/group/models/transformer.py", line 227, in forward_pre
        tgt2 = self.self_attn(q, k, value=src, attn_mask=src_mask,
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/modules/activation.py", line 1153, in forward
        attn_output, attn_output_weights = F.multi_head_attention_forward(
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/functional.py", line 5030, in multi_head_attention_forward
        is_batched = _mha_shape_check(query, key, value, key_padding_mask, attn_mask, num_heads)
      File "/home/ec2-user/.local/lib/python3.9/site-packages/torch/nn/functional.py", line 4874, in _mha_shape_check
        assert key.dim() == 3 and value.dim() == 3, \
    AssertionError: For batched (3-D) "query", expected "key" and "value" to be 3-D but found 4-D and 4-D tensors respectively

    Any idea why this happens?

    Thanks in advance!

    opened by TomRaz 0
  • Results are quite different from the results you provided. Where is ST_PLUS_TR_cross_cluster_v2?

    Hello, the test log shows that the type of head and pose_head is ST_PLUS_TR_cross_cluster_v2, but there is no ST_PLUS_TR_cross_cluster_v2 module in the code. Have you made any changes based on ST_PLUS_TR_cross_cluster? Using the checkpoint you provided, I only got an Action Accuracy of 83.988% and an Activity Accuracy of 94.914%, which are quite different from the results you reported.

    opened by swjtu-ljc 2
  • Raising key error


      File "/content/GroupFormer/group/utils/distributed_utils.py", line 73, in dist_init_pytorch
        rank = int(os.environ['RANK'])
      File "/usr/lib/python3.7/os.py", line 681, in __getitem__
        raise KeyError(key) from None
    KeyError: 'RANK'

    opened by LNSHRIVAS 1
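
The `KeyError` in this report comes from reading `os.environ['RANK']`, a variable that is normally set by a distributed launcher and is absent when the script is started directly. A minimal sketch, assuming a single-process run is acceptable, falls back to defaults when the variables are missing. `read_dist_env` is a hypothetical helper, not part of the repository, and reading `WORLD_SIZE` the same way is also an assumption.

```python
import os

def read_dist_env(environ=os.environ):
    """Return (rank, world_size), defaulting to a single process."""
    # .get avoids the KeyError raised by environ['RANK'] when it is unset.
    rank = int(environ.get("RANK", 0))
    world_size = int(environ.get("WORLD_SIZE", 1))
    return rank, world_size

print(read_dist_env({}))                                  # (0, 1)
print(read_dist_env({"RANK": "3", "WORLD_SIZE": "8"}))    # (3, 8)
```

The other option is to start training through `./dist_train.sh $GPU_NUM $CONFIG`, assuming that script invokes a launcher that sets these variables.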
  • The configuration and pre-trained model for Collective Activity Dataset


    Thank you for the great work! I tried to reproduce the results on the Collective Activity Dataset but was only able to reach about 88% accuracy. If possible, could you provide the configuration file and pre-trained model for the Collective Activity Dataset?

    opened by tamtamz 1
  • cannot unzip the checkpoint file


    Hi, thanks for your work and for sharing your code!

    When I tried to test your uploaded model and extract checkpoint.pth.tar, I ran into the following error:

    $ tar -xvf checkpoint.pth.tar
    tar: This does not look like a tar archive
    tar: Skipping to next header
    tar: Exiting with failure status due to previous errors
    

    I wonder whether you get the same error, and I hope you can give me some help. I know this tar file is about 500 MB; could I trouble you to upload the extracted checkpoint.pth file to Google Drive directly? Thank you in advance.

    Regards

    opened by 0shelter0 2
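
The `.pth.tar` suffix is a common naming convention for PyTorch checkpoints and does not by itself mean the file is a tar archive. Files written by `torch.save` are normally read with `torch.load` rather than extracted. A minimal sketch, using only the standard library, checks which container format a file actually has. `describe_container` is a hypothetical helper, and the file below is a stand-in created for the example, not the released checkpoint.

```python
import os
import tarfile
import tempfile
import zipfile

def describe_container(path):
    """Report whether a file is a tar archive, a zip archive, or neither."""
    if tarfile.is_tarfile(path):
        return "tar"
    if zipfile.is_zipfile(path):
        return "zip"
    return "other"

# Stand-in file: a zip archive saved under a misleading .pth.tar name.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "checkpoint.pth.tar")
    with zipfile.ZipFile(fake, "w") as zf:
        zf.writestr("data.pkl", b"placeholder")
    kind = describe_container(fake)
    print(kind)  # zip
```

If the released file is not reported as "tar", passing its path to `torch.load(path, map_location="cpu")` is the usual way to read it. Whether that applies to this particular checkpoint is an assumption to verify.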