Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

Last update: Dec 23, 2022

Related tags

Deep Learning ASFormer

Overview

ASFormer: Transformer for Action Segmentation

This repo provides training and inference code for the BMVC 2021 paper: ASFormer: Transformer for Action Segmentation.

Environment

PyTorch == 1.1.0, torchvision == 0.3.0, Python == 3.6, CUDA == 10.1

Reproduce our results

1. Download the dataset data.zip from (https://mega.nz/#!O6wXlSTS!wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8) or (https://zenodo.org/record/3625992#.Xiv9jGhKhPY).
2. Unzip data.zip into the current folder. The ./data folder contains three datasets: ./data/breakfast, ./data/50salads, and ./data/gtea.
3. Download the pre-trained models from (https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). There are pre-trained models for the three datasets: ./models/50salads, ./models/breakfast, and ./models/gtea.
4. Run python main.py --action=predict --dataset=50salads/gtea/breakfast --split=1/2/3/4/5 to generate predictions for each split.
5. Run python eval.py --dataset=50salads/gtea/breakfast --split=0/1/2/3/4/5 to evaluate the performance. **NOTE**: split=0 evaluates the average result over all splits, so run it only after you have generated predictions for every split.
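
Steps 4 and 5 can be chained with a small driver script. The following is a minimal sketch, not part of the repo: the helper names `build_commands` and `run_all` are hypothetical, and the list of splits is an assumption you should adjust to the splits actually present for your dataset. Only the `--action`, `--dataset`, and `--split` flags come from the commands above.

```python
import subprocess


def build_commands(dataset, splits, action="predict"):
    """Return one argv list per split; for prediction, append the averaged evaluation."""
    commands = [
        ["python", "main.py", f"--action={action}",
         f"--dataset={dataset}", f"--split={split}"]
        for split in splits
    ]
    if action == "predict":
        # split=0 averages over all splits, so it has to run after every prediction.
        commands.append(["python", "eval.py", f"--dataset={dataset}", "--split=0"])
    return commands


def run_all(dataset, splits, action="predict"):
    """Run the commands in order and stop at the first failing step."""
    for argv in build_commands(dataset, splits, action):
        subprocess.run(argv, check=True)


# Example call (the split list is an assumption):
# run_all("50salads", splits=[1, 2, 3, 4, 5])
```

Passing `action="train"` produces only the per-split training commands, without the final evaluation step.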

Train your own model

You can also retrain the model yourself with the following command.

python main.py --action=train --dataset=50salads/gtea/breakfast --split=1/2/3/4/5

The training process was very stable in our experiments. It converges quickly and is not sensitive to the number of training epochs.

Demo for using ASFormer as your backbone

In our paper, we replace the original TCN-based backbone model MS-TCN in ASRF with our ASFormer. The new model achieves even higher results on the 50salads dataset than the original ASRF. Code is Here.

If you find our repo useful, please give us a star and cite

@inproceedings{chinayi_ASformer,  
	author={Fangqiu Yi and Hongyu Wen and Tingting Jiang}, 
	booktitle={The British Machine Vision Conference (BMVC)},   
	title={ASFormer: Transformer for Action Segmentation},
	year={2021},  
}

Feel free to raise an issue if you have trouble with our code.

Comments

Batch size constraint

Hello,

Thank you for your amazing work !

I was wondering if there is any particular reason for imposing a batch size of 1 in model.py: https://github.com/ChinaYi/ASFormer/blob/89e72d840a3d3eb8f16c270adb5031d648d9fb73/model.py#L138

In my testing, ASFormer learns fine with bigger batch sizes.

opened by OliverGuy 6
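
Question about the attention implementation

Hello, is the hierarchical attention you describe a band attention (as shown in the figure below), except that the window size grows exponentially with the layer index? If so, shouldn't the body of the for loop in this function in model.py be changed to window_mask[:, i, i:i+self.bl] = 1?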

    def construct_window_mask(self):
        window_mask = torch.zeros((1, self.bl, self.bl + 2 * (self.bl // 2)))
        for i in range(self.bl):
            window_mask[:, :, i:i+self.bl] = 1
        return window_mask.to(device)
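
To make the difference between the two assignments concrete, here is a minimal pure-Python sketch. It uses nested lists instead of torch tensors and drops the batch dimension; both function names are hypothetical. It only illustrates what each indexing pattern fills in and makes no claim about which behaviour the authors intended.

```python
def window_mask_all_rows(bl):
    """Mirror of the quoted loop: every row receives the slice i:i+bl."""
    width = bl + 2 * (bl // 2)
    mask = [[0] * width for _ in range(bl)]
    for i in range(bl):
        for row in mask:
            for j in range(i, min(i + bl, width)):
                row[j] = 1
    return mask


def window_mask_band(bl):
    """Variant proposed in the question: row i receives only the slice i:i+bl."""
    width = bl + 2 * (bl // 2)
    mask = [[0] * width for _ in range(bl)]
    for i in range(bl):
        for j in range(i, min(i + bl, width)):
            mask[i][j] = 1
    return mask


for name, fn in [("all rows", window_mask_all_rows), ("band", window_mask_band)]:
    print(name)
    for row in fn(4):
        print(row)
```

With bl = 4 the quoted loop yields four identical rows, whereas the band variant shifts a window of four ones by one column per row.
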
opened by ddz16 5
Feature Extraction

Hi, can you provide more information about the feature extraction? I would like to use this fantastic model on my own dataset, but I don't know how to extract the features to feed to the encoder.

opened by Camillo4eyes 3
results on salads50 does not match table 5

Hi, thanks for your work. I was able to train and test the model and achieve performance similar to that reported in the paper when I use both the encoder and the decoder. However, when I don't use the decoder, the results are much worse than those in Table 5 (first row). Do I need to change any settings to get the same performance (especially for Acc)? I notice that without the decoder the Acc drops below 80.

opened by seyeeet 3
Error in evaluation code

Hi,

Thanks for sharing the code. I noticed the bg_class in the evaluation code is not properly set.

The default name of the background class is set to background, which is correct for GTEA but needs to be changed to SIL for breakfast and to action_start and action_end for 50salads. It seems they were not changed for the results in the paper.

With the correct class name and the released model, I obtained a lower result:

| | F1@10 | F1@25 | F1@50 |
|-|-------|-------|-------|
| Breakfast | 70.9 | 67.5 | 56.7 |
| 50salads | 83.7 | 81.8 | 73.7 |
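
The effect of the background class on segment-level metrics can be illustrated with a short sketch. The mapping below simply restates the class names reported in this issue and has not been verified against the datasets; `BG_CLASSES` and `get_segments` are hypothetical names, not the repo's evaluation code.

```python
# Background labels per dataset, as reported in the issue above (an assumption here).
BG_CLASSES = {
    "gtea": ["background"],
    "breakfast": ["SIL"],
    "50salads": ["action_start", "action_end"],
}


def get_segments(frame_labels, bg_class):
    """Collapse frame-wise labels into (label, start, end) segments, skipping background."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # A segment ends at the end of the video or where the label changes.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            if frame_labels[start] not in bg_class:
                segments.append((frame_labels[start], start, i))
            start = i
    return segments


labels = ["SIL", "SIL", "pour", "pour", "SIL"]
print(get_segments(labels, BG_CLASSES["breakfast"]))  # background segments dropped
print(get_segments(labels, ["background"]))           # SIL counted as an action
```

If the wrong background name is passed, background segments are scored like ordinary actions, which changes segment-level scores such as F1 and edit.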

opened by ZijiaLewisLu 2
Increase the batchsize and the result is hurt

Hi, thank you for your work. When I try to increase the batch size, the metrics drop a lot. What do you think are the possible reasons?

The GPU is an A100 40G.

Training with the default settings, split 1 only:

(s1)[83.40807175 81.16591928 72.19730942] 75.934108 83.2241

Changing only the batch size to 8 and the lr to 0.001, then training:

(s1)[68.94977169 67.57990868 55.25114155] 63.931922 72.0049

opened by wlsh1up 2
Cannot download the model

Hi! Firstly, thanks for sharing this repo! I'm struggling to download the model (3. Download the pre-trained models at (https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg)). The site says that you need to create an account to download the file, and I cannot create an account with a French phone number 😅 Is there any other way to download the pre-trained model? Many thanks!

opened by madmoiselleve 2
Long training time

Hello,

I am adapting your code for my own dataset, which usually trains relatively fast when using only ASRF, but with your transformer model it takes approximately 10 times longer. Do you see similar behaviour with the 50salads/breakfast/gtea datasets?

Thank you :)

opened by Jaakik 2
Environment issues

I installed the environment as you asked: Pytorch == 1.1.0, torchvision == 0.3.0, python == 3.6, CUDA=10.1

It is certain that the model is loaded because the model size is printed: Model Size: 1130860

But the problem is:

    Traceback (most recent call last):
      File "main.py", line 99, in <module>
        trainer.predict(model_dir, results_dir, features_path, batch_gen_tst, num_epochs, actions_dict, sample_rate)
      File "/home/cpslabrtx3090/zjb/projects/ASFormer/model.py", line 399, in predict
        self.model.load_state_dict(torch.load(model_dir + "/epoch-" + str(epoch) + ".model"))
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
        return _load(f, map_location, pickle_module, **pickle_load_args)
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 560, in _load
        raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
    RuntimeError: ./models/gtea/split_1/epoch-120.model is a zip archive (did you mean to use torch.jit.load()?)

    Traceback (most recent call last):
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 189, in nti
        n = int(s.strip() or "0", 8)
    ValueError: invalid literal for int() with base 8: 'ld_tenso'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 2299, in next
        tarinfo = self.tarinfo.fromtarfile(self)
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1093, in fromtarfile
        obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1035, in frombuf
        chksum = nti(buf[148:156])
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 191, in nti
        raise InvalidHeaderError("invalid header")
    tarfile.InvalidHeaderError: invalid header

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 556, in _load
        return legacy_load(f)
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 467, in legacy_load
        with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1591, in open
        return func(name, filemode, fileobj, **kwargs)
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1621, in taropen
        return cls(name, mode, fileobj, **kwargs)
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1484, in __init__
        self.firstmember = self.next()
      File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 2311, in next
        raise ReadError(str(e))
    tarfile.ReadError: invalid header
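
A likely explanation for the "is a zip archive" message is a serialization mismatch: starting with PyTorch 1.6, torch.save writes a zip-based file format that older releases such as 1.1.0 cannot read. The sketch below uses only the standard library to check which format a checkpoint file is in; the helper name `is_zip_checkpoint` is hypothetical, and the diagnosis itself is an assumption about this particular checkpoint.

```python
import zipfile


def is_zip_checkpoint(path):
    """Return True if the file is a zip archive, i.e. probably saved by PyTorch >= 1.6."""
    return zipfile.is_zipfile(path)


# Example call (path taken from the error message above):
# print(is_zip_checkpoint("./models/gtea/split_1/epoch-120.model"))
```

If the check returns True, two options are to load the model with a newer PyTorch, or to re-save it from a newer PyTorch with `torch.save(state_dict, path, _use_new_zipfile_serialization=False)` so that an old installation can read it.
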

opened by wolf-bailang 1
about the randomness of code

Hi, thank you for your code. https://github.com/ChinaYi/ASFormer/blob/3940443a7dca336f879c43a42bcf91c5bf7c790f/model.py?_pjax=%23js-repo-pjax-container%2C%20div%5Bitemtype%3D%22http%3A%2F%2Fschema.org%2FSoftwareSourceCode%22%5D%20main%2C%20%5Bdata-pjax-container%5D#L370 When I change the test interval from 10 to 20, 30, etc., I get different results (such as the training loss) under the same seed. What do you think is the reason? Best regards
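
One possible explanation, offered as an assumption rather than a confirmed diagnosis of this repo: if the evaluation step draws from the same global random number generator as training (for example when shuffling a batch generator), then how often evaluation runs changes the random numbers that training sees afterwards. The sketch below reproduces that effect with Python's `random` module; `train_draws` is a hypothetical stand-in for a training loop, not the repo's code.

```python
import random


def train_draws(num_epochs, test_interval, isolate_eval):
    """Return the random numbers 'training' sees when evaluation also uses the RNG."""
    random.seed(0)
    draws = []
    for epoch in range(1, num_epochs + 1):
        draws.append(random.random())  # stands in for shuffling, dropout, etc.
        if epoch % test_interval == 0:
            state = random.getstate()
            random.random()  # stands in for an evaluation step that consumes RNG
            if isolate_eval:
                random.setstate(state)  # undo the evaluation's effect on the RNG
    return draws


print(train_draws(6, 2, False) == train_draws(6, 3, False))  # intervals differ
print(train_draws(6, 2, True) == train_draws(6, 3, True))    # state restored
```

Saving and restoring the generator state around evaluation makes the training draws independent of the test interval. In a PyTorch setting the analogous calls would be `torch.get_rng_state` and `torch.set_rng_state`, plus the CUDA and NumPy equivalents where relevant.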

opened by wlsh1up 1
The provided models generate lower scores than the paper reported

Thanks for your nice work. May I confirm one thing? Using your features and pre-trained models (epoch=120), the scores I obtain are lower than those in your BMVC paper for all three datasets. For instance, the edit score and F1@10 on gtea only reach 84.0 and 88.9, which are lower than the 84.6 and 90.1 in your paper. The same holds for the other two datasets, e.g. 50salads: edit=75.7, F1@10=83.4.

opened by medical-girl 3

This repo is for Self-Supervised Monocular Depth Estimation with Internal Feature Fusion(arXiv), BMVC2021

DIFFNet This repo is for Self-Supervised Monocular Depth Estimation with Internal Feature Fusion(arXiv), BMVC2021 A new backbone for self-supervised d

3 Oct 22, 2021

Official implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21).

ACTION-Net Official implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21). Getting Started EgoGesture data folder struct

171 Dec 26, 2022

Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

67 Jan 3, 2023

Official PyTorch Implementation of Mask-aware IoU and maYOLACT Detector [BMVC2021]

The official implementation of Mask-aware IoU and maYOLACT detector. Our implementation is based on mmdetection. Mask-aware IoU for Anchor Assignment

11 Oct 21, 2021

[BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations"

DomainMix [BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations" [paper] [de

17 Dec 20, 2022

nnFormer: Interleaved Transformer for Volumetric Segmentation Code for paper "nnFormer: Interleaved Transformer for Volumetric Segmentation "

nnFormer: Interleaved Transformer for Volumetric Segmentation Code for paper "nnFormer: Interleaved Transformer for Volumetric Segmentation ". Please

610 Dec 28, 2022

Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

ACTOR Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021. Please visit our we

248 Dec 23, 2022

This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).

TransBTS: Multimodal Brain Tumor Segmentation Using Transformer This repo is the official implementation for TransBTS: Multimodal Brain Tumor Segmenta

247 Dec 28, 2022

Allows including an action inside another action (by preprocessing the Yaml file). This is how composite actions should have worked.

actions-includes Allows including an action inside another action (by preprocessing the Yaml file). Instead of using uses or run in your action step,

70 Nov 4, 2022

Human Action Controller - A human action controller running on different platforms.

Human Action Controller (HAC) Goal A human action controller running on different platforms. Fun Easy-to-use Accurate Anywhere Fun Examples Mouse Cont

27 Jul 20, 2022

The pytorch implementation of SOKD (BMVC2021).

Semi-Online Knowledge Distillation Implementations of SOKD. Requirements This repo was tested with Python 3.8, PyTorch 1.5.1, torchvision 0.6.1, CUDA

4 Dec 19, 2021

Code for BMVC2021 "MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation"

MOS-Multi-Task-Face-Detect Introduction This repo is the official implementation of "MOS: A Low Latency and Lightweight Framework for Face Detection,

104 Dec 8, 2022

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

32 Sep 21, 2022

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

1.4k Dec 30, 2022
