Official code for 'Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning' [ICCV 2021]

Yu Tian

Last update: Jan 8, 2023

Overview

RTFM

This repo contains the PyTorch implementation of our paper:

Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning

Yu Tian, Guansong Pang, Yuanhong Chen, Rajvinder Singh, Johan W. Verjans, Gustavo Carneiro.

Accepted at ICCV 2021.
SOTA on 4 benchmarks. Check out Papers With Code for Video Anomaly Detection.

Training

Setup

Please download the extracted I3D features for the ShanghaiTech and UCF-Crime datasets from the links below:

ShanghaiTech train I3D features on OneDrive

ShanghaiTech test I3D features on OneDrive

ShanghaiTech train I3D features on Google Drive

ShanghaiTech test I3D features on Google Drive

Checkpoint for ShanghaiTech

Extracted I3D features for the UCF-Crime dataset:

UCF-Crime train I3D features on OneDrive

UCF-Crime test I3D features on OneDrive

UCF-Crime train I3D features on Google Drive

UCF-Crime test I3D features on Google Drive

Checkpoint for UCF-Crime

The features above were extracted with the ResNet-50 I3D model from this repo.

Following previous work, we also apply 10-crop augmentation.

The following files need to be adapted in order to run the code on your own machine:

Change the file paths in list/shanghai-i3d-test-10crop.list and list/shanghai-i3d-train-10crop.list to point to the datasets downloaded above.
Feel free to change the hyperparameters in option.py.
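
As a rough illustration of the path change described above, the sketch below rewrites the directory part of each entry while keeping the file name. It assumes the list files hold one feature path per line; the helper name `rewrite_list_paths`, the example entries, and the target directory are all hypothetical and not taken from the repo.

```python
import posixpath


def rewrite_list_paths(lines, new_root):
    """Point every list entry at new_root while keeping its file name."""
    rewritten = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        rewritten.append(posixpath.join(new_root, posixpath.basename(entry)))
    return rewritten


# Hypothetical entries; the real list files ship with the author's own paths.
old_entries = [
    "/old/root/train_features/01_001_i3d.npy\n",
    "/old/root/train_features/01_002_i3d.npy\n",
]
new_entries = rewrite_list_paths(old_entries, "/data/shanghaitech/train")
print("\n".join(new_entries))
```

In practice the lines would be read from and written back to the .list file; the in-memory version is shown only to keep the example self-contained.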

Train and test the RTFM

After the setup, simply run the following command:

python main.py

Citation

If you find this repo useful for your research, please consider citing our paper:

@inproceedings{tian2021weakly,
  title={Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning},
  author={Tian, Yu and Pang, Guansong and Chen, Yuanhong and Singh, Rajvinder and Verjans, Johan W and Carneiro, Gustavo},
  booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
  year={2021}
}

Comments

How to extract the I3D-10crop feature?
Hello, Yu Tian. I have two questions and hope you can respond:

Can you tell me why the I3D 10-crop features of XD-Violence are not provided in this repo?

Could you please show me the I3D 10-crop feature extraction code? My extracted features have only 2 dimensions, not 3 like yours. Thank you!
opened by DungVo1507 7
Process followed for generating the i3d features

Can you kindly explain the process you followed for generating the i3d features of the shanghai tech dataset so that we can follow the same for other datasets and videos as well?

opened by GowthamGottimukkala 7
Using custom dataset

Hi! Hope you're doing well.

I have been able to run the code successfully on the given features, and wanted to try it on another dataset. I was unable to figure out how to use the repo you had mentioned in other issues to extract the features, hence I used this repo (https://github.com/v-iashin/video_features) to extract the i3d features and then used the pytorch inbuilt tencrop function while loading the dataset.

Ten crop done by:

    transformss = transforms.Compose([
        transforms.ToPILImage(),
        transforms.TenCrop(2048),
        transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),
    ])

Using any number other than 2048 in TenCrop, such as 31, gives an error: "Given groups=1, weight of size [512, 2048, 3], expected input[31, 31, 10] to have 2048 channels, but got 31 channels instead"

Moreover, I keep getting errors in size mismatch while trying to run it. "shape '[1, 2048, 10, 2048]' is invalid for input of size 3276800" from this line "abnormal_features = abnormal_features.view(n_size, ncrops, t, f) " Line 239 in model.py

Additionally, I wanted to ask how exactly the results of the ground truth files are structured since it seems to just be a single binary array?

I would appreciate any help and guidance you could provide on the above. Thanks & Regards

opened by ro1406 6
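
A small arithmetic check can make the size-mismatch message above easier to read. The sketch below is not part of the repo: `fits_view` and `infer_t` are hypothetical helpers, and the values ncrops = 10 and f = 2048 are assumptions taken from the (T, 10, 2048) feature shape mentioned in other comments.

```python
import math


def fits_view(numel, shape):
    """A view is only valid when the element counts agree."""
    return numel == math.prod(shape)


def infer_t(numel, n_size, ncrops, f):
    """Return T if numel == n_size * ncrops * T * f exactly, else None."""
    per_step = n_size * ncrops * f
    if numel % per_step != 0:
        return None
    return numel // per_step


numel = 3276800                   # size reported in the error message
requested = (1, 2048, 10, 2048)   # shape reported in the error message

print(fits_view(numel, requested))   # False
print(math.prod(requested))          # 41943040
print(infer_t(numel, 1, 10, 2048))   # 160
```

The requested shape has 2048 in the position that the quoted line treats as ncrops. This suggests, though it does not prove, that the loaded tensor's axes are not in the order the view call expects.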
The exact changes to make to train on UCF-Crime
Hello! First, congratulations on the excellent paper.

What are the exact changes needed to train your model on UCF-Crime with your code?

From what I noted in the paper and in the various issues:

assign args.batch_size = 32 (When we concatenate we get a batch size of 64. Moreover when I set to 64 (128) the results do not increase)

weight_decay = 0.0005

and in the dataset.py:

if self.is_normal: self.list = self.list [810:] else: self.list = self.list [: 810]

Is that all?

Because I do not achieve the same performance even after these changes. I get a maximum of:

auc: 0.75

pr_auc: 0.18588291392503292
opened by nathanb97 6
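
The list slicing quoted in this comment can be read as follows: the first 810 entries of the training list are treated as abnormal videos and the rest as normal ones. The sketch below only illustrates that reading. The helper name `split_ucf_list` and the placeholder entries are hypothetical, and the ordering of the real list file is an assumption inferred from the quoted snippet.

```python
def split_ucf_list(entries, is_normal, n_abnormal=810):
    """Mirror the quoted slicing: abnormal entries first, normal entries after."""
    return entries[n_abnormal:] if is_normal else entries[:n_abnormal]


# Placeholder entries standing in for feature paths.
entries = [f"abnormal_{i}" for i in range(810)] + [f"normal_{i}" for i in range(5)]

normal_part = split_ucf_list(entries, is_normal=True)
abnormal_part = split_ucf_list(entries, is_normal=False)
print(len(abnormal_part), len(normal_part))
```

If a custom list does not place all abnormal videos first, this slicing would mix the two groups, so the boundary index would need to match the actual list.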
Corrupted trained I3D features on UCF-crime dataset

Wonderful work!

Recently, I found that the pre-extracted I3D features are corrupted in both google-drive and onedrive links. So I would appreciate it if you can upload the correct files.

Thank you and looking forward to your reply.

opened by lts427 5
make_gt_ucf.py ---- Missing Matlab_formate folder

Hello Author,

First, please note that some of the feature files on Google Drive are missing, which causes dimension issues. It would be helpful if you could re-upload all the features.

I also tried feature extraction on the UCF-Crime dataset using this repo: https://github.com/GowthamGottimukkala/I3D_Feature_Extraction_resnet. I extracted the features successfully, but I am stuck at gt-ucf.npy. I am trying to create it with make_gt_ucf.py, but I get an error.

The script sets temporal_root = '/home/yu/PycharmProjects/DeepMIL-master/list/Matlab_formate/', and at line 41 the check "if mat_file in mat_name_list:" never matches because that folder is missing.

Could you help me with that? Do we need any .mat files to create gt_ucf.npy?

Thanks

opened by rabaig 4
Visualize the results

Thanks for your great work, @tianyu0207, I was able to implement your RTFM on my custom data. Please can you describe how to visualize the results from .pkl model files like in your paper?

Thank you so much!

opened by DungVo1507 4
After obtaining the final temporal feature representation X
Thanks for viewing my issue, @tianyu0207. I have 4 questions that I hope you can explain:

After obtaining X, the snippets have been divided into 2 groups normal and abnormal, right?

In the Select Top-k snippets stage, do you select k snippets from both the normal and the abnormal groups, or will each group select k snippets?

Assuming k = 3, if a video has fewer than 3 abnormal or normal snippets, how will RTFM choose?

When the input is normal video, how will the RTFM-enabled Snippet Classifier Learning stage classify?
opened by DungVo1507 4
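
To make the top-k question above concrete, here is a minimal sketch of one reading of the selection step: each video is handled separately, snippets are ranked by the L2 norm (magnitude) of their feature vectors, and the k largest are kept. This is an illustration only, not the repo's code. The function names and toy features are hypothetical, and returning fewer than k indices for short videos is this sketch's choice rather than a statement about how RTFM behaves.

```python
import math


def feature_magnitudes(snippets):
    """L2 norm of each snippet's feature vector."""
    return [math.sqrt(sum(x * x for x in feat)) for feat in snippets]


def top_k_indices(snippets, k):
    """Indices of the k snippets with the largest feature magnitude."""
    mags = feature_magnitudes(snippets)
    order = sorted(range(len(mags)), key=lambda i: mags[i], reverse=True)
    return order[:k]  # yields fewer than k indices when the video is shorter


# Toy 2-dimensional features for one abnormal and one normal video.
abnormal_video = [[0.1, 0.2], [3.0, 4.0], [0.5, 0.5], [1.0, 2.0]]
normal_video = [[0.3, 0.1], [0.2, 0.2], [0.6, 0.8], [0.1, 0.0]]

print(top_k_indices(abnormal_video, 2))  # [1, 3]
print(top_k_indices(normal_video, 2))    # [2, 0]
```

Under this reading, k snippets are taken from each video independently, so an abnormal and a normal video each contribute their own top-k set.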
Request for the data preprocessing code used before I3D

Thank you for your excellent work!

I saw that other people ran into the same problem with data preprocessing for I3D feature extraction.

Can you share the data preprocessing code used before I3D?

Can you specify exactly how you apply TenCrop to an image?

Thanks!

opened by marsplant 4
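Extracting I3D features for the ShanghaiTech dataset
Hello, this is very useful work and has helped my research a great deal! However, I ran into some difficulties while extracting I3D features for the ShanghaiTech dataset: the features I extracted differ slightly from the I3D features you released, so I would like to ask a few questions:

Question 1: Computing the number of clips

Do you split a video's frames into clips of 16 frames each? If so, there are two cases for the last clip:

Case 1: The last clip has fewer than 16 frames. Do you duplicate the last video frame to pad it to 16 frames?

Case 2: The last clip has exactly 16 frames. Looking at your extracted ShanghaiTech I3D features, it seems that an extra clip is added whose frames are all copies of the last frame (for example, video 01_047 has 1056 frames in total, which is exactly divisible by 16 and yields 66 clips, yet your extracted I3D feature has shape (67, 10, 2048)).

Question 2: Resize and TenCrop frame sizes

When extracting I3D features for the ShanghaiTech dataset, what sizes did you use for Resize and TenCrop of the video frames?

Here are my settings:

    # resize the image
    gtransforms.GroupResize((224, 224)),
    # how the image is cropped
    gtransforms.GroupTenCrop(224),

Question 3: Choice of mean and std

When extracting I3D features for the ShanghaiTech dataset, how did you set the dataset mean and std?

Did you use the settings below?

    mean = [114.75, 114.75, 114.75]
    std = [57.375, 57.375, 57.375]

I look forward to your reply. Thank you!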
opened by Holdwsq 3
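
The clip-count question above can be checked with a little arithmetic. The sketch compares two candidate rules for 16-frame clips: padding the last partial clip (a ceiling division) and always appending one extra clip (floor plus one). Both function names are hypothetical. The second rule reproduces the 67 clips reported for video 01_047, but a single example does not confirm that this is how the released features were produced.

```python
def clips_with_padding(num_frames, clip_len=16):
    """Ceiling division: a partial last clip is padded to a full clip."""
    return -(-num_frames // clip_len)


def clips_floor_plus_one(num_frames, clip_len=16):
    """Alternative rule: all full clips plus one extra clip, always."""
    return num_frames // clip_len + 1


num_frames = 1056  # frame count reported for video 01_047
print(clips_with_padding(num_frames))     # 66
print(clips_floor_plus_one(num_frames))   # 67
```

The two rules agree whenever the frame count is not a multiple of 16 and differ by one clip only when it is, which is the situation described in Case 2.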
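Feature dimension question

Hello, this is very useful work. I noticed that in your video the feature dimension for one video is T×2048, but the features on OneDrive have dimension T×10×2048. From reading other issues, the explanation is that a video is divided into T segments, every frame goes through 10-crop, and a 2048-dimensional feature is extracted from each cropped image. I am not sure whether my understanding is correct. If it is, there is one small point I still do not understand: with T segments, each containing n frames (say n = 16), applying 10-crop to these 16 frames produces 160 images, and each image yields a 2048-dimensional feature, so shouldn't the final feature for a video be T×160×2048? That contradicts T×10×2048. Is there some detail of the feature extraction that I have missed? Thank you.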

opened by Apollo-asleep 3
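
One way to reconcile the two shapes in the question above is to note that I3D is a 3D convolutional network, so a whole stack of frames can be fed as a single spatio-temporal input. If each 16-frame clip is cropped ten ways and every cropped clip (not every cropped frame) yields one 2048-dimensional vector, the result is T×10×2048. The sketch below only does this shape bookkeeping. The function names are hypothetical and the per-clip interpretation is an assumption, not a confirmed description of the author's pipeline.

```python
def per_clip_feature_shape(num_clips, ncrops=10, feat_dim=2048):
    """One feature vector per (clip, crop): the clip's frames are one input."""
    return (num_clips, ncrops, feat_dim)


def per_frame_feature_shape(num_clips, frames_per_clip=16, ncrops=10, feat_dim=2048):
    """One feature vector per (frame, crop): the reading in the question."""
    return (num_clips, frames_per_clip * ncrops, feat_dim)


T = 67  # clip count reported elsewhere in the comments for video 01_047
print(per_clip_feature_shape(T))    # (67, 10, 2048)
print(per_frame_feature_shape(T))   # (67, 160, 2048)
```

Only the per-clip reading matches the (67, 10, 2048) shape quoted for the released features.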
how to make ground truth file and list file for shanghai tech dataset

Hello Tian, I hope you are doing well. I see that the repo contains make_gt_sh.py and make_list_sh.py, but I cannot work out how to use them. What should be placed in the root path "/home/yu/yu_ssd/SH_Test_center_crop_i3d/", and how do I create a gt_sh.npy file? Unlike make_gt_ucf.py, the code does not mention creating gt_sh.npy. Likewise, how do I use make_list_sh.py to create shanghai-i3d-test-10crop.list and train-10crop.list? I hope you can reply. Thanks in advance.

opened by Waleed-Ahmed01 0
How to implement inference code?

Hello, thanks for your paper and code. I'm developing inference code, but I have a question. When I run inference, does the inference data also need to be extracted I3D features, or should it be reshaped to (frame)(10)(2048)? Thanks!

opened by yunseokddi 0
some questions about extracting I3D features

Dear Professor, I would sincerely appreciate a copy of the code you used to extract the I3D features. Because of my limited ability, I cannot implement it myself, but I hope to test your algorithm on my dataset. Thank you for your trouble.

opened by liucun12 0
parameter setting for UCSDped2 dataset

Hi, your results seem very good on ped2 dataset. But my results are very bad. (For other 3 datasets, I can achieve similar results as yours). Would you please share the hyperparameter you are using for training the model? For example, the batch-size, feature-size, k? Or is it possible for you to share the rgb_file_list of ped2 dataset? Thank you!

opened by coranholmes 3
Customized dataset

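Thanks for your great work! Would you please share how to do the pre-processing to create the ground truth and extract the I3D features? Hello! Thank you for your research. May I ask how the dataset should be organized? For example, in the original UCF-Crime ground truth, an abnormal video has a start frame number, an end frame number, and so on. Because your work uses 10-crop data augmentation, I am a bit confused about how to organize the dataset. As a beginner, I would like to ask how to proceed. Thank you!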

opened by zimengxueying 2
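
As a hedged illustration of the ground-truth question above, the sketch below turns start/end frame annotations into per-frame binary labels and concatenates them across videos into one flat array, which is consistent with the "single binary array" described in an earlier comment. Everything here is an assumption for illustration: the helper names, the 0-based inclusive frame intervals, and the concatenation order are not taken from the repo's make_gt scripts.

```python
def frame_labels(num_frames, anomaly_intervals):
    """Per-frame labels: 1 inside any (start, end) interval, else 0.

    Intervals are assumed to be 0-based and inclusive.
    """
    labels = [0] * num_frames
    for start, end in anomaly_intervals:
        for i in range(max(start, 0), min(end, num_frames - 1) + 1):
            labels[i] = 1
    return labels


def concat_ground_truth(videos):
    """Concatenate per-video labels in list order into one flat array."""
    gt = []
    for num_frames, intervals in videos:
        gt.extend(frame_labels(num_frames, intervals))
    return gt


# Toy test set: an 8-frame abnormal video and a 5-frame normal video.
videos = [(8, [(2, 4)]), (5, [])]
gt = concat_ground_truth(videos)
print(gt)  # [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```

The ten crops are spatial views of the same frames, so under these assumptions they do not change the number of frame-level labels.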

Owner

Yu Tian

[Official] Exploring Temporal Coherence for More General Video Face Forgery Detection(ICCV 2021)

Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery (ICCV 2021)

Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy (ICLR 2022 Spotlight)

Code for the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021)

Normalization Matters in Weakly Supervised Object Localization (ICCV 2021)

Code for "FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection", ICRA 2021

The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

Code and models for ICCV2021 paper "Robust Object Detection via Instance-Level Temporal Cycle Confusion".

Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking (CVPR 2021); Masksembles for Uncertainty Estimation (CVPR 2021)

[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.

Official implementation of the paper: A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images

Code for the paper One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation, CVPR 2021.

Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

Official PyTorch implementation of N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras (ICCV 2021)

Official PyTorch code for WACV 2022 paper "CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows"