Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization

Last update: Nov 22, 2022

FAC-Net

Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng Li (CUHK)

Paper: arXiv, ICCV

Overview

We argue that existing methods for weakly supervised temporal action localization cannot guarantee foreground-action consistency, that is, that the foreground and the actions are mutually inclusive. We therefore propose the Foreground-Action Consistency Network (FAC-Net) to address this issue. Results on THUMOS14, reported as mAP (%) at IoU thresholds from 0.1 to 0.7, are shown below.

Method \ mAP(%)	@0.1	@0.2	@0.3	@0.4	@0.5	@0.6	@0.7	AVG
UntrimmedNet	44.4	37.7	28.2	21.1	13.7	-	-	-
STPN	52.0	44.7	35.5	25.8	16.9	9.9	4.3	27.0
W-TALC	55.2	49.6	40.1	31.1	22.8	-	7.6	-
AutoLoc	-	-	35.8	29.0	21.2	13.4	5.8	-
CleanNet	-	-	37.0	30.9	23.9	13.9	7.1	-
MAAN	59.8	50.8	41.1	30.6	20.3	12.0	6.9	31.6
CMCS	57.4	50.8	41.2	32.1	23.1	15.0	7.0	32.4
BM	60.4	56.0	46.6	37.5	26.8	17.6	9.0	36.3
RPN	62.3	57.0	48.2	37.2	27.9	16.7	8.1	36.8
DGAM	60.0	54.2	46.8	38.2	28.8	19.8	11.4	37.0
TSCN	63.4	57.6	47.8	37.7	28.7	19.4	10.2	37.8
EM-MIL	59.1	52.7	45.5	36.8	30.5	22.7	16.4	37.7
BaS-Net	58.2	52.3	44.6	36.0	27.0	18.6	10.4	35.3
A2CL-PT	61.2	56.1	48.1	39.0	30.1	19.2	10.6	37.8
ACM-BANet	64.6	57.7	48.9	40.9	32.3	21.9	13.5	39.9
HAM-Net	65.4	59.0	50.3	41.1	31.0	20.7	11.1	39.8
UM	67.5	61.2	52.3	43.4	33.7	22.9	12.1	41.9
FAC-Net (Ours)	67.6	62.1	52.6	44.3	33.4	22.5	12.7	42.2
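
The AVG column is consistent with the arithmetic mean of the seven per-threshold mAP values (IoU 0.1 to 0.7). A minimal Python check, using the FAC-Net and UM rows copied from the table above; the helper name average_map is only illustrative and is not part of the repository:

```python
def average_map(per_threshold_map):
    """Return the mean of per-IoU-threshold mAP values, rounded to one decimal."""
    return round(sum(per_threshold_map) / len(per_threshold_map), 1)


# Rows copied from the THUMOS14 table: mAP (%) at IoU 0.1, 0.2, ..., 0.7.
fac_net = [67.6, 62.1, 52.6, 44.3, 33.4, 22.5, 12.7]
um = [67.5, 61.2, 52.3, 43.4, 33.7, 22.9, 12.1]

print(average_map(fac_net))  # 42.2
print(average_map(um))       # 41.9
```

Rows with missing entries (marked "-") have no AVG in the table, so the same check does not apply to them.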

Prerequisites

Recommended Environment

Python 3.6
PyTorch 1.2
Tensorboard Logger
CUDA 10.0

Data Preparation

Prepare the THUMOS'14 dataset.
- We recommend using the features and annotations provided by this repo.
- Place the features and annotations inside a dataset/Thumos14reduced/ folder.
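
Before training, it can help to confirm that the dataset folder is where the code expects it. The sketch below only checks that dataset/Thumos14reduced/ exists and is not empty. The helper check_dataset_root is hypothetical, and the names of the files inside the folder are not specified here, so it does not check for any particular file:

```python
from pathlib import Path


def check_dataset_root(dataset_root="dataset/Thumos14reduced"):
    """Return the sorted names of the entries found in the dataset folder.

    Raises FileNotFoundError if the folder is missing or empty.
    """
    root = Path(dataset_root)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset folder not found: {root}")
    entries = sorted(p.name for p in root.iterdir())
    if not entries:
        raise FileNotFoundError(f"Dataset folder is empty: {root}")
    return entries


if __name__ == "__main__":
    try:
        print(check_dataset_root())
    except FileNotFoundError as err:
        print(err)
```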

Usage

Training

Train the model with the provided script.

- In train_options.py, set the dataset-root argument to the path of your dataset folder.
- Run the commands below.

$ python train_main.py --run-type 0 --model-id 1   # rgb stream
$ python train_main.py --run-type 1 --model-id 2   # flow stream

Use a different model-id for the RGB and optical-flow streams. Models are saved in ./ckpt/dataset_name/model_id/.
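
The two training runs differ only in run-type and model-id, so they can be driven from a small Python wrapper. This is a sketch, not part of the repository: train_command and checkpoint_dir are hypothetical helpers, the flags are the ones shown in the commands above, and dataset_name is left as a placeholder because its actual value depends on your configuration.

```python
def train_command(run_type, model_id):
    """Build the argument list for one training run of train_main.py."""
    return ["python", "train_main.py",
            "--run-type", str(run_type), "--model-id", str(model_id)]


def checkpoint_dir(dataset_name, model_id):
    """Return the folder where the README says models are saved."""
    return f"./ckpt/{dataset_name}/{model_id}/"


# (run-type, model-id) per stream, as in the commands above.
STREAMS = {"rgb": (0, 1), "flow": (1, 2)}

# Each stream needs its own model-id so checkpoints do not overwrite each other.
model_ids = [model_id for _, model_id in STREAMS.values()]
assert len(set(model_ids)) == len(model_ids)

for name, (run_type, model_id) in STREAMS.items():
    print(name, " ".join(train_command(run_type, model_id)),
          "->", checkpoint_dir("dataset_name", model_id))
```

To actually launch a run, pass the list to subprocess.run(cmd, check=True) from the repository root.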

Evaluation

The trained model can be found here. Rename the file to xxx.pkl (e.g., 100.pkl) and put it into ./ckpt/dataset_name/model_id/. To evaluate it, follow the two-stream evaluation process.
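
The rename-and-place step can be scripted. The sketch below copies a downloaded file to ./ckpt/dataset_name/model_id/&lt;epoch&gt;.pkl. The helper install_checkpoint is hypothetical, and the name of the downloaded file is an assumption you need to replace with the real one:

```python
import shutil
from pathlib import Path


def install_checkpoint(downloaded_file, dataset_name, model_id,
                       epoch=100, ckpt_root="./ckpt"):
    """Copy a downloaded model to <ckpt_root>/<dataset_name>/<model_id>/<epoch>.pkl."""
    target_dir = Path(ckpt_root) / dataset_name / str(model_id)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{epoch}.pkl"
    shutil.copyfile(downloaded_file, target)
    return target
```

For example, install_checkpoint("downloaded_rgb_model", "dataset_name", 1) would create ./ckpt/dataset_name/1/100.pkl; both string arguments here are placeholders.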

Single stream evaluation

Run the command below.

$ python train_main.py --pretrained --run-type 2 --model-id 1 --load-epoch 100  # rgb stream
$ python train_main.py --pretrained --run-type 3 --model-id 2 --load-epoch 100  # flow stream

load-epoch is the epoch of the best model. The best model does not always occur at epoch 100; check the log stored in the same folder as the saved models to find the best epoch. Make sure model-id matches the one used during training.
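
Assuming checkpoints are stored as &lt;epoch&gt;.pkl (as with 100.pkl), the epochs that can be passed to --load-epoch can be listed with a few lines of Python. The helper saved_epochs is hypothetical. The log format is not described here, so this sketch only shows which epochs exist; choosing the best one still means reading the log.

```python
from pathlib import Path


def saved_epochs(ckpt_dir):
    """Return the sorted epoch numbers of files named <epoch>.pkl in ckpt_dir."""
    return sorted(
        int(p.stem) for p in Path(ckpt_dir).glob("*.pkl") if p.stem.isdigit()
    )
```

For example, saved_epochs("./ckpt/dataset_name/1/") returns the epochs available for the model trained with model-id 1, where dataset_name is again a placeholder.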

Two stream evaluation

Run the command below using our provided models.

$ python test_main.py --rgb-model-id 1 --flow-model-id 2 --rgb-load-epoch 100 --flow-load-epoch 100

References

We referenced the repos below for the code.

If you find this code useful, please cite our paper.

@InProceedings{Huang_2021_ICCV,
    author    = {Huang, Linjiang and Wang, Liang and Li, Hongsheng},
    title     = {Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {8002-8011}
}

Contact

If you have any questions or comments, please contact the first author of the paper, Linjiang Huang ([email protected]).

Comments

How to reproduce your results

The first three lines are the results of my own trained models:

100 100 34.18 28.70 20.20 13.54 8.17 3.40 1.45
35 70 31.39 26.72 19.16 12.83 7.28 3.66 1.70
35 75 31.38 26.49 19.76 13.25 7.76 3.84 1.81

Yours:

1000 1000 68.19 62.43 53.21 44.78 34.10 23.13 13.44

opened by yangjiangeyjg 4

Visualization

Hello, thanks for making your interesting work public. I am a student who has recently been studying WS-TAL. How can I visualize the network's results the way they are shown in the paper, for example as in Figures 1 and 5?

Thanks, Joseph

opened by JosephKKim 2
