TVNet: Temporal Voting Network for Action Localization

hywang

Last update: Jul 26, 2022

Related tags

Deep Learning TVNet

Overview

TVNet: Temporal Voting Network for Action Localization

This repo holds the codes of paper: "TVNet: Temporal Voting Network for Action Localization".

Paper Introduction

Temporal action localization is a vital task in video understranding. In this paper, we propose a Temporal Voting Network (TVNet) for action localization in untrimmed videos. This incorporates a novel Voting Evidence Module to locate temporal boundaries, more accurately, where temporal contextual evidence is accumulated to predict frame-level probabilities of start and end action boundaries.

Dependencies

Python == 2.7
Tensorflow == 1.9.0
CUDA==10.1.105
GCC >= 5.4

Note that the PEM code from BMN is implemented in Pytorch==1.1.0 or 1.3.0

Data Preparation

Datasets

Our experiments is based on ActivityNet 1.3 and THUMOS14 datasets.

Feature for THUMOS14

You can download the feature on THUMOS14 at here GooogleDrive.

Place it into a folder named thumos_features inside ./data.

You also need to download the feature for PEM (from BMN) at GooogleDrive. Please put it into a folder named Thumos_feature_hdf5 inside ./TVNet-THUMOS14/data/thumos_features.

If everything goes well, you can get the folder architecture of ./TVNet-THUMOS14/data like this:

data                       
└── thumos_features                    
		├── Thumos_feature_dim_400              
		├── Thumos_feature_hdf5               
		├── features_train.npy 
		└── features_test.npy

Feature for ActivityNet 1.3

You can download the feature on ActivityNet 1.3 at here GoogleCloud. Please put csv_mean_100 directory into ./TVNet-ANET/data/activitynet_feature_cuhk/.

If everything goes well, you can get the folder architecture of ./TVNet-ANET/data like this:

data                        
└── activitynet_feature_cuhk                    
		    └── csv_mean_100

Run all steps

Run all steps on THUMOS14

cd TVNet-THUMOS14

Run the following script with all steps on THUMOS14:

bash do_all.sh

Note: If you use BlueCrystal 4, you can directly run the following script without any dependencies setup.

bash do_all_BC4.sh

Run all steps on ActivityNet 1.3

cd TVNet-ANET
bash do_all.sh  or  bash do_all_BC4.sh

Run steps separately

Take TVNet-THUMOS14 as an example:

cd TVNet-THUMOS14

1. Temporal evaluation module

python TEM_train.py

python TEM_test.py

2. Creat training data for voting evidence module

python VEM_create_windows.py --window_length L --window_stride S

L is the window length and S is the sliding stride. We generate training windows for length 10 with stride 5, and length 5 with stride 2.

3. Voting evidence module

python VEM_train.py --voting_type TYPE --window_length L --window_stride S

python VEM_test.py --voting_type TYPE --window_length L --window_stride S

TYPE should be start or end. We train and test models with window length 10 (stride 5) and window length 5 (stride 2) for start and end separately.

4. Proposal evaluation module from BMN

python PEM_train.py

5. Proposal generation

python proposal_generation.py

6. Post processing and detection

python post_postprocess.py

Results

THUMOS14

tIoU	mAP@IoU
0.3	0.5724681814413137
0.4	0.5060844218403346
0.5	0.430414918823808
0.6	0.3297164845828022
0.7	0.202971546242546

ActivityNet 1.3

tIoU	mAP@IoU
Average	0.3460396513933088
0.5	0.5135151163296395
0.75	0.34955648726767025
0.95	0.10121803584836778

Reference

This implementation borrows from:

BSN: BSN-Boundary-Sensitive-Network

TEM_train/test.py -- for the TEM module we used in our paper
load_dataset.py -- borrow the part which load data for TEM

BMN: BMN-Boundary-Matching-Network

PEM_train.py -- for the PEM module we used in our paper

G-TAD: Sub-Graph Localization for Temporal Action Detection

post_postprocess.py -- for the multicore process to generate detection

Our main contribution is in:

VEM_create_windows.py -- generate training annotations for Voting Evidence Module (VEM)

VEM_train.py -- train Voting Evidence Module (VEM)

VEM_test.py -- test Voting Evidence Module (VEM)

You might also like...

TDN: Temporal Difference Networks for Efficient Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition Overview We release the PyTorch code of the TDN(Temporal Difference Networks).

Multimedia Computing Group, Nanjing University

326 Dec 13, 2022

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

11 Oct 8, 2022

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation A pytorch-version implementation codes of paper:

11 Dec 13, 2022

Spatial Temporal Graph Convolutional Networks (ST-GCN) for Skeleton-Based Action Recognition in PyTorch

Reminder ST-GCN has transferred to MMSkeleton, and keep on developing as an flexible open source toolbox for skeleton-based human understanding. You a

1.1k Dec 25, 2022

Efficient Two-Step Networks for Temporal Action Segmentation (Neurocomputing 2021)

Efficient Two-Step Networks for Temporal Action Segmentation This repository provides a PyTorch implementation of the paper Efficient Two-Step Network

8 Apr 16, 2022

Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

5 Sep 16, 2022

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

Comments

Thumos_feature_hdf5?

Nice job.

If the file "Thumos_feature_hdf5", containing all videos extracted, as same this link does? https://github.com/open-mmlab/mmaction2/tree/master/tools/data/thumos14

regards.

opened by emedinac 1
Missing link Thumos_feature_dim_400

Hi, good job, I would like to download this file (Thumos_feature_dim_400) but I think it is broken or missing. I downloaded other files and everything looks ok.

opened by emedinac 1

TVNet: Temporal Voting Network for Action Localization

Related tags

Overview

TVNet: Temporal Voting Network for Action Localization

Paper Introduction

Dependencies

Data Preparation

Datasets

Feature for THUMOS14

Feature for ActivityNet 1.3

Run all steps

Run all steps on THUMOS14

Run all steps on ActivityNet 1.3

Run steps separately

1. Temporal evaluation module

2. Creat training data for voting evidence module

3. Voting evidence module

4. Proposal evaluation module from BMN

5. Proposal generation

6. Post processing and detection

Results

THUMOS14

ActivityNet 1.3

Reference

You might also like...

TDN: Temporal Difference Networks for Efficient Action Recognition

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

Spatial Temporal Graph Convolutional Networks (ST-GCN) for Skeleton-Based Action Recognition in PyTorch

Efficient Two-Step Networks for Temporal Action Segmentation (Neurocomputing 2021)

Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

Dist2Dec: A Simplicial Neural Network for Homology Localization

Comments

Thumos_feature_hdf5?

Missing link Thumos_feature_dim_400

Owner

hywang

Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

(CVPR 2021) Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

Pixel Consensus Voting for Panoptic Segmentation (CVPR 2020)

Allows including an action inside another action (by preprocessing the Yaml file). This is how composite actions should have worked.

Official implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21).

Human Action Controller - A human action controller running on different platforms.

The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition