Graph Convolutional Networks for Temporal Action Localization (ICCV2019)


This repo holds the codes and models for the PGCN framework presented at ICCV 2019:

Graph Convolutional Networks for Temporal Action Localization. Runhao Zeng*, Wenbing Huang*, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan. ICCV 2019, Seoul, Korea.

[Paper]

Updates

20/12/2019 We have uploaded the RGB features, trained models, and evaluation results! We found that increasing the number of proposals to 800 at test time further boosts the performance on THUMOS14. We have also updated the proposal list.

04/07/2020 We have uploaded the I3D features for ActivityNet (Anet), the training configuration file data/dataset_cfg.yaml, and the proposal lists for Anet.

Contents

  • Usage Guide
    • Prerequisites
    • Code and Data Preparation
    • Training PGCN
    • Testing Trained Models
  • Other Info
    • Citation
    • Contact

Usage Guide

Prerequisites

[back to top]

The training and testing of PGCN are reimplemented in PyTorch for ease of use.

Other minor Python modules can be installed by running

pip install -r requirements.txt

Code and Data Preparation

[back to top]

Get the code

Clone this repo with git; please remember to use --recursive:

git clone --recursive https://github.com/Alvin-Zeng/PGCN

Download Datasets

We support experimenting with two publicly available datasets for temporal action detection: THUMOS14 & ActivityNet v1.3. Here are some steps to download these two datasets.

  • THUMOS14: We need the validation videos for training and the test videos for testing. You can download them from the THUMOS14 challenge website.
  • ActivityNet v1.3: This dataset is provided in the form of a YouTube URL list. You can use the official ActivityNet downloader to download the videos from YouTube.

Download Features

Here, we provide the I3D features (RGB+Flow) for training and testing.

THUMOS14: You can download it from Google Cloud or Baidu Cloud.

Anet: You can download the I3D Flow features from Baidu Cloud (password: jbsa) and the I3D RGB features from Google Cloud. (Note: set the interval to 16 in ops/I3D_Pooling_Anet.py when training with the RGB features.)
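
To make the interval setting concrete, below is a minimal, illustrative sketch of how a proposal's frame range maps to I3D feature units (mirroring the unit-index computation from ops/I3D_Pooling_Anet.py quoted in the comments further down). The helper name pool_proposal_features and the mean-pooling step are assumptions for illustration, not necessarily the repo's exact implementation.

```python
import numpy as np

def pool_proposal_features(features, start_ind, end_ind, interval=16, clip_length=64, off=0):
    """Average the I3D feature units covered by a proposal's frame range.

    features: (ft_num, d) array, one row per clip_length-frame window extracted
    every `interval` frames. The note above suggests interval=16 for the Anet
    RGB features; the THUMOS14/Flow code uses interval=8.
    """
    ft_num = features.shape[0]
    # First and last feature units overlapping [start_ind, end_ind), clamped to the valid range.
    start_unit = int(min(ft_num - 1, np.floor(float(start_ind + off) / interval)))
    end_unit = int(min(ft_num - 2, np.ceil(float(end_ind - clip_length) / interval)))
    end_unit = max(end_unit, start_unit)  # guard against very short proposals
    return features[start_unit:end_unit + 1].mean(axis=0)

# Hypothetical example: 1024-D features for a 3000-frame video, proposal spanning frames 400-900.
feats = np.random.randn(3000 // 16, 1024).astype(np.float32)
print(pool_proposal_features(feats, start_ind=400, end_ind=900).shape)  # (1024,)
```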

Download Proposal Lists (ActivityNet)

Here, we provide the proposal lists for ActivityNet v1.3. You can download them from Google Cloud.

Training PGCN

[back to top]

Please first set the paths of the features in data/dataset_cfg.yaml:

train_ft_path: $PATH_OF_TRAINING_FEATURES
test_ft_path: $PATH_OF_TESTING_FEATURES
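
Optionally, you can sanity-check the configured paths before launching training. The snippet below is a small sketch assuming PyYAML is installed and that the two keys sit at the top level of data/dataset_cfg.yaml; adjust the lookup if they are nested under a dataset name in your copy of the file.

```python
import os
import yaml  # pip install pyyaml

with open("data/dataset_cfg.yaml") as f:
    cfg = yaml.safe_load(f)

for key in ("train_ft_path", "test_ft_path"):
    path = cfg.get(key)
    if not path or not os.path.exists(path):
        raise FileNotFoundError(f"{key} is missing or does not point to an existing path: {path!r}")
print("Feature paths look good.")
```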

Then, you can use the following command to train PGCN:

python pgcn_train.py thumos14 --snapshot_pre $PATH_TO_SAVE_MODEL

After training, there will be a checkpoint file whose name contains information about the dataset and the training epoch. This checkpoint file contains the trained model weights and can be used for testing.
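
If you want to inspect a snapshot before testing, the sketch below shows one way to peek at its contents. The filename is a hypothetical example, and the exact keys depend on how pgcn_train.py packages the snapshot (commonly fields such as the epoch and a state_dict), so treat this as a rough probe rather than the repo's documented format.

```python
import torch

# Load the snapshot on CPU and list its top-level keys (hypothetical filename).
ckpt = torch.load("thumos14_checkpoint.pth.tar", map_location="cpu")
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```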

Testing Trained Models

[back to top]

You can obtain the detection scores by running

sh test.sh TRAINING_CHECKPOINT

Here, TRAINING_CHECKPOINT denotes the trained model checkpoint. This script will report the detection performance in terms of mean average precision (mAP) at different IoU thresholds.
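
For reference, detections are matched to ground-truth instances using the standard temporal IoU between (start, end) segments; a minimal sketch:

```python
def temporal_iou(seg_a, seg_b):
    """IoU between two temporal segments given as (start, end) in seconds."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    inter = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - inter
    return inter / union if union > 0 else 0.0

print(temporal_iou((10.0, 20.0), (15.0, 30.0)))  # 0.25
```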

The trained models and evaluation results are provided in the "results" folder.

You can obtain the two-stream results on THUMOS14 by running

sh test_two_stream.sh

THUMOS14

| mAP@0.5 (%) | RGB   | Flow  | RGB+Flow      |
| ----------- | ----- | ----- | ------------- |
| P-GCN (I3D) | 37.23 | 47.42 | 49.07 (49.64) |

Here, 49.64% is obtained by setting the combination weights to Flow:RGB = 1.2:1 and the NMS threshold to 0.32.
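
As an illustration of that late fusion, the sketch below weights the per-proposal scores of the two streams and then applies greedy temporal NMS at the stated threshold. It assumes the proposals of both streams are already aligned one-to-one; the function name fuse_and_nms and the toy inputs are made up for the example, and test_two_stream.sh may implement the combination differently.

```python
import numpy as np

def fuse_and_nms(props, rgb_scores, flow_scores, w_flow=1.2, w_rgb=1.0, nms_thresh=0.32):
    """Late-fuse per-proposal scores from the two streams, then run greedy temporal NMS.

    props: (N, 2) array of (start, end) times; *_scores: length-N confidences per stream.
    Returns the indices of the kept proposals and the fused scores.
    """
    scores = w_flow * np.asarray(flow_scores) + w_rgb * np.asarray(rgb_scores)
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        inter = np.maximum(0.0, np.minimum(props[i, 1], props[rest, 1])
                                - np.maximum(props[i, 0], props[rest, 0]))
        union = (props[i, 1] - props[i, 0]) + (props[rest, 1] - props[rest, 0]) - inter
        iou = inter / np.maximum(union, 1e-8)
        order = rest[iou <= nms_thresh]  # drop proposals overlapping the kept one too much
    return keep, scores

props = np.array([[0.0, 5.0], [0.5, 5.5], [10.0, 15.0]])
keep, fused = fuse_and_nms(props, rgb_scores=[0.6, 0.7, 0.4], flow_scores=[0.8, 0.5, 0.9])
print(keep)  # [0, 2]: the near-duplicate proposal 1 is suppressed
```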

Other Info

[back to top]

Citation

Please cite the following paper if you find PGCN useful in your research:

@inproceedings{PGCN2019ICCV,
  author    = {Runhao Zeng and
               Wenbing Huang and
               Mingkui Tan and
               Yu Rong and
               Peilin Zhao and
               Junzhou Huang and
               Chuang Gan},
  title     = {Graph Convolutional Networks for Temporal Action Localization},
  booktitle   = {ICCV},
  year      = {2019},
}

Contact

For any questions, please file an issue or contact:

Runhao Zeng: [email protected]
Comments
  • The proposal_list for ActivityNet

    Hello, have you got the I3D features or the proposal list for ActivityNet? I'm also working on the ActivityNet dataset. Thank you! My email is [email protected]

    opened by ShaoQiBNU 9
  • Question about proposal generation

    Hi, thanks for sharing the code. I'd like to know where the pre-extracted proposals come from. Did you reimplement the Boundary Sensitive Network paper or just use its provided proposals?

    opened by yangwf1 9
  • how to generate bsn_proposal_list.txt

    Thank you for your great work. I have a question: if I want to apply your work to my own dataset, how can I generate the bsn_proposal_list files? And do I need to train the BSN proposal generator on my own dataset?

    Thank you so much.

    opened by thxkew 7
  • How to run this model on a new dataset

    I would like to test your model on a new TAL dataset collected by our laboratory. Hence, we want to know how we should prepare the dataset directory and ground-truth files. Any suggestions would be very helpful!

    opened by makecent 6
  • About Activitynet feature

    Hi, thanks a lot for open-sourcing this. I'm working on the ActivityNet dataset. Do you have the I3D features used in this project? I'd appreciate it if you could share a copy with me! My email address is [email protected]

    opened by yklilfft 6
  • How to predict a single unlabeled video?

    Dear author, I have trained the PGCN model on my own dataset, but I need to make a prediction on a video that is in neither the training set nor the test set. I see that the code needs to generate the corresponding proposals, which require ground-truth information, but the videos I am testing now have no annotation files. Could you tell me how to do it? Thanks a lot.

    opened by mrlihellohorld 5
  • Is the RGB model saved with float datatype?

    Thanks for the brilliant work!

    I happen to see an error when the RGB model is directly loaded into the PGCN architecture. The reason seems to be a mismatch of the datatype.

    To solve that, I replaced one line of code in pgcn_test.py: reg_scores[prop_idx, :] = net((act_batch_var, comp_batch_var), None, None, None) with reg_scores[prop_idx, :] = net((act_batch_var.float(), comp_batch_var.float()), None, None, None)

    Do it first if you find a similar error :)

    opened by frostinassiky 4
  • The performance of the best model is lower than the results in the paper?

    Thanks for your excellent work. I trained the model that you provided and found that the best model's performance (at epoch 15) is:

    | IoU thresh | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | Average |
    | ---------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ------- |
    | mean AP | 0.6574 | 0.6382 | 0.6009 | 0.5374 | 0.4578 | 0.3369 | 0.2172 | 0.0903 | 0.0134 | 0.3944 |

    This is lower than the results in the paper. Could you provide the pretrained model or explain why this happened? Thank you.

    opened by shadowclouds 3
  • A detail question about the model

    Hey! Great job! After reading your paper, I have one question: how did you process the output feature of the GCN model (N x d) before the fc layer? You have to get rid of one of the two dimensions, and looking at the code, I found that you just pick the first row of the N features. Did I understand that correctly? If so, could you please explain why you do that rather than average-pooling over the N features? Thanks a lot!

    opened by HaiyiMei 3
  • Question about the I3D feature

    In your paper, it says "We first uniformly divide each input video into 64-frame segments. We then use a two-stream Inflated 3D ConvNet (I3D) model pre-trained on Kinetics [5] to extract the segment features." However, in your code

    interval = 8
    clip_length = 64
    start_unit = int(min(ft_num - 1, np.floor(float(start_ind + off) / interval)))
    end_unit = int(min(ft_num - 2, np.ceil(float(end_ind - clip_length) / interval)))
    

    I guess subtracting 64 means you do not use the last few frames that are not divisible by 64, but why should interval = 8? Does it mean that you divide each input video into 8-frame segments?

    By the way, could you offer the I3D features on ActivityNet? They are very time-consuming to extract.

    opened by JJBOY 2
  • Using G-TAD results in PGCN

    Hi,

    I am trying to use PGCN on my own dataset. I have annotated the data according to the THUMOS14 annotation format and extracted features using I3D. I have also trained and run inference with a G-TAD model.

    1. Can you let me know how I can re-score G-TAD generated output using PGCN?

    Your answers to the above questions will clarify a lot of doubts.

    Thank you for your time!

    opened by lakshaymehra 2
  • Anet dataset with flow features

    When I use numpy to load the data file, only a single int is loaded. Is there any trick for loading the flow I3D Anet features, or is there something wrong with how I process the data?

    opened by dyjjjjj 0
  • Where is the proposal folder for THUMOS14? And the mismatch between RGB and Flow data for ActivityNet

    I cannot find the THUMOS14 proposals like those provided for ActivityNet. I also cannot understand why the numbers of flow and RGB features differ; they also do not match the Proposal_Lists.txt provided for ActivityNet.

    I am very confused, and I think no one can run the code now using the provided features and proposals.

    Thanks!!!

    opened by yangmin666 0
  • How to choose proposals when inferring an unlabeled video

    Hi, with the help of the author and many others, I trained the PGCN model, but I have a question about how to choose proposals at inference time. At present, I use G-TAD + PGCN to output the prediction results (that is, a certain number of proposals, sorted by score). However, when actually processing an unlabeled video, a certain number of proposals will be generated through PGCN. How should I finally select among these proposals as the prediction results? Top 100/50 or something like that?

    opened by mrlihellohorld 1
  • The Proposal List in PGCN

    I am confused about how to read the proposal list for the dataset used in PGCN. Can someone explain what each number in one line of the proposal list means? Thank you!

    opened by hongkyhao 0
  • Features of THUMOS14

    There are 413 videos in total in this dataset, including training and testing. The number of provided RGB features is 413, but the number of provided flow features is 412. Why is one missing?

    opened by Hanqer 0