Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

Joya Chen

Last update: Dec 31, 2022

Overview

2D-TAN (Optimized)

Introduction

This repository is an optimized re-implementation of the AAAI 2020 paper: Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language.

Compared with the official implementation (https://github.com/microsoft/2D-TAN), it trains and runs faster, uses less memory, and reaches better results.

Comparison

Performance: Better Results

1. TACoS Dataset

Repo	Rank1@0.1	Rank1@0.3	Rank1@0.5	Rank5@0.1	Rank5@0.3	Rank5@0.5
Official	47.59	37.29	25.32	70.31	57.81	45.04
Ours	57.54	45.36	31.87	77.88	65.83	54.29

2. ActivityNet Dataset

Repo	Rank1@0.3	Rank1@0.5	Rank1@0.7	Rank5@0.3	Rank5@0.5	Rank5@0.7
Official	59.45	44.51	26.54	85.53	77.13	61.96
Ours	60.00	45.25	28.62	85.80	77.25	62.11

Speed and Cost: Faster Training/Inference, Less Memory Cost

1. Speed (ActivityNet Dataset)

Repo	Training	Inference	Required Training Epochs
Official	1.98 s/batch	0.81 s/batch	100
Ours	1.50 s/batch	0.61 s/batch	5

2. Memory Cost (ActivityNet Dataset)

Repo	Training	Inference
Official	4*10145 MB/batch	4*3065 MB/batch
Ours	4*5345 MB/batch	4*2121 MB/batch

Note: These results are measured on 4 NVIDIA Tesla V100 GPUs, with batch size 32.
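
The speed table can be turned into a rough estimate of relative training cost by combining the per-batch time with the number of epochs. The sketch below is a back-of-the-envelope calculation only: it assumes both implementations process the same number of batches per epoch, which the table does not state, and the helper name `ratio` is introduced here for illustration.

```shell
#!/bin/sh
# Rough comparison built only from the figures in the speed table above.
# Assumption: both implementations see the same number of batches per epoch.

# ratio A B -> prints A / B with two decimals
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f", a / b }'
}

# Per-batch training speedup: 1.98 s (official) vs 1.50 s (ours)
train_speedup=$(ratio 1.98 1.50)

# Per-batch inference speedup: 0.81 s (official) vs 0.61 s (ours)
infer_speedup=$(ratio 0.81 0.61)

# Relative total training time: (s/batch * epochs), official over ours
total_speedup=$(LC_ALL=C awk 'BEGIN { printf "%.1f", (1.98 * 100) / (1.50 * 5) }')

echo "training speedup per batch:  ${train_speedup}x"
echo "inference speedup per batch: ${infer_speedup}x"
echo "total training time ratio:   ${total_speedup}x"
```

Under that assumption, most of the saving comes from needing 5 epochs instead of 100 rather than from the per-batch speed.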

Installation

The installation for this repository is easy. Please refer to INSTALL.md.

Dataset

Please refer to DATASET.md to prepare datasets.

Quick Start

We provide scripts for simplifying training and inference. Please refer to scripts/train.sh, scripts/eval.sh.

For example, to train on the TACoS dataset, modify scripts/train.sh as follows:

# find all configs in configs/
model=2dtan_128x128_pool_k5l8_tacos
# set your gpu id
gpus=0,1,2,3
# number of gpus
gpun=4
# use different values (e.g., 127.0.0.2, 29502) when running multiple 2dtan tasks on the same machine
master_addr=127.0.0.1
master_port=29501
...
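
The rest of scripts/train.sh is not shown here. As a hedged sketch of how such variables are typically consumed, the snippet below assembles a `torch.distributed.launch` command and prints it without running it. The entry point `train_net.py`, the `--config-file` flag, and the `OUTPUT_DIR` override are taken from the traceback quoted in the Comments section; the `configs/<model>.yaml` and `outputs/<model>` layout is an assumption based on the eval example, and the variable name `launch_cmd` is ours.

```shell
#!/bin/sh
# Values as in the scripts/train.sh excerpt above.
model=2dtan_128x128_pool_k5l8_tacos
gpus=0,1,2,3
gpun=4
master_addr=127.0.0.1
master_port=29501

# Assumption: configs live in configs/<model>.yaml and outputs go to outputs/<model>.
# The real script may build its command differently.
launch_cmd="CUDA_VISIBLE_DEVICES=$gpus python -m torch.distributed.launch \
--nproc_per_node=$gpun --master_addr $master_addr --master_port $master_port \
train_net.py --config-file configs/$model.yaml OUTPUT_DIR outputs/$model"

# Dry run: print the command instead of executing it.
echo "$launch_cmd"
```

Printing the command first makes it easy to confirm that a second task on the same machine really uses a different address and port before anything is launched.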

As another example, to evaluate on the ActivityNet dataset, modify scripts/eval.sh as follows:

# find all configs in configs/
config_file=configs/2dtan_64x64_pool_k9l4_activitynet.yaml
# the dir of the saved weight
weight_dir=outputs/2dtan_64x64_pool_k9l4_activitynet
# select weight to evaluate
weight_file=model_1e.pth
# test batch size
batch_size=32
# set your gpu id
gpus=0,1,2,3
# number of gpus
gpun=4
# use different values (e.g., 127.0.0.2, 29502) when running multiple 2dtan tasks on the same machine
master_addr=127.0.0.2
master_port=29502
...
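
Before launching an evaluation, it can help to check that the settings agree with each other. The following is a minimal sketch, not part of the repository: it assumes that `gpun` is meant to equal the number of ids listed in `gpus` and that the checkpoint is found at `weight_dir/weight_file`. The names `weight_path`, `gpu_count`, and `status` are introduced here for illustration.

```shell
#!/bin/sh
# Values as in the scripts/eval.sh excerpt above.
weight_dir=outputs/2dtan_64x64_pool_k9l4_activitynet
weight_file=model_1e.pth
gpus=0,1,2,3
gpun=4

# Assumed full path of the checkpoint to evaluate.
weight_path="$weight_dir/$weight_file"

# Count the comma-separated ids in $gpus.
gpu_count=$(echo "$gpus" | awk -F, '{ print NF }')

status=ok
if [ "$gpu_count" -ne "$gpun" ]; then
    echo "gpun=$gpun does not match the $gpu_count ids in gpus=$gpus" >&2
    status=mismatch
fi

# Warn (without aborting) if the checkpoint has not been written yet.
if [ ! -f "$weight_path" ]; then
    echo "checkpoint not found: $weight_path" >&2
fi

echo "checkpoint: $weight_path (gpu check: $status)"
```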

Support

Please open a new issue and we will be glad to answer it. Feel free to contact me at [email protected] if you need help.

Acknowledgements

We greatly appreciate the official 2D-TAN repository (https://github.com/microsoft/2D-TAN) and maskrcnn-benchmark (https://github.com/facebookresearch/maskrcnn-benchmark); we learned a lot from them. Please also remember to cite the paper:

@InProceedings{2DTAN_2020_AAAI,
author = {Zhang, Songyang and Peng, Houwen and Fu, Jianlong and Luo, Jiebo},
title = {Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language},
booktitle = {AAAI},
year = {2020}
}

Comments

Charade Dataset

Dear Joya,

Thanks for sharing the optimized code! The improvements on Charade and ActivityNet Datasets are impressive!

Do you have any plans to support the Charade dataset?

opened by frostinassiky 7

ActivityNet feature corrupt?

Hi, Thanks for your great work and code. The ActivityNet feature seems to be corrupted. I tried several times and it always has this problem. Could you please help? Or could you provide the link for downloading sub_activitynet_v1-3.c3d.hdf5? Thanks.

opened by wjn922 3

Where does the improvement come from?

Hi Joya,

Thanks again for sharing your brilliant work!

From your experience, where does the improvement come from? Why does the same architecture (2D-TAN) give such different performance?

opened by frostinassiky 2

Inference performance on TACoS

Hi, I used the default settings to train the model on TACoS. During testing, when batch_size is 64 (the default in dafault.py), the performance is normal. However, if I change the batch size with TEST.BATCH_SIZE 32, the performance becomes extremely low. Why does this happen?

opened by wjn922 0

Inference on custom dataset

Hi, I have successfully trained the model on the ActivityNet Dataset. I wanted to run the trained model on a custom dataset. How can I create the dataset and extract features?

opened by brookn08 0

About ending automatically after training

Hi, when the model finishes training, the process does not stop by itself and I have to terminate the task manually. What causes this? How can I modify it so that it ends automatically after training?

opened by menghuaa 0

How to choose the best trained model

Hi, how do I select a model for testing after training? Should the choice rely on the loss or on the results on the validation set? Your code does not show how to choose the best model.

opened by menghuaa 0

Getting NaN in the loss while running the TACoS dataset

Hi, I configured my environment following your instructions and got torch==1.6.0. When running on the TACoS dataset, the loss is NaN, but I can run the ActivityNet dataset normally. Do you know the reason?

opened by lumiaomiao 5

died with <Signals.SIGKILL: 9>

The experiments on the TACoS dataset run fine, but on the ActivityNet dataset there are errors when preparing data. How can I fix this? Thanks!

2020-08-18 10:03:40,666 tan.trainer INFO: Preparing data, please wait...
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 253, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 249, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python', '-u', '/home/.jupyter/ngsv/2dtan/train_net.py', '--local_rank=1', '--config-file', '/home/.jupyter/ngsv/2dtan/configs/2dtan_64x64_pool_k9l4_activitynet.yaml', 'OUTPUT_DIR', '/home/.jupyter/ngsv/2dtan/outputs/2dtan_64x64_pool_k9l4_activitynet']' died with <Signals.SIGKILL: 9>.

opened by balabanahei 1
