git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

NingWang

Last update: Dec 22, 2022

Related tags

Deep Learning TransformerTrack

Overview

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li

Accepted by CVPR 2021 (Oral). [Paper Link]

This repository includes Python (PyTorch) implementation of the TrDiMP and TrSiam trackers, to appear in CVPR 2021.

Abstract

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. In this work, we bridge the individual video frames and explore the temporal contexts across them via a transformer architecture for robust object tracking. Different from classic usage of the transformer in natural language processing tasks, we separate its encoder and decoder into two parallel branches and carefully design them within the Siamese-like tracking pipelines. The transformer encoder promotes the target templates via attention-based feature reinforcement, which benefits the high-quality tracking model generation. The transformer decoder propagates the tracking cues from previous templates to the current frame, which facilitates the object searching process. Our transformer-assisted tracking framework is neat and trained in an end-to-end manner. With the proposed transformer, a simple Siamese matching approach is able to outperform the current top-performing trackers. By combining our transformer with the recent discriminative tracking pipeline, our method sets several new state-of-the-art records on prevalent tracking benchmarks.

Tracking Results and Pretrained Model

Tracking results: the raw results of TrDiMP/TrSiam on 7 benchmarks including OTB, UAV, NFS, VOT2018, GOT-10k, TrackingNet, and LaSOT can be found here.

Pretrained model: please download the TrDiMP model and put it in the pytracking/networks folder.

TrDiMP and TrSiam share the same model. The main difference between TrDiMP and TrSiam lies in the tracking model generation. TrSiam does not utilize the background information and simply crops the target/foreground area to generate the tracking model, which can be regarded as the initialization step of TrDiMP.

Environment Setup

Clone the GIT repository.

git clone https://github.com/594422814/TransformerTrack.git

Clone the submodules.

In the repository directory, run the commands:

git submodule update --init

Install dependencies

Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here pytracking).

bash install.sh conda_install_path pytracking

This script will also download the default networks and set-up the environment.

Note: The install script has been tested on an Ubuntu 18.04 system. In case of issues, check the detailed installation instructions.

Our code is based on the PyTracking framework. For more details, please refer to PyTracking.

Training the TrDiMP/TrSiam Model

Please refer to the README in the ltr folder.

Testing the TrDiMP/TrSiam Tracker

Please refer to the README in the pytracking folder. As shown in pytracking/README.md, you can either use this PyTracking toolkit or GOT-10k toolkit to reproduce the tracking results.

Citation

If you find this work useful for your research, please consider citing our work:

@inproceedings{Wang_2021_Transformer,
    title={Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking},
    author={Wang, Ning and Zhou, Wengang and Wang, Jie and Li, Houqiang},
    booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}

Acknowledgment

Our transformer-assisted tracker is based on PyTracking. We sincerely thank the authors Martin Danelljan and Goutam Bhat for providing this great framework.

Contact

If you have any questions, please feel free to contact [email protected]

Comments

some qusetions on testing VOT18

when i use GOT10K_VOT.py to test VOT18, some errors will happen in the ./pytracking/tracker/trdimp.py, it seems some parameters do not match the test of vot18. But i use GOT10K_GOT.py to test GOT_10k, this errors didn't happen, because ./pytracking/tracker/trdimp_for_GOT.py works. Do you have specialied trdimp_for_VOT.py that uses on testing VOT18? Your prompt attention to my question would be highly appreciated!

opened by yuyudiandian 4
Difference between TrSiam and TrDiMP

Hi,

Thanks for your good work!

When I was reading your paper and your code I found that during the training phase, there is no difference between TrSiam and TrDimp. I want to to know if the difference between thiese two algorithms only occurs during the tracking phase? For TrSiam, you also used the DCF during the training phase. Is this true?

Thanks

opened by YanWanquan 3
Training

In your code I found that you import Lasot, Got10k, TrackingNet and MSCOCOSeq. I want to know if you used all these datasets for training.

In your paper, you said the batch size you used is 36 image pairs and there is total 1500 iterations per epoch. However, 1500*36=1,8000 which is much smaller thant the number of images in CoCo dataset. I remember batch_size * iterations = dataset_sizse. I am a newer in the tracking field, and just a little confused about this. Thanks.

opened by YanWanquan 3
what is your hardware for trainning?

what is your hardware for trainning? Super_dimp use one TATIAN X for training. And your traning setting is slight different， I wonder what's the reason。

opened by Lightning980729 3
No matching checkpoint file found

I run with: python run_tracker.py trdimp trdimp --sequence Biker

The reported error is : "No matching checkpoint file found"

How should I repair this? or How should I set the checkpoint file?

Thanks a lot.

opened by gyc9709 3
some questions on testing

I did some changes on the structure of code，but when testing GOT or OTB ，it will happen “cuda memory is not enough ”。Same question not happened on testing VOT18，so do you have some suggestions on this question？Or how to reduce the the cuda memory on testing dataset？Thank you.

opened by yuyudiandian 2
no matching checkpoint file found

Hi, when I want to run python run_training dimp transformer_dimp, it always shows no matching checkpoint file found. Does this model need a checkpoint file before training? If yes, where can I download this file?

When I run to the 52 line of ltr_trainer.py file: for i, data in enumerate(loader,1), it shows an error: out of memory. My computer has 16G memory, and my GPU has 32G memory, so I think it is large enough to run the model. Can you give any suggestions about this problem? Thanks.

opened by YanWanquan 2
Question about FFN

Thanks for your work! I noticed that in the paper, you claimed that "To achieve a good balance of speed and performance, we slim the classic transformer by omitting the fully-connected feed-forward layers and maintaining the lightweight single-head attention", and it seemed that you also do not use the FFN layer in your code. I'm wondering how about the performance regardless of speed if you use the classic transformer including FFN ？

opened by 3bobo 2
使用got10k的测试工具报错

segmentation_dir: /trdimp/trdimp_vot Files already downloaded. Running tracker GOT_Tracker on VOT... Running supervised experiment... --Sequence 1/60: ants1 Repetition: 1 Traceback (most recent call last): File "GOT10k_VOT.py", line 45, in experiment.run(tracker, visualize=False) File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 71, in run self.run_supervised(tracker, visualize) File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 126, in run_supervised tracker.init(frame, anno_rects[0]) File "GOT10k_VOT.py", line 32, in init self.tracker.initialize(image, box) File "../pytracking/tracker/trdimp/trdimp.py", line 55, in initialize state = info['init_bbox'] IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

看上去是版本导致的错误？请问作者有遇到过或者有修改的思路吗

opened by 1145284121 1

训练和测试运行不报错，程序停止了

您好，在使用您的代码过程中，我使用了自己的数据作为训练和测试。但是在训练或者测试过程中出现如下提示，但是不报错，程序也不在运行了。

/home/miniconda3/envs/pytracking/bin/python /home/TransformerTrack-main/pytracking/run_tracker.py trdimp trsiam --dataset eotb --sequence val
Evaluating    1 trackers on    31 sequences
Tracker: trdimp trsiam None ,  Sequence: test_seq_474
/home/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/functional.py:3060: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  warnings.warn("Default upsampling behavior when mode={} is changed "
Using /home/.cache/torch_extensions as PyTorch extensions root...

因为DIMP代码机制的问题，在训练的时候，如果ctrl+c代码则会重新运行，且运行正常。但是测试的时候不会。请问这是什么原因呢？

opened by Jee-King 1

Some questions about train setting?

Some questions about your baseline? It seems that you use the same parameter of SuperDiMP, but not DiMP. Based on the SuperDiMP does achieves the performance described in your paper, but the performance of DiMP is doubtful. However, your paper is elaborated from the perspective of temporal feature, which is novel and great.

opened by yxxxqqq 1
Question about source code of Transformer part

As illustrated in paper, Encoder is used to ehanced template feature(train_feat in dimp) only, and Decoder is used to produce decoded search feature. But in the transformer's forward function, decoder is also used on train_feat, which is not described in the paper. Could you please explain why?

opened by ChuzzZz 0
linear transformation for key and query

I noticed that linear transformations used to reduce the dimension of key and query share same weight. Is it out of computaition consideration or used different weight degrades performance?

opened by ChuzzZz 0
Problems in Training process

Hi, when I run python run_training dimp transformer_dimp, it shows ‘No matching checkpoint file found’ and ‘ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])’. The batch_size is set to 6. I don’t know why it change to 1 in the training process. Do you have any solutions to this problem? Thanks very much!

The error message reported during the training process is shown below.

_Training: dimp transformer_dimp No matching checkpoint file found Using /tmp/torch_extensions as PyTorch extensions root...Using /tmp/torch_extensions as PyTorch extensions root...

Detected CUDA files, patching ldflags Emitting ninja build file /tmp/torch_extensions/_prroi_pooling/build.ninja... Building extension module _prroi_pooling... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) Using /tmp/torch_extensions as PyTorch extensions root... No modifications detected for re-loaded extension module _prroi_pooling, skipping build step... Loading extension module _prroi_pooling... ninja: no work to do. Loading extension module _prroi_pooling... Loading extension module _prroi_pooling... Training crashed at epoch 1 Traceback for the error! Traceback (most recent call last): File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/base_trainer.py", line 70, in train self.train_epoch() File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 80, in train_epoch self.cycle_dataset(loader) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset loss, stats = self.actor(data) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/actors/tracking.py", line 97, in call test_proposals=data['test_proposals']) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward outputs = self.parallel_apply(replicas, inputs, kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)]) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply output.reraise() File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise raise self.exc_type(msg) ValueError: Caught ValueError in replica 0 on device 0. Original Traceback (most recent call last): File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker output = module(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/tracking/dimpnet.py", line 75, in forward iou_pred = self.bb_regressor(train_feat_iou, test_feat_iou, train_bb, test_proposals) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 86, in forward modulation = self.get_modulation(feat1, bb1) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 162, in get_modulation fc3_r = self.fc3_1r(roi3r) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward input = module(input) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 106, in forward exponential_average_factor, self.eps) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1919, in batch_norm _verify_batch_size(input.size()) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1902, in verify_batch_size raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size)) ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

opened by Chenlulu1993 2

Releases(model)

model(Mar 17, 2021)

Source code(tar.gz)
Source code(zip)
trdimp_net.pth.tar(360.25 MB)
trdimp_net_new.pth.tar(360.35 MB)
results(Mar 17, 2021)

Source code(tar.gz)
Source code(zip)
Tracking_results.zip(63.77 MB)

Owner

NingWang

PhD student in University of Science and Technology of China (USTC)

GitHub

This is an official implementation for "Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation".

Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation This repo is the official implementation of Exploiting Temporal Con

241 Jan 7, 2023

Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation

95 Jan 4, 2023

MOT-Tracking-by-Detection-Pipeline - For Tracking-by-Detection format MOT (Multi Object Tracking), is it a framework that separates Detection and Tracking processes?

MOT-Tracking-by-Detection-Pipeline Tracking-by-Detection形式のMOT(Multi Object Trac

41 Nov 23, 2022

Learning Spatio-Temporal Transformer for Visual Tracking

STARK The official implementation of the paper Learning Spatio-Temporal Transformer for Visual Tracking Hiring research interns for visual transformer

484 Dec 29, 2022

Python and C++ implementation of "MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation". Accepted at LXCV @ CVPR 2021.

MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation This is a PyTorch and LibTorch implementation of MarkerPose: a

47 Nov 18, 2022

CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network - Pytorch This repo holds the pytorch-version codes of paper: "Temporal Context Aggregation Network for Temporal

63 Sep 27, 2022

Self-Learned Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence

In this paper, we address the problem of rain streaks removal in video by developing a self-learned rain streak removal method, which does not require any clean groundtruth images in the training process.

44 Dec 6, 2022

AI-Fitness-Tracker - AI Fitness Tracker With Python

AI-Fitness-Tracker We have build a AI based Fitness Tracker using OpenCV and Pyt

5 Feb 9, 2022

Visual Tracking by TridenAlign and Context Embedding

Visual Tracking by TridentAlign and Context Embedding (TACT) Test code for "Visual Tracking by TridentAlign and Context Embedding" Janghoon Choi, Juns

32 Aug 25, 2021

git《Commonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Commonsense Knowledge Base Completion with Structural and Semantic Context Code for the paper Commonsense Knowledge Base Completion with Structural an

96 Nov 5, 2022

Official PyTorch implementation of UACANet: Uncertainty Aware Context Attention for Polyp Segmentation

UACANet: Uncertainty Aware Context Attention for Polyp Segmentation Official pytorch implementation of UACANet: Uncertainty Aware Context Attention fo

85 Dec 14, 2022

Exploiting Robust Unsupervised Video Person Re-identification

Exploiting Robust Unsupervised Video Person Re-identification Implementation of the proposed uPMnet. For the preprint, please refer to [Arxiv]. Gettin

1 Apr 9, 2022

This repository is the offical Pytorch implementation of ContextPose: Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021).

Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021) Introduction This repository is the offical Pytorch implementation of

37 Nov 21, 2022

Re-implementation of the Noise Contrastive Estimation algorithm for pyTorch, following "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models." (Gutmann and Hyvarinen, AISTATS 2010)

Noise Contrastive Estimation for pyTorch Overview This repository contains a re-implementation of the Noise Contrastive Estimation algorithm, implemen

42 Nov 24, 2022

git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

Related tags

Overview

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Accepted by CVPR 2021 (Oral). [Paper Link]

Abstract

Tracking Results and Pretrained Model

Environment Setup

Clone the GIT repository.

Clone the submodules.

Install dependencies

Training the TrDiMP/TrSiam Model

Testing the TrDiMP/TrSiam Tracker

Citation

Acknowledgment

Contact

Comments

Releases(model)

model(Mar 17, 2021)

results(Mar 17, 2021)

Owner

NingWang

This is an official implementation for "Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation".

Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation

MOT-Tracking-by-Detection-Pipeline - For Tracking-by-Detection format MOT (Multi Object Tracking), is it a framework that separates Detection and Tracking processes?

Learning Spatio-Temporal Transformer for Visual Tracking

Python and C++ implementation of "MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation". Accepted at LXCV @ CVPR 2021.

CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Self-Learned Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence

AI-Fitness-Tracker - AI Fitness Tracker With Python

Visual Tracking by TridenAlign and Context Embedding

git《Commonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Official PyTorch implementation of UACANet: Uncertainty Aware Context Attention for Polyp Segmentation

Exploiting Robust Unsupervised Video Person Re-identification

This repository is the offical Pytorch implementation of ContextPose: Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021).

VID-Fusion: Robust Visual-Inertial-Dynamics Odometry for Accurate External Force Estimation

This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

ESTDepth: Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks (CVPR 2021)

git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Object tracking using YOLO and a tracker(KCF, MOSSE, CSRT) in openCV

Re-implementation of the Noise Contrastive Estimation algorithm for pyTorch, following "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models." (Gutmann and Hyvarinen, AISTATS 2010)