TCTrack: Temporal Contexts for Aerial Tracking (CVPR2022)

Overview

TCTrack: Temporal Contexts for Aerial Tracking (CVPR2022)

Ziang Cao and Ziyuan Huang and Liang Pan and Shiwei Zhang and Ziwei Liu and Changhong Fu

In CVPR, 2022.

[paper]

Abstract

Temporal contexts among consecutive frames are far from being fully utilized in existing visual trackers. In this work, we present TCTrack, a comprehensive framework to fully exploit temporal contexts for aerial tracking. The temporal contexts are incorporated at two levels: the extraction of features and the refinement of similarity maps. Specifically, for feature extraction, an online temporally adaptive convolution is proposed to enhance the spatial features using temporal information, which is achieved by dynamically calibrating the convolution weights according to the previous frames. For similarity map refinement, we propose an adaptive temporal transformer, which first effectively encodes temporal knowledge in a memory-efficient way, before the temporal knowledge is decoded for accurate adjustment of the similarity map. TCTrack is effective and efficient: evaluation on four aerial tracking benchmarks shows its impressive performance; real-world UAV tests show its high speed of over 27 FPS on NVIDIA Jetson AGX Xavier.

Workflow of our tracker

The implementation of our online temporally adaptive convolution is based on TadaConv (ICLR2022).

1. Environment setup

This code has been tested on Ubuntu 18.04, Python 3.8.3, Pytorch 0.7.0/1.6.0, CUDA 10.2. Please install related libraries before running this code:

pip install -r requirements.txt

2. Test

Download pretrained model by Baidu (code: 2u1l) or Googledrive and put it into tools/snapshot directory.

Download testing datasets and put them into test_dataset directory.

python ./tools/test.py                                
	--dataset UAV123_10fps                  
    --tracker_name TCTrack
	--snapshot snapshot/general_model.pth # pre-train model path

The testing result will be saved in the results/dataset_name/tracker_name directory.

Note: The results of TCTrack can be downloaded (code:kh3e).

3. Train

Prepare training datasets

Download the datasets:

Note: train_dataset/dataset_name/readme.md has listed detailed operations about how to generate training datasets.

Train a model

To train the TCTrack model, run train.py with the desired configs:

cd tools
python train.py

4. Evaluation

If you want to evaluate the results of our tracker, please put those results into results directory.

python eval.py 	                          \
	--tracker_path ./results          \ # result path
	--dataset UAV10fps                  \ # dataset_name
	--tracker_prefix 'general_model'   # tracker_name

Note: The code is implemented based on pysot-toolkit. We would like to express our sincere thanks to the contributors.

Demo video

TCTrack

References

@article{cao2022tctrack,
  title={{TCTrack: Temporal Contexts for Aerial Tracking}},
  author={Cao, Ziang and Huang, Ziyuan and Pan, Liang and Zhang, Shiwei and Liu, Ziwei and Fu, Changhong},
  journal={arXiv preprint arXiv:2203.01885},
  year={2022}
}

Acknowledgement

The code is implemented based on pysot. We would like to express our sincere thanks to the contributors.

Comments
  • Changing the length of sequences

    Changing the length of sequences

    Hi,

    Congrats on achieving this amazing result. I was so impressed by the speed of this tracker. However, I'm facing a serious accuracy loss when the object becomes occluded (fully, not partial). As can be seen in this video, when the bus goes under the bridge, TCTrack cannot track it correctly anymore. I'm getting the same result with some other aerial videos, while Stark can still handle all of those.

    I'm thinking of improving this by trading a bit of performance for accuracy and extending the number of frames that the model saves. Doing so could help save more temporal information and potentially catch the correct object. Is it possible to extend the number L? If so, can you provide some hints on where should I modify?

    Thank you.

    opened by notnitsuj 3
  • About test_dataset

    About test_dataset

    Could you provide the code for the preprocessing of the UAV123 dataset and other test dataset? I can't find the .json file in the raw UAV123 dataset, so maybe you have preprocessed it.

    opened by Leiyi-Hu 2
  • Can you provide result files for other comparison algorithms?

    Can you provide result files for other comparison algorithms?

    Thank you for your open source work in the field of UAV Tracking. We want to generate the results of figure 6 in your paper, and can you provide the result files for other comparison algorithms? Very Grateful!

    opened by HonglinChu 1
  • A more accessible download link

    A more accessible download link

    Dear Ziang, first of all thanks for releasing the code for this promising pproach and making it publicly available. Is it possible to upload the pretrained model file to a more accessible platform, such as google drive or dropbox? That would be great, as it is not that straigthforward to download files from Baidu outside China, as far as I know. At least, my attempts were not successful so far. Thanks in advance!

    opened by zanilzanzan 1
  • AttributeError: Caught AttributeError in DataLoader worker process 0.

    AttributeError: Caught AttributeError in DataLoader worker process 0.

    Traceback (most recent call last): File "train_tctrack.py", line 288, in main() File "train_tctrack.py", line 283, in main train(train_loader, dist_model, optimizer, lr_scheduler, tb_writer) File "train_tctrack.py", line 132, in train for idx, data in enumerate(train_loader): File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 363, in next data = self._next_data() File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 989, in _next_data return self._process_data(data) File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1014, in _process_data data.reraise() File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 395, in reraise raise self.exc_type(msg) AttributeError: Caught AttributeError in DataLoader worker process 0. Original Traceback (most recent call last): File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop data = fetcher.fetch(index) File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch data = [self.dataset[idx] for idx in possibly_batched_index] File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in data = [self.dataset[idx] for idx in possibly_batched_index] File "/mnt/TCTrack-main/pysot/datasets/dataset.py", line 323, in getitem current_templatebox = self._get_bbox(current_templateimage, current[0][1]) File "/mnt/TCTrack-main/pysot/datasets/dataset.py", line 277, in _get_bbox imh, imw = image.shape[:2] AttributeError: 'NoneType' object has no attribute 'shape'

    opened by Old-detective 1
Owner
Intelligent Vision for Robotics in Complex Environment
Adaptive Vision for Robotics in Complex Environment
Intelligent Vision for Robotics in Complex Environment
Aerial Imagery dataset for fire detection: classification and segmentation (Unmanned Aerial Vehicle (UAV))

Aerial Imagery dataset for fire detection: classification and segmentation using Unmanned Aerial Vehicle (UAV) Title FLAME (Fire Luminosity Airborne-b

null 79 Jan 6, 2023
This is an official implementation for "Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation".

Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation This repo is the official implementation of Exploiting Temporal Con

Vegetabird 241 Jan 7, 2023
HiFT: Hierarchical Feature Transformer for Aerial Tracking (ICCV2021)

HiFT: Hierarchical Feature Transformer for Aerial Tracking Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, and Yiming Li Our paper is Accepted by ICCV 2

Intelligent Vision for Robotics in Complex Environment 55 Nov 23, 2022
Official code for 'Robust Siamese Object Tracking for Unmanned Aerial Manipulator' and offical introduction to UAMT100 benchmark

SiamSA: Robust Siamese Object Tracking for Unmanned Aerial Manipulator Demo video ?? Our video on Youtube and bilibili demonstrates the evaluation of

Intelligent Vision for Robotics in Complex Environment 12 Dec 18, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming soon!

ToxiChat Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Install depen

Ashutosh Baheti 11 Jan 1, 2023
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).

Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte

Benedek Rozemberczki 201 Nov 9, 2022
CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network - Pytorch This repo holds the pytorch-version codes of paper: "Temporal Context Aggregation Network for Temporal

Zhiwu Qing 63 Sep 27, 2022
Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

Zhengzhong Tu 5 Sep 16, 2022
Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

This repository is the official PyTorch implementation of Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

hippopmonkey 4 Dec 11, 2022
git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 236 Dec 22, 2022
Learning Spatio-Temporal Transformer for Visual Tracking

STARK The official implementation of the paper Learning Spatio-Temporal Transformer for Visual Tracking Hiring research interns for visual transformer

Multimedia Research 484 Dec 29, 2022
Joint detection and tracking model named DEFT, or ``Detection Embeddings for Tracking.

DEFT: Detection Embeddings for Tracking DEFT: Detection Embeddings for Tracking, Mohamed Chaabane, Peter Zhang, J. Ross Beveridge, Stephen O'Hara

Mohamed Chaabane 253 Dec 18, 2022
Tracking code for the winner of track 1 in the MMP-Tracking Challenge at ICCV 2021 Workshop.

Tracking Code for the winner of track1 in MMP-Trakcing challenge This repository contains our tracking code for the Multi-camera Multiple People Track

DamoCV 29 Nov 13, 2022
Tracking Pipeline helps you to solve the tracking problem more easily

Tracking_Pipeline Tracking_Pipeline helps you to solve the tracking problem more easily I integrate detection algorithms like: Yolov5, Yolov4, YoloX,

VNOpenAI 32 Dec 21, 2022
Quadruped-command-tracking-controller - Quadruped command tracking controller (flat terrain)

Quadruped command tracking controller (flat terrain) Prepare Install RAISIM link

Yunho Kim 4 Oct 20, 2022
Python package for multiple object tracking research with focus on laboratory animals tracking.

motutils is a Python package for multiple object tracking research with focus on laboratory animals tracking. Features loads: MOTChallenge CSV, sleap

Matěj Šmíd 2 Sep 5, 2022
This is an official implementation of the CVPR2022 paper "Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots".

Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots Blind2Unblind Citing Blind2Unblind @inproceedings{wang2022blind2unblind, tit

demonsjin 58 Dec 6, 2022