HiFT: Hierarchical Feature Transformer for Aerial Tracking (ICCV2021)

Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, and Yiming Li

Our paper has been accepted by ICCV 2021.

Abstract

Most existing Siamese-based tracking methods execute the classification and regression of the target object based on similarity maps. However, they either employ a single map from the last convolutional layer, which degrades the localization accuracy in complex scenarios, or separately use multiple maps for decision making, which introduces intractable computations for aerial mobile platforms. Thus, in this work, we propose an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking. Hierarchical similarity maps generated by multi-level convolutional layers are fed into the feature transformer to achieve the interactive fusion of spatial cues (shallow layers) and semantic cues (deep layers). Consequently, not only can the global contextual information be raised, facilitating the target search, but our end-to-end architecture with the transformer can also efficiently learn the interdependencies among multi-level features, thereby discovering a tracking-tailored feature space with strong discriminability. Comprehensive evaluations on four aerial benchmarks have proven the effectiveness of HiFT. Real-world tests on the aerial platform have strongly validated its practicability with a real-time speed.
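To make the idea of fusing shallow and deep similarity maps more concrete, here is a minimal, dependency-free sketch of generic scaled dot-product attention. It is not the HiFT architecture: the toy values, the function names, and the choice to use the shallow map as queries and the deep map as keys and values are all assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query row attends over all key rows."""
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity between this query position and every key position.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value rows, channel by channel.
        fused.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return fused


# Hypothetical flattened maps: 3 spatial positions with 2 channels each.
shallow_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stands in for spatial cues
deep_map = [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]     # stands in for semantic cues

# Assumption: shallow positions query the deep map, so every output position
# mixes information from all deep positions (a global context).
fused_map = cross_attention(shallow_map, deep_map, deep_map)
```

Each row of fused_map is a convex combination of the deep-map rows, which is the sense in which attention lets every position draw on information from the whole map.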

This figure shows the workflow of our tracker.

About Code

1. Environment setup

This code has been tested on Ubuntu 18.04, Python 3.8.3, PyTorch 0.7.0/1.6.0, CUDA 10.2. Please install the required libraries before running this code:

pip install -r requirements.txt
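After installation, it can help to print the versions that were actually installed and compare them with the tested ones listed above. The following is a small sketch using only the standard library; the package names torch and torchvision are assumptions and may not match the contents of requirements.txt.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("torch", "torchvision")):
    """Collect the Python version and the versions of the given packages."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```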

2. Test

Download the pretrained model general_model (code: c99t) and put it into the tools/snapshot directory.

Download the testing datasets and put them into the test_dataset directory. If you want to test the tracker on a new dataset, please refer to pysot-toolkit to set up test_dataset.

# --dataset: dataset_name, --snapshot: tracker_name
python test.py \
	--dataset UAV10fps \
	--snapshot snapshot/general_model.pth

The testing result will be saved in the results/dataset_name/tracker_name directory.
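The results/dataset_name/tracker_name layout can be sketched as a small path helper. This assumes that tracker_name is the snapshot file name without its extension, which is consistent with the general_model prefix used in the evaluation step but is not confirmed by the text; result_dir is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def result_dir(dataset, snapshot, root="results"):
    """Build root/dataset_name/tracker_name from a dataset name and snapshot path."""
    # Assumption: the tracker name is the snapshot file name without ".pth".
    tracker_name = Path(snapshot).stem
    return Path(root) / dataset / tracker_name


print(result_dir("UAV10fps", "snapshot/general_model.pth").as_posix())
```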

3. Train

Prepare training datasets

Download the datasets:

VID
YOUTUBEBB (code: t7j8)
COCO
GOT-10K

Note: train_dataset/dataset_name/readme.md lists the detailed steps for generating the training datasets.
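Before generating the training data, a quick check can confirm that each dataset folder and its readme.md are in place. This is only a sketch: the folder names in DATASET_DIRS are hypothetical placeholders and should be replaced with the names that actually exist under train_dataset.

```python
from pathlib import Path

# Hypothetical folder names; replace with the real ones under train_dataset/.
DATASET_DIRS = ("vid", "yt_bb", "coco", "got10k")


def missing_readmes(root="train_dataset", names=DATASET_DIRS):
    """Return the dataset folders that have no readme.md under root."""
    root = Path(root)
    return [name for name in names if not (root / name / "readme.md").is_file()]


if __name__ == "__main__":
    for name in missing_readmes():
        print(f"missing: train_dataset/{name}/readme.md")
```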

Train a model

To train the HiFT model, run train.py with the desired configs:

cd tools
python train.py

4. Evaluation

We provide the tracking results (code: tj12) of UAV123@10fps, DTB70, UAV20L, and UAV123. If you want to evaluate the tracker, please put those results into the results directory.

# --tracker_path: result path, --dataset: dataset_name, --tracker_prefix: tracker_name
python eval.py \
	--tracker_path ./results \
	--dataset UAV20 \
	--tracker_prefix 'general_model'
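Tracking benchmarks of this kind are commonly scored with an overlap-based success rate and a center-distance precision. The sketch below illustrates both metrics for boxes given as (x, y, w, h); it is not the code in eval.py or pysot-toolkit, the box format is an assumption, and the function names are hypothetical. Toolkits typically sweep the threshold and summarize the resulting curve rather than reporting a single value.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(box_a, box_b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    return math.hypot((ax + aw / 2) - (bx + bw / 2), (ay + ah / 2) - (by + bh / 2))


def success_rate(predicted, ground_truth, threshold=0.5):
    """Fraction of frames whose overlap exceeds the threshold."""
    overlaps = [iou(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(o > threshold for o in overlaps) / len(overlaps)


def precision(predicted, ground_truth, threshold=20.0):
    """Fraction of frames whose center error is within the pixel threshold."""
    errors = [center_error(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(e <= threshold for e in errors) / len(errors)


# Toy two-frame sequence: the first frame is tracked perfectly, the second is lost.
predicted = [(0, 0, 10, 10), (0, 0, 10, 10)]
ground_truth = [(0, 0, 10, 10), (100, 100, 10, 10)]
print(success_rate(predicted, ground_truth), precision(predicted, ground_truth))
```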

5. Contact

If you have any questions, please contact me.

Ziang Cao

Email: 1753419@tongji.edu.cn

Qualitative Evaluation

Performance Comparison

Result on DTB70 and UAV20L

For more evaluations, please refer to our paper.

References

@INPROCEEDINGS{cao2021iccv,       
	author={Cao, Ziang and Fu, Changhong and Ye, Junjie and Li, Bowen and Li, Yiming},   
	booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)}, 
	title={{HiFT: Hierarchical Feature Transformer for Aerial Tracking}},
	year={2021},
	volume={},
	number={},
	pages={1-10}
}

Acknowledgement

The code is implemented based on pysot. We would like to express our sincere thanks to the contributors.

Comments

Could you please provide the result files of comparison trackers?

We notice that only the result files of your proposed tracker are uploaded for UAV123 and DTB70. We would greatly appreciate it if you could provide the result files of the other comparison trackers in your paper, which would make it much more convenient for us to cite your work in our paper.

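opened by bit-bcilab

Several code questions: encoder and label

1. I feel that self.gamma*weight*x here should not need the extra multiplication by x; otherwise it differs from the paper. This is the Cattention module, i.e., the modulation layer in the paper: https://github.com/vision4robotics/HiFT/blob/7f560e9ca1506f4b275f73e8c9ca4d34bec945ce/pysot/models/utile/tran.py#L32
2. Is this label generation method the same as the one used in SiamAPN?

opened by laisimiao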
