STMTrack: Template-free Visual Tracking with Space-time Memory Networks

Zhihong Fu

Last update: Dec 21, 2022


Overview

STMTrack

This is the official implementation of the paper: STMTrack: Template-free Visual Tracking with Space-time Memory Networks.

Setup

Prepare Anaconda, CUDA and the corresponding toolkits. CUDA version required: 10.0+
Create a new conda environment and activate it.

conda create -n STMTrack python=3.7 -y
conda activate STMTrack

Install pytorch and torchvision.

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.0 -c pytorch
# pytorch v1.5.0, v1.6.0, or higher should also be OK.

Install other required packages.

pip install -r requirements.txt

Test

Prepare the datasets: OTB2015, VOT2018, UAV123, GOT-10k, TrackingNet, LaSOT, ILSVRC VID*, ILSVRC DET*, COCO*, and any other datasets you want to test. Set the paths as follows:

├── STMTrack
|   ├── ...
|   ├── ...
|   ├── datasets
|   |   ├── COCO -> /opt/data/COCO
|   |   ├── GOT-10k -> /opt/data/GOT-10k
|   |   ├── ILSVRC2015 -> /opt/data/ILSVRC2015
|   |   ├── LaSOT -> /opt/data/LaSOT/LaSOTBenchmark
|   |   ├── OTB
|   |   |   └── OTB2015 -> /opt/data/OTB2015
|   |   ├── TrackingNet -> /opt/data/TrackingNet
|   |   ├── UAV123 -> /opt/data/UAV123/UAV123
|   |   ├── VOT
|   |   |   ├── vot2018
|   |   |   |   ├── VOT2018 -> /opt/data/VOT2018
|   |   |   |   └── VOT2018.json

Notes

i. Star notation (*): these datasets are used only for training. You can ignore them if you just want to test the tracker.

ii. In this example, we create a soft link for every dataset. All datasets are actually stored under /opt/data/. Change the link targets to match your own storage locations.

iii. The VOT2018.json file can be downloaded from here.
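
The soft links in the tree above can also be created with a short script. This is a minimal sketch, assuming the datasets are stored under /opt/data/ as in the example; `DATASET_LINKS` and `link_datasets` are hypothetical names and are not part of this repository.

```python
import os
from pathlib import Path

# Link name (relative to the datasets directory) -> real storage location.
# The targets mirror the example tree above; adjust them to your machine.
DATASET_LINKS = {
    "COCO": "/opt/data/COCO",
    "GOT-10k": "/opt/data/GOT-10k",
    "ILSVRC2015": "/opt/data/ILSVRC2015",
    "LaSOT": "/opt/data/LaSOT/LaSOTBenchmark",
    "OTB/OTB2015": "/opt/data/OTB2015",
    "TrackingNet": "/opt/data/TrackingNet",
    "UAV123": "/opt/data/UAV123/UAV123",
    "VOT/vot2018/VOT2018": "/opt/data/VOT2018",
}


def link_datasets(datasets_dir, links=DATASET_LINKS):
    """Create one soft link per dataset and return the links that were created."""
    created = []
    for name, target in links.items():
        link = Path(datasets_dir) / name
        # Intermediate directories such as OTB/ or VOT/vot2018/ are real directories.
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink() or link.exists():
            continue  # leave existing links or directories untouched
        os.symlink(target, link)
        created.append(str(link))
    return created
```

Calling `link_datasets("datasets")` from the repository root would build the layout shown above. The VOT2018.json file still has to be placed next to the VOT2018 link by hand.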

Download the models we trained.

📎 GOT-10k model 📎 fulldata model
Use the path of the trained model to set the pretrain_model_path item in the configuration file correctly, then run the shell command.
Note that all paths we used here are relative, not absolute. See any configuration file in the experiments directory for examples and details.
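
Before launching a test, it can help to confirm that every pretrain_model_path entry points at an existing file. The sketch below assumes only that the configuration has been loaded into nested dicts and lists (for example with PyYAML's `yaml.safe_load`). Because the exact nesting of the key is not shown here, it searches the whole structure; `find_key` and `missing_model_paths` are hypothetical helper names.

```python
import os


def find_key(node, key, trail=()):
    """Yield (dotted_path, value) for every occurrence of `key` in nested dicts/lists."""
    if isinstance(node, dict):
        for name, value in node.items():
            here = trail + (str(name),)
            if name == key:
                yield ".".join(here), value
            else:
                yield from find_key(value, key, here)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_key(value, key, trail + (str(index),))


def missing_model_paths(cfg, repo_root="."):
    """Return the pretrain_model_path entries that do not resolve to a file.

    Paths are resolved relative to repo_root, because the configuration
    files use relative paths.
    """
    missing = []
    for where, value in find_key(cfg, "pretrain_model_path"):
        if not value:
            continue  # skip empty entries
        if not os.path.isfile(os.path.join(repo_root, value)):
            missing.append((where, value))
    return missing
```

An empty return value means every non-empty pretrain_model_path in the loaded configuration resolves to a file under the repository root.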

General command format

python main/test.py --config testing_dataset_config_file_path

Take GOT-10k as an example:

python main/test.py --config experiments/stmtrack/test/got10k/stmtrack-googlenet-got.yaml
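
To evaluate several benchmarks in a row, the command above can be wrapped in a small driver. This is a sketch only: `build_test_command` and `run_tests` are hypothetical helpers, and the only command-line flag used is the `--config` flag shown above.

```python
import subprocess
import sys
from pathlib import Path


def build_test_command(config_path, python=sys.executable):
    """Return the argv list for `python main/test.py --config <config_path>`."""
    return [python, "main/test.py", "--config", str(config_path)]


def run_tests(config_paths, repo_root="."):
    """Run main/test.py once per configuration file, from the repository root."""
    for config_path in config_paths:
        # Fail early with a clear message instead of deep inside the tester.
        if not (Path(repo_root) / config_path).is_file():
            raise FileNotFoundError(f"configuration file not found: {config_path}")
        subprocess.run(build_test_command(config_path), cwd=repo_root, check=True)
```

For example, `run_tests(["experiments/stmtrack/test/got10k/stmtrack-googlenet-got.yaml"])` runs the GOT-10k test and raises an error if the script exits with a non-zero status.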

Training

Prepare the datasets as described in the previous section.
Download the pretrained backbone model from here.
Run the shell command.

Training based on the GOT-10k benchmark

python main/train.py --config experiments/stmtrack/train/got10k/stmtrack-googlenet-trn.yaml

Training with full data

python main/train.py --config experiments/stmtrack/train/fulldata/stmtrack-googlenet-trn-fulldata.yaml

Testing Results

Click here to download all of the following results.

OTB2015
GOT-10k
LaSOT
TrackingNet
UAV123
TNL2K
- evaluated by @Xiao Wang.
- The results can be downloaded from Google Drive. See issue #2 for more details.

Acknowledgement

Repository

This repository is developed based on the single object tracking framework video_analyst. See it for more instructions and details.

References

@inproceedings{fu2021stmtrack,
  title={STMTrack: Template-free Visual Tracking with Space-time Memory Networks},
  author={Fu, Zhihong and Liu, Qingjie and Fu, Zehua and Wang, Yunhong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13774--13783},
  year={2021}
}

Contact

Zhihong Fu (@fzh0917)

If you have any questions, just create issues or email me 😄 .

Comments

How should I set the hyperparameters if I train on the full dataset with batch size <= 32?

Hello, author. I want to train this model on the full data, but I only have one RTX 3090 with 24 GB of memory. How should I set the hyperparameters to get the best result? Thank you.

opened by hekaijie123 7
Issues about the JSON files of the UAV123 dataset

Dear author, should I put the JSON file of the UAV123 dataset in the same directory as the uav123 Python file? Also, where are the JSON files for the UAV123 dataset? Could you share them? Thanks a lot!

opened by 20713 6
About receptive field

Thanks for your work again. I encountered a problem while using your network. The feature map output by the network neck is 17*17, so six 3*3 convolutional layers can still meet the receptive-field requirement. If the input image becomes larger, the feature map grows, and the tracking target itself is relatively large, is it possible to expand the receptive field by stacking more 3*3 convolutional layers? Or is there a better solution?
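
For reference, the receptive field of a stack of convolutions can be computed with the standard recurrence below. This is a general sketch, not a reply from the author: it assumes plain (non-dilated) convolutions, and the stride-1 setting in the examples is an assumption about the layers being discussed.

```python
def receptive_field(layers):
    """Receptive field (one side, in input cells) of a stack of conv layers.

    `layers` is a sequence of (kernel_size, stride) pairs, first layer first.
    """
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        # Each layer widens the field by (kernel_size - 1) steps of the current jump.
        rf += (kernel_size - 1) * jump
        # Striding multiplies the distance between neighbouring output cells.
        jump *= stride
    return rf


six_layers = receptive_field([(3, 1)] * 6)    # 1 + 6 * 2 = 13
eight_layers = receptive_field([(3, 1)] * 8)  # 1 + 8 * 2 = 17
```

Under these assumptions, each additional stride-1 3*3 layer widens the receptive field by two cells, so growth from stacking alone is linear in depth.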

opened by zzg-zzg 4
Issue about GOT-10k training

Hi, author. Thanks for your nice work. I ran offline training on the GOT-10k benchmark. Specifically, I set the number of samples per epoch to 150,000 and the mini-batch size to 15; the other settings are the same as in your code. The whole training phase takes about 1.5 days on one NVIDIA Tesla V100 GPU. However, the evaluation results look bad. Have you encountered this problem, or do you have any suggestions?

opened by DavidZhangdw 3
Issue about the GOT-10k dataset

Dear author, I have read your paper carefully, but I am a beginner in visual tracking. I want to test the pre-trained model, but when I run the command

python main/test.py --config experiments/stmtrack/test/got10k/stmtrack-googlenet-got.yaml

I get

Exception: Dataset /STMTrack/datasets/GOT-10k/test/list.txt not found or corrupted.

I downloaded the GOT-10k dataset from your link https://drive.google.com/file/d/17wQ9lvEa4jLhv72TZatw03EHUx9UgTcA/view?usp=sharing, and unzipping it produces many files. How can I solve this problem? Thanks.

opened by Zexuanxexuan 3
A typo in your paper

Thank you for your excellent work. I read your paper and found an obvious typo. In the last paragraph of Section 3.3, STM's citation number should be [37] and GraphMemVOS's should be [30]. I would also like to know whether splitting the extracted features into keys and values affects the detection of occluded objects. What should I do if I want the bbox to contain the occluded part?
good first issue

opened by zzg-zzg 2
When I test on VOT, it always has an error. Please help me

When I test on VOT, it works fine at first but fails after some tests. Here is the output from when it goes wrong.

2021-12-12 05:07:47.807 | INFO | videoanalyst.engine.tester.tester_impl.vot:track_single_video:296 - (59) Video: zebrafish1 Time: 2.5s Speed: 151.2fps Lost: 5
2%|█▍ | 1/60 [28:21<27:53:19, 1701.68s/it]
2021-12-12 05:07:47.808 | INFO | videoanalyst.engine.tester.tester_impl.vot:run_tracker:147 - Total Lost: 120
2021-12-12 05:07:47.808 | INFO | videoanalyst.engine.tester.tester_impl.vot:run_tracker:148 - Mean Speed: 176.03 FPS
loading VOT2018: 0%| | 0/4 [00:00<?, ?it/s, homepage]
Traceback (most recent call last):
  File "main/test.py", line 77, in <module>
    tester.test()
  File "/home/hkj/Desktop/video_analyst/videoanalyst/engine/tester/tester_impl/vot.py", line 89, in test
    test_result_dict = self.evaluation()
  File "/home/hkj/Desktop/video_analyst/videoanalyst/engine/tester/tester_impl/vot.py", line 191, in evaluation
    dataset = vot_benchmark.VOTDataset(
  File "/home/hkj/Desktop/video_analyst/videoanalyst/evaluation/vot_benchmark/pysot/datasets/vot.py", line 113, in __init__
    video, dataset_root, meta_data[video]['video_dir'],
TypeError: string indices must be integers
100%|███████████████████████████████████████████████████████████████████████████████████████| 60/60 [28:21<00:00, 28.36s/it]
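
The failing expression in the traceback is `meta_data[video]['video_dir']`, so the loader expects every entry of the loaded JSON to be a mapping with a `video_dir` key. The sketch below checks a JSON file against that expectation. It is not part of the repository, the helper names are hypothetical, and it only tests the one key visible in the traceback.

```python
import json


def find_bad_entries(meta_data):
    """Return the names of entries that are not mappings with a 'video_dir' key."""
    if not isinstance(meta_data, dict):
        raise TypeError("expected the JSON top level to be an object keyed by video name")
    return [
        video
        for video, info in meta_data.items()
        if not (isinstance(info, dict) and "video_dir" in info)
    ]


def check_vot_json(json_path):
    """Load a dataset JSON file and list the entries the loader would choke on."""
    with open(json_path, "r") as f:
        return find_bad_entries(json.load(f))
```

A non-empty result from `check_vot_json` on the VOT2018.json file in use would point to a file that does not have the shape the loader expects.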

opened by hekaijie123 1
Hi, can I add your WeChat to have a deeper talk on this work?

Hi, can I add you on WeChat to have a deeper talk about this work? I am currently extending this tracker into a dual-modality version. My WeChat ID is wangxiao5791509. Thanks.

opened by wangxiao5791509 1
KeyError: 'Non-existent config key: train.track.data.transformer.RandomCropTransformer.x_size'

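Hello, author. I ran your code and found that the x_size parameter is missing in some places (for example, in default_hyper_params of the RandomCropTransformer class; it is also missing in the DenseboxTarget class). This causes root_cfg.merge_from_file(exp_cfg_path) in the train function to fail. What value should x_size be set to in these places?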

opened by n1-k0 1

Visualization issue

Hi, thanks for sharing your code. I found a bug in the visualization code at lines 361 and 370: 'frame_idx' should be 'cur_frame_idx'.

    if self._hp_visualization:
        score1 = tensor_to_numpy(score[0])[:, 0]
        vsm.visualize(score1, self._hp_score_size, im_q_crop, self._state['cur_frame_idx'], 'raw_score')

opened by wangxiao5791509 1

About GOT-10k train and test

Thanks for your excellent work. I ran into a problem when trying to reproduce the results in the paper. When I used GOT-10k alone for training, the iou suppression remained at about 0.38 with no improvement. Is there something wrong with the configuration?

opened by wjc0602 6
About Pixel-wise

May I ask whether you compared the performance of pixel-wise correlation and depth-wise correlation in your experiments?

And how do their running times compare?

opened by jianjiandandande 1

