STMTrack: Template-free Visual Tracking with Space-time Memory Networks

Overview

STMTrack

This is the official implementation of the paper: STMTrack: Template-free Visual Tracking with Space-time Memory Networks.

Setup

  • Prepare Anaconda, CUDA and the corresponding toolkits. CUDA version required: 10.0+

  • Create a new conda environment and activate it.

conda create -n STMTrack python=3.7 -y
conda activate STMTrack

  • Install PyTorch and torchvision.

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.0 -c pytorch
# PyTorch v1.5.0, v1.6.0, or higher should also work.

  • Install other required packages.

pip install -r requirements.txt
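
After installation, a quick check can confirm that the interpreter sees PyTorch and a CUDA device. The following is a minimal sketch and not part of the repository: the function name check_environment is hypothetical, and it relies only on torch.__version__ and torch.cuda.is_available().

```python
import importlib.util
import sys


def check_environment(min_python=(3, 7)):
    """Collect basic facts about the current Python environment."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torchvision_installed": importlib.util.find_spec("torchvision") is not None,
        "torch_version": None,
        "cuda_available": None,
    }
    if report["torch_installed"]:
        # Imported lazily so the check also runs when PyTorch is absent.
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is reported as False, the installed cudatoolkit build and the GPU driver are worth checking before training or testing.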

Test

  • Prepare the datasets: OTB2015, VOT2018, UAV123, GOT-10k, TrackingNet, LaSOT, ILSVRC VID*, ILSVRC DET*, COCO*, and any other datasets you want to test. Set the paths as follows:
├── STMTrack
|   ├── ...
|   ├── ...
|   ├── datasets
|   |   ├── COCO -> /opt/data/COCO
|   |   ├── GOT-10k -> /opt/data/GOT-10k
|   |   ├── ILSVRC2015 -> /opt/data/ILSVRC2015
|   |   ├── LaSOT -> /opt/data/LaSOT/LaSOTBenchmark
|   |   ├── OTB
|   |   |   └── OTB2015 -> /opt/data/OTB2015
|   |   ├── TrackingNet -> /opt/data/TrackingNet
|   |   ├── UAV123 -> /opt/data/UAV123/UAV123
|   |   ├── VOT
|   |   |   ├── vot2018
|   |   |   |   ├── VOT2018 -> /opt/data/VOT2018
|   |   |   |   └── VOT2018.json
  • Notes

i. Star notation (*): these datasets are used only for training. You can ignore them if you just want to test the tracker.

ii. In this example, we create soft links for every dataset. All datasets are actually stored under /opt/data/. Change the paths to suit your setup.

iii. The VOT2018.json file can be downloaded from here.

  • Download the models we trained.

    📎 GOT-10k model 📎 fulldata model

  • Use the path of the trained model to set the pretrain_model_path item in the configuration file correctly, then run the shell command.

  • Note that all paths we used here are relative, not absolute. See any configuration file in the experiments directory for examples and details.
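
The soft links shown in the directory tree above could be created with a short script along these lines. This is a sketch and not part of the repository: link_datasets and DATASET_LINKS are hypothetical names, and the mapping simply mirrors the example tree, so adjust it to wherever your data is stored.

```python
import os
from pathlib import Path

# Link name inside datasets/ -> location under the storage root,
# copied from the example directory tree (an assumption about your layout).
DATASET_LINKS = {
    "COCO": "COCO",
    "GOT-10k": "GOT-10k",
    "ILSVRC2015": "ILSVRC2015",
    "LaSOT": "LaSOT/LaSOTBenchmark",
    "OTB/OTB2015": "OTB2015",
    "TrackingNet": "TrackingNet",
    "UAV123": "UAV123/UAV123",
    "VOT/vot2018/VOT2018": "VOT2018",
}


def link_datasets(repo_root, storage_root, links=DATASET_LINKS):
    """Create the soft links; return the names whose storage directory is missing."""
    missing = []
    for name, target in links.items():
        source = Path(storage_root) / target
        link = Path(repo_root) / "datasets" / name
        if not source.exists():
            missing.append(name)
            continue
        link.parent.mkdir(parents=True, exist_ok=True)
        # Skip links that already exist, including dangling ones.
        if not (link.exists() or link.is_symlink()):
            os.symlink(source, link, target_is_directory=True)
    return missing
```

Calling link_datasets(".", "/opt/data") from the repository root returns the dataset names that were not found under the storage root, which is a convenient list of what still needs to be downloaded. The VOT2018.json file is not handled here and still has to be placed next to the VOT2018 link.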

General command format

python main/test.py --config testing_dataset_config_file_path

Take GOT-10k as an example:

python main/test.py --config experiments/stmtrack/test/got10k/stmtrack-googlenet-got.yaml
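
Because the paths in the configuration files are relative, a missing model file is easy to overlook until the run fails. The sketch below looks up every pretrain_model_path entry in an already loaded configuration and reports the ones that do not resolve to a file. It assumes the YAML file has been parsed into nested dicts and lists (for example with PyYAML, not shown here); find_values and missing_model_files are hypothetical helper names, and no particular nesting of the configuration is assumed.

```python
from pathlib import Path


def find_values(cfg, key):
    """Yield every value stored under `key` anywhere in nested dicts/lists."""
    if isinstance(cfg, dict):
        for k, v in cfg.items():
            if k == key:
                yield v
            else:
                yield from find_values(v, key)
    elif isinstance(cfg, (list, tuple)):
        for item in cfg:
            yield from find_values(item, key)


def missing_model_files(cfg, repo_root="."):
    """Return pretrain_model_path entries that do not resolve to a file.

    Paths are resolved relative to repo_root; empty entries are skipped.
    """
    root = Path(repo_root)
    return [
        p for p in find_values(cfg, "pretrain_model_path")
        if p and not (root / p).is_file()
    ]
```

An empty result means every referenced model file exists relative to the repository root, which is where the test command is expected to be run from.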

Training

  • Prepare the datasets as described in the Test section above.
  • Download the pretrained backbone model from here.
  • Run the shell command.

Training on the GOT-10k benchmark:

python main/train.py --config experiments/stmtrack/train/got10k/stmtrack-googlenet-trn.yaml

Training with full data:

python main/train.py --config experiments/stmtrack/train/fulldata/stmtrack-googlenet-trn-fulldata.yaml
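
When several configurations are run in sequence, it can help to wrap the commands above in a small launcher that fails early if a script or configuration file is missing. This is only a sketch: build_command and run are hypothetical helpers, and the only command-line option used is the --config flag shown above.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script, config, repo_root="."):
    """Return the argv list for `python <script> --config <config>`.

    Both paths are relative to repo_root and must exist.
    """
    root = Path(repo_root)
    for rel in (script, config):
        if not (root / rel).is_file():
            raise FileNotFoundError(f"{rel} not found under {root.resolve()}")
    return [sys.executable, script, "--config", config]


def run(script, config, repo_root="."):
    """Run the command from the repository root and raise if it fails."""
    subprocess.run(build_command(script, config, repo_root), cwd=repo_root, check=True)
```

For example, run("main/train.py", "experiments/stmtrack/train/got10k/stmtrack-googlenet-trn.yaml") starts the GOT-10k training when called from the repository root.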

Testing Results

Click here to download all the testing results.

Acknowledgement

Repository

This repository is developed based on the single object tracking framework video_analyst. See it for more instructions and details.

References

@inproceedings{fu2021stmtrack,
  title={STMTrack: Template-free Visual Tracking with Space-time Memory Networks},
  author={Fu, Zhihong and Liu, Qingjie and Fu, Zehua and Wang, Yunhong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13774--13783},
  year={2021}
}

Contact

If you have any questions, just open an issue or email me 😄.

Comments
  • If I train on the full dataset with batch size <= 32, how should I set the hyperparameters?

    Hello, author. I want to train this model on the full data, but I only have one RTX 3090, which has 24 GB of memory. How should I set the hyperparameters to get the best result? Thank you.

    opened by hekaijie123 7
  • Issue about the JSON files of the UAV123 dataset

    Dear author, should I put the JSON file of the UAV123 dataset in the same directory as the uav123 Python file? Also, where are the JSON files for the UAV123 dataset? Could you share them? Thanks a lot!

    opened by 20713 6
  • About receptive field

    Thanks again for your work. I ran into a problem while using your network. The feature map output by the network neck is 17*17, so six 3*3 convolutional layers can still meet the receptive field requirement. If the input image is larger, so that the feature map grows, and the tracked target itself is relatively large, is it possible to expand the receptive field by stacking more 3*3 convolutional layers? Or is there a better solution?

    opened by zzg-zzg 4
  • Issue about GOT-10k training

    Hi, author. Thanks for your nice work. I ran offline training on the GOT-10k benchmark. Specifically, I set the number of samples per epoch to 150,000 and the mini-batch size to 15; all other settings are the same as in your code. The whole training phase takes about 1.5 days on one NVIDIA Tesla V100 GPU. However, the evaluation results look bad. Have you encountered this problem, or do you have any suggestions?

    opened by DavidZhangdw 3
  • Issue about the GOT-10k dataset

    Dear author, I have read your paper carefully, but I am a beginner in visual tracking. I want to test the pre-trained model, but when I run the command

    python main/test.py --config experiments/stmtrack/test/got10k/stmtrack-googlenet-got.yaml

    I get

    Exception: Dataset /STMTrack/datasets/GOT-10k/test/list.txt not found or corrupted.

    However, I downloaded the GOT-10k dataset from your link https://drive.google.com/file/d/17wQ9lvEa4jLhv72TZatw03EHUx9UgTcA/view?usp=sharing, and unzipping it produces many files. How can I solve this problem? Thanks.

    opened by Zexuanxexuan 3
  • A typo in your paper

    Thank you for your excellent work. I read your paper and found an obvious typo. In the last paragraph of Section 3.3, the citation number for STM should be [37] and the one for GraphMemVOS should be [30]. Another question: does dividing the extracted features into keys and values affect the detection of occluded objects? What should I do if I want the bbox to contain the occluded part?

    good first issue 
    opened by zzg-zzg 2
  • When I test on VOT, it always fails with an error. Please help me

    When I test on VOT, it works fine at first but fails after some tests. Here is the output from when it goes wrong.

    2021-12-12 05:07:47.807 | INFO | videoanalyst.engine.tester.tester_impl.vot:track_single_video:296 - (59) Video: zebrafish1 Time: 2.5s Speed: 151.2fps Lost: 5
    2%|█▍ | 1/60 [28:21<27:53:19, 1701.68s/it]
    2021-12-12 05:07:47.808 | INFO | videoanalyst.engine.tester.tester_impl.vot:run_tracker:147 - Total Lost: 120
    2021-12-12 05:07:47.808 | INFO | videoanalyst.engine.tester.tester_impl.vot:run_tracker:148 - Mean Speed: 176.03 FPS
    loading VOT2018: 0%| | 0/4 [00:00<?, ?it/s, homepage]
    Traceback (most recent call last): | 0/4 [00:00<?, ?it/s, homepage]
      File "main/test.py", line 77, in <module>
        tester.test()
      File "/home/hkj/Desktop/video_analyst/videoanalyst/engine/tester/tester_impl/vot.py", line 89, in test
        test_result_dict = self.evaluation()
      File "/home/hkj/Desktop/video_analyst/videoanalyst/engine/tester/tester_impl/vot.py", line 191, in evaluation
        dataset = vot_benchmark.VOTDataset(
      File "/home/hkj/Desktop/video_analyst/videoanalyst/evaluation/vot_benchmark/pysot/datasets/vot.py", line 113, in __init__
        video, dataset_root, meta_data[video]['video_dir'],
    TypeError: string indices must be integers
    100%|███████████████████████████████████████████████████████████████████████████████████████| 60/60 [28:21<00:00, 28.36s/it]

    opened by hekaijie123 1
  • Hi, can I add your WeChat to have a deeper talk on this work?

    Hi, can I add your WeChat to have a deeper talk on this work? Currently, I am extending this tracker into a dual-modality version. My WeChat is wangxiao5791509. Thanks.

    opened by wangxiao5791509 1
  • KeyError: 'Non-existent config key: train.track.data.transformer.RandomCropTransformer.x_size'

    Hello author, I ran your code and found that the x_size parameter is missing in some places (for example, in default_hyper_params of the RandomCropTransformer class, and likewise in the DenseboxTarget class). This causes root_cfg.merge_from_file(exp_cfg_path) in the train function to raise an error. What value should x_size be set to in these places?

    opened by n1-k0 1
  • Visualization issue

    Hi, thanks for sharing your code. I found a bug in the visualization code: on line 361 and line 370, 'frame_idx' should be 'cur_frame_idx'.

        if self._hp_visualization:
            score1 = tensor_to_numpy(score[0])[:, 0]
            vsm.visualize(score1, self._hp_score_size, im_q_crop, self._state['cur_frame_idx'], 'raw_score')
    
    opened by wangxiao5791509 1
  • About GOT-10k training and testing

    Thanks for your excellent work. I ran into a problem when trying to reproduce the results in the paper. When I used GOT-10k alone for training, the iou suppression remained at about 0.38 and did not improve. I wonder if there is something wrong with the configuration?

    opened by wjc0602 6
  • About pixel-wise correlation

    May I ask whether you compared the performance of pixel-wise correlation and depth-wise correlation in your experiments?

    And how do their running times compare?

    opened by jianjiandandande 1
Owner
Zhihong Fu
Keep thinking, doing, reading and fighting.