Learning Spatio-Temporal Transformer for Visual Tracking


STARK

The official implementation of the paper "Learning Spatio-Temporal Transformer for Visual Tracking".

Hiring research interns for visual transformer projects: [email protected]

[Figure: STARK framework]

Highlights

End-to-End, Post-processing Free

STARK is an end-to-end tracking approach that directly predicts one accurate bounding box as the tracking result.
In addition, STARK does not use any hyperparameter-sensitive post-processing, which leads to stable performance.

Real-Time Speed

STARK-ST50 and STARK-ST101 run at 40 FPS and 30 FPS, respectively, on a Tesla V100 GPU.

Strong performance

Tracker      LaSOT (AUC)  GOT-10K (AO)  TrackingNet (AUC)
STARK        67.1         68.8          82.0
TransT       64.9         67.1          81.4
TrDiMP       63.7         67.1          78.4
Siam R-CNN   64.8         64.9          81.2
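
To make the comparison concrete, the short sketch below computes STARK's margin over the best competing tracker on each benchmark. The scores are copied from the table; the dictionary name and the helper function are only illustrative and are not part of the STARK code base.

```python
# Scores copied from the results table; names here are illustrative only.
SCORES = {
    "STARK": {"LaSOT": 67.1, "GOT-10K": 68.8, "TrackingNet": 82.0},
    "TransT": {"LaSOT": 64.9, "GOT-10K": 67.1, "TrackingNet": 81.4},
    "TrDiMP": {"LaSOT": 63.7, "GOT-10K": 67.1, "TrackingNet": 78.4},
    "Siam R-CNN": {"LaSOT": 64.8, "GOT-10K": 64.9, "TrackingNet": 81.2},
}


def margin_over_best_baseline(scores, ours="STARK"):
    """For each benchmark, return our score minus the best score among the other trackers."""
    margins = {}
    for bench, value in scores[ours].items():
        best_other = max(s[bench] for name, s in scores.items() if name != ours)
        margins[bench] = round(value - best_other, 1)
    return margins


if __name__ == "__main__":
    for bench, gain in margin_over_best_baseline(SCORES).items():
        print(f"{bench}: +{gain}")
```

On these numbers the margin is 2.2 points on LaSOT, 1.7 on GOT-10K, and 0.6 on TrackingNet.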

Purely PyTorch-based Code

STARK is implemented purely in PyTorch.

Install the environment

Option 1: Use Anaconda

conda create -n stark python=3.6
conda activate stark
bash install.sh

Option 2: Use the Docker file

We provide the complete Docker file here

Data Preparation

Put the tracking datasets in ./data. It should look like:

${STARK_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST
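
Before training, it can help to confirm that the folders match the layout above. The following is a minimal sketch of such a check, not a script shipped with STARK. The function name and the `EXPECTED_LAYOUT` table are hypothetical, and only the sub-directories shown in the listing are checked.

```python
from pathlib import Path

# Sub-directories named in the layout above; entries hidden behind "..." are not checked.
EXPECTED_LAYOUT = {
    "lasot": ["airplane", "basketball", "bear"],
    "got10k": ["test", "train", "val"],
    "coco": ["annotations", "images"],
    "trackingnet": ["TRAIN_0", "TRAIN_1", "TRAIN_11", "TEST"],
}


def find_missing_dirs(data_root, layout=EXPECTED_LAYOUT):
    """Return the relative paths under data_root that are not existing directories."""
    root = Path(data_root)
    missing = []
    for dataset, subdirs in layout.items():
        for sub in subdirs:
            if not (root / dataset / sub).is_dir():
                missing.append(f"{dataset}/{sub}")
    return missing


if __name__ == "__main__":
    for path in find_missing_dirs("./data"):
        print("missing:", path)
```

An empty result only means the listed folders exist; it does not verify their contents.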

Run the following command to set paths for this project

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .

After running this command, you can also modify paths by editing these two files

lib/train/admin/local.py  # paths about training
lib/test/evaluation/local.py  # paths about testing

Train STARK

Training with multiple GPUs using DDP

# STARK-S50
python tracking/train.py --script stark_s --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-S50
# STARK-ST50
python tracking/train.py --script stark_st1 --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST50 Stage1
python tracking/train.py --script stark_st2 --config baseline --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline  # STARK-ST50 Stage2
# STARK-ST101
python tracking/train.py --script stark_st1 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST101 Stage1
python tracking/train.py --script stark_st2 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline_R101  # STARK-ST101 Stage2
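
The STARK-ST commands above follow a fixed pattern: Stage1 runs stark_st1, and Stage2 runs stark_st2 while pointing back at Stage1 through --script_prv and --config_prv. The sketch below assembles those two commands for a given config. The helper names are hypothetical, and only flags that appear in the commands above are used.

```python
def build_train_cmd(script, config, nproc=8, save_dir=".", prev=None):
    """Assemble one tracking/train.py invocation as an argument list.

    prev is an optional (script, config) pair naming the earlier stage,
    passed through --script_prv / --config_prv as in the Stage2 commands.
    """
    cmd = ["python", "tracking/train.py",
           "--script", script, "--config", config,
           "--save_dir", save_dir,
           "--mode", "multiple", "--nproc_per_node", str(nproc)]
    if prev is not None:
        cmd += ["--script_prv", prev[0], "--config_prv", prev[1]]
    return cmd


def two_stage_cmds(config, nproc=8):
    """Return the Stage1 and Stage2 commands for a STARK-ST model with the given config."""
    stage1 = build_train_cmd("stark_st1", config, nproc)
    stage2 = build_train_cmd("stark_st2", config, nproc, prev=("stark_st1", config))
    return [stage1, stage2]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run(cmd, check=True) to launch them in order.
    for cmd in two_stage_cmds("baseline_R101"):
        print(" ".join(cmd))
```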

(Optionally) Debugging training with a single GPU

python tracking/train.py --script stark_s --config baseline --save_dir . --mode single

Test and evaluate STARK on benchmarks

  • LaSOT
python tracking/test.py stark_st baseline --dataset lasot --threads 32
python tracking/analysis_results.py # need to modify tracker configs and names
  • GOT10K-test
python tracking/test.py stark_st baseline_got10k_only --dataset got10k_test --threads 32
python lib/test/utils/transform_got10k.py --tracker_name stark_st --cfg_name baseline_got10k_only
  • TrackingNet
python tracking/test.py stark_st baseline --dataset trackingnet --threads 32
python lib/test/utils/transform_trackingnet.py --tracker_name stark_st --cfg_name baseline
  • VOT2020
    Before evaluating "STARK+AR" on VOT2020, please install some extra packages following external/AR/README.md
cd external/vot20/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh
  • VOT2020-LT
cd external/vot20_lt/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh
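
The export PYTHONPATH lines above put the STARK project root on Python's module search path so that the lib package can be imported from the VOT workspace. As a rough in-process alternative, the same effect can be sketched with sys.path. The helper name below is hypothetical and is not part of STARK.

```python
import sys
from pathlib import Path


def add_project_root(project_root):
    """Put project_root at the front of sys.path so that `import lib...` can resolve."""
    root = str(Path(project_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


if __name__ == "__main__":
    # Replace "." with the path to the STARK project before importing from lib.
    print(add_project_root("."))
```

Unlike the export, this only affects the current Python process, so it has to run before the first import from lib.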

Test FLOPs, Params, and Speed

# Profiling STARK-S50 model
python tracking/profile_model.py --script stark_s --config baseline
# Profiling STARK-ST50 model
python tracking/profile_model.py --script stark_st2 --config baseline
# Profiling STARK-ST101 model
python tracking/profile_model.py --script stark_st2 --config baseline_R101

Model Zoo

The trained models, the training logs, and the raw tracking results are provided in the model zoo.

Acknowledgments

Comments
  • Dataloader crashes randomly

    Hi.

    I found that the training process randomly crashes with RuntimeError: DataLoader worker (pid(s) 36469) exited unexpectedly. Is that normal?

    I use the following training command.

    python tracking/train.py --script stark_s --config baseline_got10k_only --save_dir . --mode multiple --nproc_per_node 8
    

    thanks!

    opened by memoiry 7
  • A problem about loading checkpoint

    When I train the 'st' model, I found that 'net_type' is 'STARKS', but checkpoint_dict['net_type'] is 'LittleBoy_clean_corner', so the check assert net_type == checkpoint_dict['net_type'], 'Network is not of correct type.' always fails.

    How can I solve this problem? Thanks!

    opened by 1071189147 5
  • cuda10.2 and 3060 do not match

    run: python tracking/video_demo.py stark_s baseline test_video/demo.mp4

    cuda10.2:

    NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
    The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
    If you want to use the NVIDIA GeForce RTX 3060 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
      warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
    

    cuda11.0:

    WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
    Traceback (most recent call last):
      File "tracking/../lib/train/admin/tensorboard.py", line 4, in <module>
        from torch.utils.tensorboard import SummaryWriter
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
        _load_global_deps()
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
        ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: /home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "tracking/video_demo.py", line 9, in <module>
        from lib.test.evaluation import Tracker
      File "tracking/../lib/test/evaluation/__init__.py", line 1, in <module>
        from .data import Sequence
      File "tracking/../lib/test/evaluation/data.py", line 3, in <module>
        from lib.train.data.image_loader import imread_indexed
      File "tracking/../lib/train/__init__.py", line 1, in <module>
        from .admin.multigpu import MultiGPU
      File "tracking/../lib/train/admin/__init__.py", line 3, in <module>
        from .tensorboard import TensorboardWriter
      File "tracking/../lib/train/admin/tensorboard.py", line 7, in <module>
        from tensorboardX import SummaryWriter
    ModuleNotFoundError: No module named 'tensorboardX'
    

    But when I installed tensorboardX:

    WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
    Traceback (most recent call last):
      File "tracking/../lib/train/admin/tensorboard.py", line 4, in <module>
        from torch.utils.tensorboard import SummaryWriter
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
        _load_global_deps()
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
        ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: /home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "tracking/video_demo.py", line 9, in <module>
        from lib.test.evaluation import Tracker
      File "tracking/../lib/test/evaluation/__init__.py", line 1, in <module>
        from .data import Sequence
      File "tracking/../lib/test/evaluation/data.py", line 3, in <module>
        from lib.train.data.image_loader import imread_indexed
      File "tracking/../lib/train/__init__.py", line 1, in <module>
        from .admin.multigpu import MultiGPU
      File "tracking/../lib/train/admin/__init__.py", line 3, in <module>
        from .tensorboard import TensorboardWriter
      File "tracking/../lib/train/admin/tensorboard.py", line 7, in <module>
        from tensorboardX import SummaryWriter
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/tensorboardX/__init__.py", line 5, in <module>
        from .torchvis import TorchVis
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/tensorboardX/torchvis.py", line 11, in <module>
        from .writer import SummaryWriter
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/tensorboardX/writer.py", line 34, in <module>
        import torch
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
        _load_global_deps()
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
        ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
      File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: /home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
    
    opened by Richard-mei 4
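
The first warning in this report lists the architectures the installed PyTorch build was compiled for, none of which is in the sm_8x family. The sketch below illustrates that mismatch under a simplified rule: a device counts as covered only when the list contains an sm_XY entry with the same major version and a minor version no higher than the device's. It ignores PTX (compute_*) entries, and the function names are hypothetical.

```python
import re


def supported_sm(arch_list):
    """Extract (major, minor) pairs from entries such as 'sm_75'; 'compute_*' entries are skipped."""
    pairs = []
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d)(\d)", arch)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def has_native_kernels(capability, arch_list):
    """True if some sm_XY entry shares the device's major version with minor <= the device's."""
    major, minor = capability
    return any(m == major and n <= minor for m, n in supported_sm(arch_list))


if __name__ == "__main__":
    # Values copied from the warning above: an sm_86 device against the build's architecture list.
    build_archs = "sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37".split()
    print(has_native_kernels((8, 6), build_archs))  # prints False
```

With PyTorch installed, the two inputs can be obtained from torch.cuda.get_device_capability() and torch.cuda.get_arch_list().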
  • About GOT-10k test set results

    Hi, thanks for your wonderful work. I noticed that Transformer Tracking uses the model trained on all datasets (LaSOT, GOT10K, COCO, TrackingNet) to obtain its evaluation result on the GOT-10k test set, and that result is much better than the one from the model trained on GOT10K only.

    However, when I use the STARK-S50 pre-trained model (trained on all datasets) from your model zoo to evaluate on the GOT-10k test set, the AO is 0.688, which is only a small improvement over 0.672.

    I am confused by this. Have you ever evaluated the model trained on all datasets on the GOT-10k test set? Or could you kindly explain why training on all datasets yields so little performance gain?

    opened by botaoye 4
  • How to analyze the model on the GOT10k-val dataset?

    Thanks for your work! I trained the model and want to evaluate it on the GOT10k-val dataset to see its performance, but I only see the 'LaSOT', 'otb', 'nfs', 'uav', and 'tc128ce' datasets. How can I evaluate on GOT10k-val? By the way, what is the difference between the analysis_results and analysis_results_ITP files?

    opened by 3bobo 3
  • Training process not utilizing a dynamically updated template

    It seems that STARK doesn't mention anything about a dynamically updated template (DUT for short) during the training procedure. Is this a deliberate design, or am I missing something?

    I reckon that the DUT is actually something like a short-term memory, and it should not be treated equally as a normal template from the first frame by the transformer, so the DUT should be explicitly included in training. However, this is not how STARK has been implemented.

    So I'm curious what's the intuition or reasoning behind STARK's current training protocol of dismissing the DUT?

    opened by luowyang 3
  • Why not use sequential input for the data?

    Hi, thanks for your work. I see in your code that "shuffle = True" is set when creating the dataloaders. If the input data is not sequential, how is the template updated every 200 frames? Thanks!

    opened by ANdong-star 3
  • ModuleNotFoundError: No module named 'lib'

    Hi,

    I ran the VOT experiment (Python version) several times, but I still get this error: from lib.test.vot20.stark_vot20 import run_vot_exp ModuleNotFoundError: No module named 'lib'

    It seems that the STARK project path is not being found, even though I exported it with: export PYTHONPATH=/home/xxxx/projects/transformer/Stark-main:$PYTHONPATH.

    I hope one of your answers can help me solve it.

    Thanks!

    opened by zhanglichao 2
  • The meaning of "lmdb" in "self.lasot_lmdb_dir"

    Hi! Could you please tell me the meaning of "lmdb" in "self.lasot_lmdb_dir" of class EnvironmentSettings? I guess it is the directory of the LaSOT validation set?

    opened by ANdong-star 2
  • Where is the definition of a parameter in your code?

    Hi, thanks for your work! One parameter confuses me. I can't find the definition of params in class STARK_ST. Could you please tell me where it is? Thanks!

    opened by ANdong-star 2
  • Effect of template choice on transformer

    Thanks for sharing! I have some questions about the choice of template. According to the paper, you crop a region 2^2 times the size of the ground truth bounding box, rather than just the actual target bounding box resized to a square image. My questions are:

    1. Is the purpose to include more surrounding information? If so, what would be the optimal template size? Also, a factor of 2 would not always include the whole tracked object if its aspect ratio is high.
    2. Since the bounding box is not specified exactly, I assume the transformer has to learn some segmentation capability? For instance, I noticed that if you change the template crop size slightly at inference time (with the output size unchanged), the model performs very poorly. So it seems that some information sensitive to absolute position is learned in this setting. Would passing the exact coordinates into the transformer help in any way?
    opened by waterknows 2
  • How to create the data folder with all the datasets in it?

    Hi guys, according to the README, I should create a folder called data directly under the STARK root folder. This folder should contain several datasets: lasot, got10k, coco, and trackingnet.

    How can I add all these datasets to that data folder?

    opened by salcanmor 0
  • Where to download STARKST_ep0500.pth.tar ?

    Hi guys, I'm trying to run this tracker but it is throwing the error:

    FileNotFoundError: [Errno 2] No such file or directory: '/home/salva/submit_STARK_LT-code/checkpoints/train/stark_ref/baseline/STARKST_ep0500.pth.tar'

    I cannot find STARKST_ep0500.pth.tar in the links provided in the modelzoo, so, how can I solve this error?

    Thanks in advance.

    opened by salcanmor 2
  • The checkpoint link for stark-st1 has expired

    Hi! The link to the stark-st1 checkpoint file has expired (https://drive.google.com/file/d/1HswUW0oHKjiTL9xR7d2WNW9QOLE040vS/view?usp=sharing). Could you re-upload the first-stage models (baseline / baseline_got10k_only / baseline_R101 / baseline_R101_got10k_only) and share a new link, or send them to me by email: [email protected]? Thank you very much!

    opened by kuaiJL 2
  • Some questions about AR (Alpha-Refine)

    Have you tried using Alpha-Refine for evaluation on datasets such as GOT-10K and TrackingNet? If so, could you provide that part of the code? Thanks.

    opened by RelayZ 0
Owner
Multimedia Research at Microsoft Research Asia