Unified tracking framework with a single appearance model

ZhongdaoWang

Last update: Dec 24, 2022

Related tags

Deep Learning UniTrack

Overview

Paper: Do different tracking tasks require different appearance models?

[ArXiv] (coming soon) [Project Page] (coming soon)

UniTrack is a simple and Unified framework for versatile visual Tracking tasks.

As an important problem in computer vision, tracking has been fragmented into a multitude of different experimental setups. As a consequence, the literature has fragmented too, and the novel approaches proposed by the community are now usually specialized to fit only one specific setup. To understand to what extent this specialization is actually necessary, we present UniTrack, a solution that addresses multiple different tracking tasks within the same framework. All tasks share the same universal appearance model. UniTrack enjoys the following advantages:

It does NOT need training on a specific tracking task.
It performs well on existing tracking tasks, and can thus serve as a strong baseline for each task.
It can be easily adapted to novel tasks with different setups.
It can serve as an evaluation platform for testing pre-trained representations (e.g. from self-supervised models) on tracking tasks.

Tasks & Framework

Tasks

We classify existing tracking tasks along four axes: (1) single or multiple targets; (2) targets specified by users or by automatic detectors; (3) observation format (bounding box/mask/pose); (4) class-agnostic or class-specific (e.g. human/vehicles). We mainly experiment on 5 tasks: SOT, VOS, MOT, MOTS, and PoseTrack. Task setups are summarized in the figure above.
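The four axes can be written down as a small record type. The sketch below is illustrative only: the `TaskSetup` class and the per-task assignments are assumptions based on the usual definition of each benchmark, not values copied from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSetup:
    """One tracking task described along the four axes (hypothetical type)."""
    name: str
    multi_target: bool    # axis 1: single (False) or multiple (True) targets
    target_source: str    # axis 2: "user" or "detector"
    observation: str      # axis 3: "box", "mask", or "pose"
    class_specific: bool  # axis 4: class-agnostic (False) or class-specific (True)


# Assumed assignments, following the common definition of each task.
TASKS = [
    TaskSetup("SOT", False, "user", "box", False),
    TaskSetup("VOS", True, "user", "mask", False),
    TaskSetup("MOT", True, "detector", "box", True),
    TaskSetup("MOTS", True, "detector", "mask", True),
    TaskSetup("PoseTrack", True, "detector", "pose", True),
]


def tasks_where(**criteria):
    """Return the names of tasks whose fields match every given criterion."""
    return [t.name for t in TASKS
            if all(getattr(t, key) == value for key, value in criteria.items())]


print(tasks_where(target_source="detector"))
print(tasks_where(observation="mask"))
```

Describing a new task then amounts to adding one more record, which mirrors the claim that the framework can be adapted to novel setups.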

Appearance model

An appearance model is the only learnable component in UniTrack. It should provide a universal visual representation, and is usually pre-trained on a large-scale dataset in a supervised or unsupervised manner. Typical examples include ImageNet pre-trained ResNets (supervised), and recent self-supervised models such as MoCo and SimCLR (unsupervised).

Propagation and Association

Propagation and association are the two fundamental algorithmic building blocks in UniTrack. Both take features extracted by the appearance model as input. For propagation we adopt existing methods such as cross correlation, DCF, and mask propagation. For association we employ a simple algorithm and develop a novel similarity metric to make full use of the appearance model.
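To make the two building blocks concrete, here is a minimal pure-Python sketch. It is not UniTrack's implementation: propagation is shown as a naive single-channel cross correlation, and association uses plain cosine similarity with greedy matching as a stand-in for the similarity metric mentioned above. All function names and the `min_sim` threshold are hypothetical.

```python
import math


def cross_correlate(search, template):
    """Slide `template` over `search` (2-D lists) and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    return [[sum(search[y + i][x + j] * template[i][j]
                 for i in range(th) for j in range(tw))
             for x in range(sw - tw + 1)]
            for y in range(sh - th + 1)]


def peak(response):
    """Return (row, col) of the strongest response: the propagated position."""
    positions = ((y, x) for y in range(len(response))
                 for x in range(len(response[0])))
    return max(positions, key=lambda p: response[p[0]][p[1]])


def cosine(a, b):
    """Cosine similarity of two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def associate(tracks, detections, min_sim=0.5):
    """Greedy one-to-one matching of track and detection embeddings."""
    pairs = sorted(((cosine(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    used_tracks, used_dets, matches = set(), set(), []
    for sim, ti, di in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if ti not in used_tracks and di not in used_dets:
            matches.append((ti, di))
            used_tracks.add(ti)
            used_dets.add(di)
    return matches


# Propagation: locate a 2x2 template inside a 4x4 feature map.
search = [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
template = [[1, 2], [3, 4]]
print(peak(cross_correlate(search, template)))  # → (1, 1)

# Association: two tracks, two detections given in swapped order.
print(sorted(associate([[1, 0], [0, 1]], [[0.1, 0.9], [0.9, 0.1]])))
```

In both cases the only input is a feature representation, which is why the quality of the appearance model matters for every task.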

Results

Below we show results of UniTrack with a simple ImageNet pre-trained ResNet-18 as the appearance model. More results (other tasks/datasets, more visualizations) can be found in results.md.

Qualitative results

Single Object Tracking (SOT) on OTB-2015

Video Object Segmentation (VOS) on DAVIS-2017 val split

Multiple Object Tracking (MOT) on MOT-16 test set private detector track (Detections from FairMOT)

Multiple Object Tracking and Segmentation (MOTS) on MOTS challenge test set (Detections from COSTA_st)

Pose Tracking on PoseTrack-2018 val split (Detections from LightTrack)

Quantitative results

Single Object Tracking (SOT) on OTB-2015

Method	SiamFC	SiamRPN	SiamRPN++	UDT*	UDT+*	LUDT*	LUDT+*	UniTrack_XCorr*	UniTrack_DCF*
AUC	58.2	63.7	69.6	59.4	63.2	60.2	63.9	55.5	61.8

* indicates non-supervised methods

Video Object Segmentation (VOS) on DAVIS-2017 val split

Method	SiamMask	FeelVOS	STM	Colorization*	TimeCycle*	UVC*	CRW*	VFS*	UniTrack*
J-mean	54.3	63.7	79.2	34.6	40.1	56.7	64.8	66.5	58.4

* indicates non-supervised methods

Multiple Object Tracking (MOT) on MOT-16 test set private detector track

Method	POI	DeepSORT-2	JDE	CTrack	TubeTK	TraDes	CSTrack	FairMOT*	UniTrack*
IDF-1	65.1	62.2	55.8	57.2	62.2	64.7	71.8	72.8	71.8
IDs	805	781	1544	1897	1236	1144	1071	1074	683
MOTA	66.1	61.4	64.4	67.6	66.9	70.1	70.7	74.9	74.7

* indicates methods using the same detections
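For readers unfamiliar with the MOT columns, the sketch below computes MOTA and IDF-1 from their standard definitions. The function names and the counts in the example are made up for illustration and are unrelated to the numbers in the table.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF-1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


# Made-up counts, for illustration only.
print(round(100 * mota(200, 50, 3, 1000), 1))
print(round(100 * idf1(700, 150, 250), 1))
```

MOTA is dominated by detection errors (FN and FP), while IDF-1 and IDs reflect how well identities are preserved, which is the part that depends on association.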

Multiple Object Tracking and Segmentation (MOTS) on MOTS challenge test set

Method	TrackRCNN	SORTS	PointTrack	GMPHD	COSTA_st*	UniTrack*
IDF-1	42.7	57.3	42.9	65.6	70.3	67.2
IDs	567	577	868	566	421	622
sMOTA	40.6	55.0	62.3	69.0	70.2	68.9

* indicates methods using the same detections

Pose Tracking on PoseTrack-2018 val split

Method	MDPN	OpenSVAI	Miracle	KeyTrack	LightTrack*	UniTrack*
IDF-1	-	-	-	-	52.2	73.2
IDs	-	-	-	-	3024	6760
sMOTA	50.6	62.4	64.0	66.6	64.8	63.5

* indicates methods using the same detections

Getting started

Demo

Update log

[2021.6.24]: Start writing docs, please stay tuned!

Acknowledgement

VideoWalk by Allan A. Jabri

SOT code by Zhipeng Zhang

Comments

Bad result on YOLOX + UniTrack demo

Hi Zhongdao, thank you for your wonderful UniTrack framework. When I tried the YOLOX + UniTrack demo mot_demo.py on a video sequence, the result was not that good. I put the yolox_x (the better version) pre-trained model in yolox/weights and used the config imagenet_resnet18_s3.yaml. It seems the framework could not detect person objects, but it works pretty well on other objects. By the way, I used the default classes argument list(range(80)). What should I do?

opened by kumi123 5
How to get the mask for mots task?

Hi!

Thanks for your great work!

When I prepared the segmentation masks for the MOTS task, I followed the recommended instructions at https://github.com/Zhongdao/UniTrack/blob/main/docs/DATA.md and used gen_mots_costa.py. I then got txt files like the following:

1 2001 2 1080 1920 UkU\1`0RQ1>PoN\OVP1X1F=I3oSOTNlg0U2lWOVNng0m1nWOWNlg0n1PXOWNlg0l1SXOUNjg0P2.......

But these txt files do not look like segmentation masks. Are they correct? If not, could you please describe the mask generation process in more detail?

Thank you!
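For context, a line like the one quoted above matches the MOTS Challenge txt layout, where each row holds a frame index, an object id, a class id, the image height and width, and a run-length-encoded (RLE) mask string. The parser below is a sketch under that assumption; `MotsRecord` and `parse_mots_line` are hypothetical names, and the RLE string in the example is a placeholder rather than real data.

```python
from typing import NamedTuple


class MotsRecord(NamedTuple):
    frame: int
    object_id: int
    class_id: int
    height: int
    width: int
    rle: str


def parse_mots_line(line: str) -> MotsRecord:
    """Split one MOTS-style txt line into its six space-separated fields."""
    frame, object_id, class_id, height, width, rle = line.strip().split(" ", 5)
    return MotsRecord(int(frame), int(object_id), int(class_id),
                      int(height), int(width), rle)


record = parse_mots_line("1 2001 2 1080 1920 PLACEHOLDER_RLE")
# Assumed MOTS convention: object_id = class_id * 1000 + instance index.
instance_index = record.object_id % 1000
print(record.frame, record.class_id, instance_index, record.height, record.width)
```

Under this reading the txt file does contain masks, only in compressed form. Turning the `rle` field into a binary array is usually done with the COCO mask utilities (pycocotools), which is not shown here.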

opened by JessicaChanzc 4
What is the UniTrack Pose FPS in Realtime?

Hello! I'm trying to use pose tracking in real time, and I have a question: what is the real-time FPS of pose tracking (LightTrack) in the README example? If it does not run in real time, could you tell me the expected FPS?

opened by LeeJeongHwi 3
ssib info

https://github.com/Zhongdao/UniTrack/blob/a83e782f5c56c17ccf30f0e330cb50bd3349d5b7/model/model.py#L132
https://github.com/Zhongdao/UniTrack/blob/54347ba1bdba0903b241e00de2b5d0dc3c1a3d14/config/ssib_s3_womotion.yaml#L5
https://github.com/Zhongdao/UniTrack/blob/54347ba1bdba0903b241e00de2b5d0dc3c1a3d14/config/ssib_s3_womotion.yaml#L11

Excuse me, I have not been able to find an introduction to ssib in the paper and code. What is the full name of ssib, and where do I need to download the pre-trained model of ssib?

Thank you very much!

opened by Yulv-git 2
mot_demo issue

python demo/mot_demo.py --classes 1 2
demo/mot_demo.py:186: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  common_args = yaml.load(f)
2022-04-23 12:44:56.034 | INFO | __main__:main:135 - Args: Namespace(asso_with_motion=True, ckpt='./detector/YOLOX/weights/yolox_x.pth', classes=[1, 2], conf=0.65, conf_thres=0.65, config='./config/imagenet_resnet18_s3.yaml', confirm_iou_thres=0.7, demo='video', device='cuda', down_factor=8, dup_iou_thres=0.15, exp_file='./detector/YOLOX/exps/default/yolox_x.py', exp_name='imagenet_resnet18_s3', feat_size=[4, 10], gpu_id=0, im_mean=[0.485, 0.456, 0.406], im_std=[0.229, 0.224, 0.225], img_size=[640, 480], infer2D=True, iou_thres=0.5, min_box_area=200, model_type='imagenet18', mot_root='/home/wangzd/datasets/MOT/MOT16', motion_gated=True, motion_lambda=0.98, nms=None, nms_thres=0.4, nopadding=False, obid='FairMOT', output_root='./results/mot_demo', path='../mmtracking-master/demo/demo.mp4', prop_flag=False, remove_layers=['layer4'], resume='None', save_images=False, save_result=False, save_videos=True, test_mot16=False, track_buffer=30, tsize=[640, 480], use_kalman=True, workers=4)
Lenth of the video: 8 frames
1111111111111111111111111
/dfs/data/miniconda3/envs/openmmlab/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180594101/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2022-04-23 12:44:57.335 | INFO | __main__:main:149 - Model Summary: Params: 99.07M, Gflops: 211.45
2022-04-23 12:45:05.732 | INFO | __main__:main:152 - loading checkpoint
<class 'dict'>
Traceback (most recent call last):
  File "demo/mot_demo.py", line 201, in <module>
    main(exp, args)
  File "demo/mot_demo.py", line 158, in main
    det_model.load_state_dict(ckpt["model"])
KeyError: 'model'

I am using the default YOLOX weights pre-trained on COCO (https://github.com/open-mmlab/mmdetection/tree/master/configs/yolox).
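The `KeyError: 'model'` in the log means the loaded checkpoint dict has no `"model"` entry. Different toolboxes wrap their weights under different keys, so a loader can probe a few candidates before giving up. The helper below is a hypothetical sketch: `extract_state_dict` is not part of UniTrack or YOLOX, and `"state_dict"` is only an assumed alternative key.

```python
def extract_state_dict(ckpt, candidate_keys=("model", "state_dict")):
    """Return the parameter dict stored in a loaded checkpoint.

    Falls back to the checkpoint itself when no wrapper key is present,
    which covers files that store the parameters at the top level.
    """
    if not isinstance(ckpt, dict):
        raise TypeError(f"expected a dict checkpoint, got {type(ckpt).__name__}")
    for key in candidate_keys:
        if key in ckpt and isinstance(ckpt[key], dict):
            return ckpt[key]
    return ckpt


# Plain dicts stand in for real checkpoints here.
print(sorted(extract_state_dict({"state_dict": {"conv.weight": 0}})))
print(sorted(extract_state_dict({"model": {"conv.weight": 0}})))
```

Even when the right entry is found, weights exported by a different toolbox may use different parameter names, in which case `load_state_dict` would still fail and weights from the original YOLOX release would be the safer choice.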

opened by CarlHuangNuc 2
how to get MOTS-train.txt
Hi~, how do I get MOTS-train.txt?

` Eval Config:
USE_PARALLEL : False
NUM_PARALLEL_CORES : 8
BREAK_ON_ERROR : True
RETURN_ON_ERROR : False
LOG_ON_ERROR : results/mots/debug/quantitive/error.log
PRINT_RESULTS : True
PRINT_ONLY_COMBINED : False
PRINT_CONFIG : True
TIME_PROGRESS : True
DISPLAY_LESS_PROGRESS : True
OUTPUT_SUMMARY : True
OUTPUT_EMPTY_CLASSES : True
OUTPUT_DETAILED : True
PLOT_CURVES : False

MOTSChallenge Config:
GT_FOLDER : /data7/fenghao/dataset/MOTS_unitrack//images/train
TRACKERS_FOLDER : results/mots/debug/quantitive/..
OUTPUT_FOLDER : None
TRACKERS_TO_EVAL : ['quantitive']
CLASSES_TO_EVAL : ['pedestrian']
SPLIT_TO_EVAL : train
INPUT_AS_ZIP : False
PRINT_CONFIG : True
TRACKER_SUB_FOLDER :
OUTPUT_SUB_FOLDER :
TRACKER_DISPLAY_NAMES : None
SEQMAP_FOLDER : /data7/fenghao/dataset/MOTS_unitrack//images/train/../../seqmaps
SEQMAP_FILE : None
SEQ_INFO : None
GT_LOC_FORMAT : {gt_folder}/{seq}/gt/gt.txt
SKIP_SPLIT_FOL : True
BENCHMARK : MOTS20
no seqmap found: /data7/fenghao/dataset/MOTS_unitrack//images/train/../../seqmaps/MOTS-train.txt
Traceback (most recent call last):
  File "/home/wangwd/.pycharm_helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/home/wangwd/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/data7/fenghao/UniTrack/test/test_mots.py", line 170, in <module>
    save_videos=opt.save_videos)
  File "/data7/fenghao/UniTrack/test/test_mots.py", line 121, in main
    dataset_list = [trackeval.datasets.MOTSChallenge(dataset_config)]
  File "/data7/fenghao/UniTrack/eval/trackeval/datasets/mots_challenge.py", line 76, in __init__
    self.seq_list, self.seq_lengths = self._get_seq_info()
  File "/data7/fenghao/UniTrack/eval/trackeval/datasets/mots_challenge.py", line 152, in _get_seq_info
    raise TrackEvalException('no seqmap found: ' + os.path.basename(seqmap_file))
eval.trackeval.utils.TrackEvalException: no seqmap found: MOTS-train.txt
python-BaseException

Process finished with exit code 1

`

This is my config:

common:
  exp_name: debug

  # Model related
  model_type: crw
  remove_layers: ['layer4']
  im_mean: [0.4914, 0.4822, 0.4465]
  im_std: [0.2023, 0.1994, 0.2010]
  nopadding: False
  head_depth: -1
  resume: 'weights/crw.pth'

  # Misc
  down_factor: 8
  infer2D: True
  workers: 4
  gpu_id: 3
  device: cuda

mots:
  obid: 'gt'
  mots_root: '/data7/fenghao/dataset/MOTS_unitrack/'
  save_videos: False
  save_images: False
  test: False
  track_buffer: 30
  nms_thres: 0.4
  conf_thres: 0.5
  iou_thres: 0.5
  prop_flag: False
  max_mask_area: 200
  dup_iou_thres: 0.15
  confirm_iou_thres: 0.7
  first_stage_thres: 0.7
  feat_size: [4,10]
  use_kalman: True
  asso_with_motion: True
  motion_lambda: 0.98
  motion_gated: False
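The exception in this report is raised because the evaluator looks for a sequence map at `<mots_root>/seqmaps/MOTS-train.txt`. If the file is simply missing, one possible workaround is to generate it from the sequence folders. The sketch below assumes the MOTChallenge-style seqmap layout (a header row `name` followed by one sequence name per line); the function name and the example paths are hypothetical.

```python
from pathlib import Path


def write_seqmap(split_dir, seqmap_file):
    """List sequence folders in `split_dir` and write them as a seqmap file."""
    sequences = sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())
    seqmap_file = Path(seqmap_file)
    seqmap_file.parent.mkdir(parents=True, exist_ok=True)
    # Assumed layout: header row "name", then one sequence name per line.
    seqmap_file.write_text("\n".join(["name", *sequences]) + "\n")
    return sequences


if __name__ == "__main__":
    # Hypothetical paths; adjust to the local dataset root.
    root = Path("MOTS_unitrack")
    if (root / "images" / "train").is_dir():
        print(write_seqmap(root / "images" / "train",
                           root / "seqmaps" / "MOTS-train.txt"))
```

Whether the generated list matches the split used in the reported results is not something this sketch can guarantee, so the sequence names are worth checking against the dataset documentation.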
opened by ToumaKazusa3 2
SOT on LaSOT

Hi, Zhongdao,

Thank you for your great work!

I have tested your code on LaSOT using the crw_resnet18_s3 model by modifying the datasets root in utils.py. But the AUC is only 23.02 on this dataset. I'm not sure I got the correct results. Have you tested the unitrack on LaSOT or GOT10k? Could you provide the results on LaSOT and GOT10k at your convenience?

opened by Flowerfan 2
Using a custom Resnet-18 Classification model

I have trained a ResNet-18 model on a custom dataset for classification. I have also trained YOLOX for detection on this custom dataset. How do I use my ResNet-18 model as an appearance model? Since it was not trained with CRW, on ImageNet, etc., what model_type should I give in the config file? And do I have to edit model/model.py to handle this model_type and load the model from my checkpoint? Thanks for your help!

opened by kenrickfernandes 2
Model zoo link doesn't work

Hi, I am trying to use the different pre-trained models provided in your model zoo, but some of the links are empty. I am also wondering if you have the im_mean and im_std values for each of them.

Thank you,

opened by philip-fu 2
Inference Statistics

Hello,

I know this project was just released and some things are still being put up, but do you have any information on the inference speeds for UniTrack with ResNet18 and ResNet50 base appearance models?

opened by vjsrinivas 2

Tensor size mismatch when running mot_demo.py with custom test image size

I tried running mot_demo.py with a custom test image size, --tsize 800 600. An exception is raised in this case, with the message: Sizes of tensors must match except in dimension 2. Got 75 and 76 (The offending index is 0)

No additional debug information is available, so it is difficult for me to find out which line of code throws this exception.

Here are my experiment arguments and logs:

2022-04-04 01:35:38.341 | INFO     | __main__:main:135 - Args: Namespace(
asso_with_motion=True, 
ckpt='detector/YOLOX/weights/yolox_m.pth', 
classes=[0], conf=0.65, conf_thres=0.65, 
config='./config/imagenet_resnet18_s3.yaml', 
confirm_iou_thres=0.7, demo='video', device='cuda', 
down_factor=8, dup_iou_thres=0.15, 
exp_file='detector/YOLOX/exps/default/yolox_m.py', 
exp_name='imagenet_resnet18_s3', 
feat_size=[4, 10], gpu_id=0, 
im_mean=[0.485, 0.456, 0.406], im_std=[0.229, 0.224, 0.225], 
img_size=[800, 600], 
infer2D=True, iou_thres=0.5, min_box_area=200, 
model_type='imagenet18', 
mot_root='/home/wangzd/datasets/MOT/MOT16', 
motion_gated=True, motion_lambda=0.98, 
nms=0.3, nms_thres=0.4, 
nopadding=False, obid='FairMOT', 
output_root='./results/mot_demo', 
path='/workspace/project/samples/videos/G175647144539.mp4', 
prop_flag=False, remove_layers=['layer4'], 
resume='None', save_images=False, save_result=False, 
save_videos=True, test_mot16=False, track_buffer=30, 
tsize=[800, 600], use_kalman=True, workers=4)
[NULL @ 0x55b86843db00] PPS id out of range: 0
[hevc @ 0x55b86843db00] PPS id out of range: 0
Lenth of the video: 1107126 frames
2022-04-04 01:35:40.498 | INFO     | __main__:main:147 - Model Summary: Params: 25.33M, Gflops: 86.43
2022-04-04 01:35:45.542 | INFO     | __main__:main:150 - loading checkpoint
2022-04-04 01:35:45.888 | INFO     | __main__:main:154 - loaded checkpoint done.
[hevc @ 0x55b868566400] Could not find ref with POC 4
2022-04-04 01:35:46.662 | INFO     | __main__:eval_seq:100 - Processing frame 0 (100000.00 fps)
Sizes of tensors must match except in dimension 2. Got 75 and 76 (The offending index is 0)
...
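One plausible reading of "Got 75 and 76" is that 600 is not a multiple of 32. In a detector with a feature pyramid at strides 8, 16 and 32, the stride-16 map is upsampled by 2 and concatenated with the stride-8 map, and the two only line up when the input side divides evenly. The arithmetic below is a sketch under that assumption (strides, rounding behaviour and function names are not taken from the UniTrack or YOLOX code).

```python
def round_up_to_multiple(value, multiple=32):
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


def feature_sizes(side, strides=(8, 16, 32)):
    """Side length after each stride, assuming strided convs round up."""
    return [-(-side // s) for s in strides]


for side in (600, round_up_to_multiple(600)):
    s8, s16, _ = feature_sizes(side)
    # Upsampling the stride-16 map by 2 must reproduce the stride-8 size.
    print(side, s8, 2 * s16, "ok" if s8 == 2 * s16 else "mismatch")
```

With a side of 600 the stride-8 map has 75 rows while the upsampled stride-16 map has 76, matching the error message. Rounding up to 608 makes both 76, so a size such as --tsize 800 608 may avoid the exception.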

opened by UKeyboard 1

scikit-learn version error
I use scikit-learn==1.1.2 in my project, and I also use UniTrack as a submodule. However, UniTrack uses scikit-learn==0.22, which is too old.

As a result, an import error occurs at line 16 of utils/mask.py:

from sklearn.metrics import jaccard_similarity_score

jaccard_similarity_score was removed in scikit-learn 0.23. (It is not used in utils/mask.py, so the import can simply be deleted.)

Will you update scikit-learn? If scikit-learn is updated, other errors may occur as well.
opened by kojikojiprg 0
Bounding Box still appears when object goes out of frame

Hi,

The bounding box still appears for some time after the object goes out of frame, and SOT tracks other objects in the frame until the initially selected object reappears. Is there any way to solve this problem?

Thanks.

opened by saishiva024 0
Bounding Boxes doesn't scale for SOT

I've tried SOT on custom videos using demo/sot_demo.py. After I draw the initial bounding box, it does not rescale even when the object becomes smaller or larger. What might be the root cause of this problem?

Thanks in advance.

opened by saishiva024 0

Appearance model performs very badly in night scenarios?

Hi Author,

I found that the appearance model works in many scenarios. At night, however, picture quality and lighting have a big impact on its performance. Is this normal? Is it because most ImageNet data comes from daytime scenes?

opened by CarlHuangNuc 1

PoseTrack download

Thanks for your excellent work. I can't access PoseTrack's home page for data download. Do you have a link available for PoseTrack data download? Thank you very much!

opened by Yulv-git 2

Owner

ZhongdaoWang

Computer Vision, Multi-Object Tracking

