TrackFormer: Multi-Object Tracking with Transformers

Overview

This repository provides the official implementation of the TrackFormer: Multi-Object Tracking with Transformers paper by Tim Meinhardt, Alexander Kirillov, Laura Leal-Taixe and Christoph Feichtenhofer. The codebase builds upon DETR, Deformable DETR and Tracktor.

As the paper is still under submission, this repository will be updated continuously and might at times not reflect the current state of the arXiv paper.

Demo visualizations of the MOT17-03-SDP and MOTS20-07 sequences.

Abstract

The challenging task of multi-object tracking (MOT) requires simultaneous reasoning about track initialization, identity, and spatiotemporal trajectories. We formulate this task as a frame-to-frame set prediction problem and introduce TrackFormer, an end-to-end MOT approach based on an encoder-decoder Transformer architecture. Our model achieves data association between frames via attention by evolving a set of track predictions through a video sequence. The Transformer decoder initializes new tracks from static object queries and autoregressively follows existing tracks in space and time with the new concept of identity preserving track queries. Both decoder query types benefit from self- and encoder-decoder attention on global frame-level features, thereby omitting any additional graph optimization and matching or modeling of motion and appearance. TrackFormer represents a new tracking-by-attention paradigm and yields state-of-the-art performance on the task of multi-object tracking (MOT17) and segmentation (MOTS20).

TrackFormer casts multi-object tracking as a set prediction problem performing joint detection and tracking-by-attention. The architecture consists of a CNN for image feature extraction, a Transformer encoder for image feature encoding and a Transformer decoder which applies self- and encoder-decoder attention to produce output embeddings with bounding box and class information.
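
To make this description concrete, the following is a minimal, self-contained sketch of the tracking-by-attention loop. All names (TinyTrackFormer, frame_features, track_queries) and dimensions are illustrative assumptions and not the repository's actual modules or API:

import torch
import torch.nn as nn

class TinyTrackFormer(nn.Module):
    """Toy decoder: track queries carry identities, object queries spawn new tracks."""
    def __init__(self, d_model=32, num_object_queries=10, score_thresh=0.5):
        super().__init__()
        self.object_queries = nn.Parameter(torch.randn(num_object_queries, d_model))
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.class_head = nn.Linear(d_model, 2)  # toy object/background scores
        self.box_head = nn.Linear(d_model, 4)    # (cx, cy, w, h)
        self.score_thresh = score_thresh

    def forward(self, frame_features, track_queries):
        # Existing track queries are decoded jointly with the static object queries.
        queries = torch.cat([track_queries, self.object_queries.unsqueeze(0)], dim=1)
        hs = self.decoder(queries, frame_features)        # self- and cross-attention
        scores = self.class_head(hs).softmax(-1)[..., 0]  # objectness per query
        boxes = self.box_head(hs).sigmoid()
        # Confident queries (surviving tracks and newly detected objects) become
        # the track queries of the next frame; the rest are dropped.
        keep = scores[0] > self.score_thresh
        return boxes[:, keep], hs[:, keep]

model = TinyTrackFormer()
track_queries = torch.zeros(1, 0, 32)         # no tracks exist before the first frame
for _ in range(3):                            # toy "video" of three frames
    frame_features = torch.randn(1, 100, 32)  # stand-in for CNN + encoder features
    boxes, track_queries = model(frame_features, track_queries)

The actual model adds further logic (e.g. re-identification and separate thresholds for new and existing tracks), but the frame-to-frame hand-over of query embeddings follows this pattern.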

Installation

We refer to our docs/INSTALL.md for detailed installation instructions.

Train TrackFormer

We refer to our docs/TRAIN.md for detailed training instructions.

Evaluate TrackFormer

In order to evaluate TrackFormer on a multi-object tracking dataset, we provide the src/track.py script, which supports several datasets and splits interchangeably via the dataset_name argument (see src/datasets/tracking/factory.py for an overview of all datasets). The default tracking configuration is specified in cfgs/track.yaml. To facilitate the reproducibility of our results, we provide evaluation metrics for both the train and test set.

MOT17

Private detections

python src/track.py with reid

MOT17   MOTA   IDF1   MT     ML    FP      FN       ID Sw.
Train   68.1   67.6   816    207   33549   71937    1935
Test    65.0   63.9   1074   324   70443   123552   3528

Public detections (DPM, FRCNN, SDP)

python src/track.py with \
    reid \
    public_detections=min_iou_0_5 \
    obj_detect_checkpoint_file=models/mots20_train_masks/checkpoint.pth
MOT17   MOTA   IDF1   MT    ML    FP      FN       ID Sw.
Train   67.2   66.9   663   294   14640   94122    1866
Test    62.5   60.7   702   632   32828   174921   3917

MOTS20

python src/track.py with \
    dataset_name=MOTS20-ALL \
    obj_detect_checkpoint_file=models/mots20_train_masks/checkpoint.pth

Our tracking script only applies the MOT17 metrics evaluation but also outputs MOTS20 mask prediction files. To evaluate these, download the official MOTChallengeEvalKit.

MOTS20   sMOTSA   IDF1   FP     FN     IDs
Train    --       --     --     --     --
Test     54.9     63.6   2233   7195   278

Demo

To facilitate the application of TrackFormer, we provide a demo interface which allows for quick processing of a given video sequence.

ffmpeg -i data/snakeboard/snakeboard.mp4 -vf fps=30 data/snakeboard/%06d.png

python src/track.py with \
    dataset_name=DEMO \
    data_root_dir=data/snakeboard \
    output_dir=data/snakeboard \
    write_images=pretty
Snakeboard demo

Publication

If you use this software in your research, please cite our publication:

@InProceedings{meinhardt2021trackformer,
    title={TrackFormer: Multi-Object Tracking with Transformers},
    author={Tim Meinhardt and Alexander Kirillov and Laura Leal-Taixe and Christoph Feichtenhofer},
    year={2021},
    eprint={2101.02702},
    archivePrefix={arXiv},
}
Comments
  • Not able to install MultiscaleDeformableAttention

    Hi,

    I am trying to run a training test for TrackFormer. I am following INSTALL.md to build the environment. When I run python src/trackformer/models/ops/setup.py build --build-base=src/trackformer/models/ops/ install, I get an error saying:

    Traceback (most recent call last):
      File "src/trackformer/models/ops/setup.py", line 62, in <module>
        setup(
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
        return distutils.core.setup(**attrs)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 148, in setup
        return run_commands(dist)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 163, in run_commands
        dist.run_commands()
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
        self.run_command(cmd)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/dist.py", line 1214, in run_command
        super().run_command(command)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
        cmd_obj.run()
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 135, in run
        self.run_command(cmd_name)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/dist.py", line 1214, in run_command
        super().run_command(command)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
        cmd_obj.run()
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
        _build_ext.run(self)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
        _build_ext.build_ext.run(self)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 339, in run
        self.build_extensions()
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 580, in build_extensions
        build_ext.build_extensions(self)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
        _build_ext.build_ext.build_extensions(self)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 448, in build_extensions
        self._build_extensions_serial()
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 473, in _build_extensions_serial
        self.build_extension(ext)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
        _build_ext.build_extension(self, ext)
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 528, in build_extension
        objects = self.compiler.compile(sources,
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 414, in unix_wrap_ninja_compile
        _write_ninja_file_and_compile_objects(
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1135, in _write_ninja_file_and_compile_objects
        _run_ninja_build(
      File "/z/home/mahzad-khosh/trackformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1413, in _run_ninja_build
        raise RuntimeError(message)
    RuntimeError: Error compiling objects for extension
    

    How can I get this package installed?

    Thanks,

    opened by mkhoshle 16
  • AttributeError: 'ReadOnlyList' object has no attribute 'message'

    Hi, when I load the trained checkpoint, I get this error: AttributeError: 'ReadOnlyList' object has no attribute 'message'

    I think this error is related to Sacred.

    When reading config.yaml, I get this message: message: The configuration is read-only in a captured function!

    the config.yaml :

    lr_linear_proj_names:  !!python/object/new:sacred.config.custom_containers.ReadOnlyList
      listitems:
      - reference_points
      - sampling_offsets
      state:
        message: The configuration is read-only in a captured function!
    

    I'm just following the training instructions, and I don't know much about Sacred. Can anyone help me? Thanks in advance.
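
    A workaround I am considering is to convert the read-only containers into plain Python objects before dumping or re-loading the config, so the state/message fields never end up in config.yaml. This is only a minimal sketch, assuming Sacred's ReadOnlyDict/ReadOnlyList subclass dict/list:

    def to_plain(obj):
        # Recursively turn Sacred's read-only containers into plain dicts/lists.
        if isinstance(obj, dict):
            return {k: to_plain(v) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [to_plain(v) for v in obj]
        return obj

    # e.g. inside a captured function: yaml.safe_dump(to_plain(_config), f)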

    opened by AzeroGYH 15
  • Expected results after training on the joint set of CrowdHuman and MOT17

    Hey, thank you for your excellent work!

    I trained TrackFormer with your default settings (loading from the pretrained CrowdHuman checkpoint) on the joint set of CrowdHuman and MOT17, but got a result of 73.3 MOTA on MOT17 (I think it is validated on your default dataset mot17_train_cross_val_frame_0_5_to_1_0_coco).

    Does this result correspond to the training set result (74.2 MOTA provided in the README) or the cross-validation result (71.3 MOTA provided in the paper)?

    Thank you in advance.

    opened by FengLi-ust 7
  • The DDP hung up at  torch.nn.parallel.DistributedDataParallel(model)

    Hi, I really enjoyed reading your paper and code. Great work. I am trying to reproduce the results by running your code on HPC (a cluster, one node with 2 GPUs). As mentioned in the README training section, I ran the following command in interactive Slurm mode:

    python -m torch.distributed.launch --nproc_per_node=2 --use_env src/train.py with \
        crowdhuman \
        deformable \
        multi_frame \
        tracking \
        output_dir=models/crowdhuman_deformable_multi_frame

    But my code hangs at the line model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.gpu], find_unused_parameters=True).

    Could you please help me? The following is the output:


    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


    WARNING - root - Changed type of config entry "train_split" from str to NoneType WARNING - train - No observers have been added to this run WARNING - root - Changed type of config entry "train_split" from str to NoneType WARNING - train - No observers have been added to this run INFO - train - Running command 'load_config' INFO - train - Started INFO - train - Running command 'load_config' INFO - train - Started Configuration (modified, added, typechanged, doc): aux_loss = True backbone = 'resnet50' batch_size = 1 bbox_loss_coef = 5.0 clip_max_norm = 0.1 cls_loss_coef = 2.0 coco_and_crowdhuman_prev_frame_rnd_augs = 0.2 coco_min_num_objects = 0 coco_panoptic_path = None coco_path = 'data/coco_2017' coco_person_train_split = None crowdhuman_path = 'data/CrowdHuman' crowdhuman_train_split = 'train_val' dataset = 'mot_crowdhuman' debug = False dec_layers = 6 dec_n_points = 4 deformable = True device = 'cuda' dice_loss_coef = 1.0 dilation = False dim_feedforward = 1024 dist_url = 'env://' dropout = 0.1 enc_layers = 6 enc_n_points = 4 eos_coef = 0.1 epochs = 80 eval_only = False eval_train = False focal_alpha = 0.25 focal_gamma = 2 focal_loss = True freeze_detr = False giou_loss_coef = 2 hidden_dim = 288 load_mask_head_from_model = None lr = 0.0002 lr_backbone = 2e-05 lr_backbone_names = ['backbone.0'] lr_drop = 50 lr_linear_proj_mult = 0.1 lr_linear_proj_names = ['reference_points', 'sampling_offsets'] lr_track = 0.0001 mask_loss_coef = 1.0 masks = False merge_frame_features = False mot_path_train = 'data/MOT17' mot_path_val = 'data/MOT17' multi_frame_attention = True multi_frame_attention_separate_encoder = True multi_frame_encoding = True nheads = 8 no_vis = False num_feature_levels = 4 num_queries = 500 num_workers = 2 output_dir = 'models/crowdhuman_deformable_multi_frame' overflow_boxes = True overwrite_lr_scheduler = False overwrite_lrs = False position_embedding = 'sine' pre_norm = False resume = '' resume_optim = False resume_shift_neuron = False resume_vis = False save_model_interval = 5 seed = 42 set_cost_bbox = 5.0 set_cost_class = 2.0 set_cost_giou = 2.0 start_epoch = 1 track_attention = False track_backprop_prev_frame = False track_prev_frame_range = 5 track_prev_frame_rnd_augs = 0.01 track_prev_prev_frame = False track_query_false_negative_prob = 0.4 track_query_false_positive_eos_weight = True track_query_false_positive_prob = 0.1 tracking = True tracking_eval = True train_split = None two_stage = False val_interval = 5 val_split = 'mot17_train_cross_val_frame_0_5_to_1_0_coco' vis_and_log_interval = 50 vis_port = 8097 vis_server = '' weight_decay = 0.0001 with_box_refine = True world_size = 2 img_transform: max_size = 1333 val_width = 800 INFO - train - Completed after 0:00:00 Namespace(aux_loss=True, backbone='resnet50', batch_size=1, bbox_loss_coef=5.0, clip_max_norm=0.1, cls_loss_coef=2.0, coco_and_crowdhuman_prev_frame_rnd_augs=0.2, coco_min_num_objects=0, coco_panoptic_path=None, coco_path='data/coco_2017', coco_person_train_split=None, crowdhuman_path='data/CrowdHuman', crowdhuman_train_split='train_val', dataset='mot_crowdhuman', debug=False, dec_layers=6, dec_n_points=4, deformable=True, device='cuda', dice_loss_coef=1.0, dilation=False, dim_feedforward=1024, dist_url='env://', dropout=0.1, enc_layers=6, enc_n_points=4, eos_coef=0.1, epochs=80, eval_only=False, eval_train=False, focal_alpha=0.25, focal_gamma=2, focal_loss=True, freeze_detr=False, giou_loss_coef=2, hidden_dim=288, img_transform=Namespace(max_size=1333, val_width=800), load_mask_head_from_model=None, 
lr=0.0002, lr_backbone=2e-05, lr_backbone_names=['backbone.0'], lr_drop=50, lr_linear_proj_mult=0.1, lr_linear_proj_names=['reference_points', 'sampling_offsets'], lr_track=0.0001, mask_loss_coef=1.0, masks=False, merge_frame_features=False, mot_path_train='data/MOT17', mot_path_val='data/MOT17', multi_frame_attention=True, multi_frame_attention_separate_encoder=True, multi_frame_encoding=True, nheads=8, no_vis=False, num_feature_levels=4, num_queries=500, num_workers=2, output_dir='models/crowdhuman_deformable_multi_frame', overflow_boxes=True, overwrite_lr_scheduler=False, overwrite_lrs=False, position_embedding='sine', pre_norm=False, resume='', resume_optim=False, resume_shift_neuron=False, resume_vis=False, save_model_interval=5, seed=42, set_cost_bbox=5.0, set_cost_class=2.0, set_cost_giou=2.0, start_epoch=1, track_attention=False, track_backprop_prev_frame=False, track_prev_frame_range=5, track_prev_frame_rnd_augs=0.01, track_prev_prev_frame=False, track_query_false_negative_prob=0.4, track_query_false_positive_eos_weight=True, track_query_false_positive_prob=0.1, tracking=True, tracking_eval=True, train_split=None, two_stage=False, val_interval=5, val_split='mot17_train_cross_val_frame_0_5_to_1_0_coco', vis_and_log_interval=50, vis_port=8097, vis_server='', weight_decay=0.0001, with_box_refine=True, world_size=2) using distributed mode | distributed init (rank 1): env:// Configuration (modified, added, typechanged, doc): aux_loss = True backbone = 'resnet50' batch_size = 1 bbox_loss_coef = 5.0 clip_max_norm = 0.1 cls_loss_coef = 2.0 coco_and_crowdhuman_prev_frame_rnd_augs = 0.2 coco_min_num_objects = 0 coco_panoptic_path = None coco_path = 'data/coco_2017' coco_person_train_split = None crowdhuman_path = 'data/CrowdHuman' crowdhuman_train_split = 'train_val' dataset = 'mot_crowdhuman' debug = False dec_layers = 6 dec_n_points = 4 deformable = True device = 'cuda' dice_loss_coef = 1.0 dilation = False dim_feedforward = 1024 dist_url = 'env://' dropout = 0.1 enc_layers = 6 enc_n_points = 4 eos_coef = 0.1 epochs = 80 eval_only = False eval_train = False focal_alpha = 0.25 focal_gamma = 2 focal_loss = True freeze_detr = False giou_loss_coef = 2 hidden_dim = 288 load_mask_head_from_model = None lr = 0.0002 lr_backbone = 2e-05 lr_backbone_names = ['backbone.0'] lr_drop = 50 lr_linear_proj_mult = 0.1 lr_linear_proj_names = ['reference_points', 'sampling_offsets'] lr_track = 0.0001 mask_loss_coef = 1.0 masks = False merge_frame_features = False mot_path_train = 'data/MOT17' mot_path_val = 'data/MOT17' multi_frame_attention = True multi_frame_attention_separate_encoder = True multi_frame_encoding = True nheads = 8 no_vis = False num_feature_levels = 4 num_queries = 500 num_workers = 2 output_dir = 'models/crowdhuman_deformable_multi_frame' overflow_boxes = True overwrite_lr_scheduler = False overwrite_lrs = False position_embedding = 'sine' pre_norm = False resume = '' resume_optim = False resume_shift_neuron = False resume_vis = False save_model_interval = 5 seed = 42 set_cost_bbox = 5.0 set_cost_class = 2.0 set_cost_giou = 2.0 start_epoch = 1 track_attention = False track_backprop_prev_frame = False track_prev_frame_range = 5 track_prev_frame_rnd_augs = 0.01 track_prev_prev_frame = False track_query_false_negative_prob = 0.4 track_query_false_positive_eos_weight = True track_query_false_positive_prob = 0.1 tracking = True tracking_eval = True train_split = None two_stage = False val_interval = 5 val_split = 'mot17_train_cross_val_frame_0_5_to_1_0_coco' vis_and_log_interval = 50 vis_port = 
8097 vis_server = '' weight_decay = 0.0001 with_box_refine = True world_size = 2 img_transform: max_size = 1333 val_width = 800 INFO - train - Completed after 0:00:00 Namespace(aux_loss=True, backbone='resnet50', batch_size=1, bbox_loss_coef=5.0, clip_max_norm=0.1, cls_loss_coef=2.0, coco_and_crowdhuman_prev_frame_rnd_augs=0.2, coco_min_num_objects=0, coco_panoptic_path=None, coco_path='data/coco_2017', coco_person_train_split=None, crowdhuman_path='data/CrowdHuman', crowdhuman_train_split='train_val', dataset='mot_crowdhuman', debug=False, dec_layers=6, dec_n_points=4, deformable=True, device='cuda', dice_loss_coef=1.0, dilation=False, dim_feedforward=1024, dist_url='env://', dropout=0.1, enc_layers=6, enc_n_points=4, eos_coef=0.1, epochs=80, eval_only=False, eval_train=False, focal_alpha=0.25, focal_gamma=2, focal_loss=True, freeze_detr=False, giou_loss_coef=2, hidden_dim=288, img_transform=Namespace(max_size=1333, val_width=800), load_mask_head_from_model=None, lr=0.0002, lr_backbone=2e-05, lr_backbone_names=['backbone.0'], lr_drop=50, lr_linear_proj_mult=0.1, lr_linear_proj_names=['reference_points', 'sampling_offsets'], lr_track=0.0001, mask_loss_coef=1.0, masks=False, merge_frame_features=False, mot_path_train='data/MOT17', mot_path_val='data/MOT17', multi_frame_attention=True, multi_frame_attention_separate_encoder=True, multi_frame_encoding=True, nheads=8, no_vis=False, num_feature_levels=4, num_queries=500, num_workers=2, output_dir='models/crowdhuman_deformable_multi_frame', overflow_boxes=True, overwrite_lr_scheduler=False, overwrite_lrs=False, position_embedding='sine', pre_norm=False, resume='', resume_optim=False, resume_shift_neuron=False, resume_vis=False, save_model_interval=5, seed=42, set_cost_bbox=5.0, set_cost_class=2.0, set_cost_giou=2.0, start_epoch=1, track_attention=False, track_backprop_prev_frame=False, track_prev_frame_range=5, track_prev_frame_rnd_augs=0.01, track_prev_prev_frame=False, track_query_false_negative_prob=0.4, track_query_false_positive_eos_weight=True, track_query_false_positive_prob=0.1, tracking=True, tracking_eval=True, train_split=None, two_stage=False, val_interval=5, val_split='mot17_train_cross_val_frame_0_5_to_1_0_coco', vis_and_log_interval=50, vis_port=8097, vis_server='', weight_decay=0.0001, with_box_refine=True, world_size=2) using distributed mode | distributed init (rank 0): env:// git: sha: d62d81023dbffb4a1820db39ce527b66df6d7b61, status: has uncommited changes, branch: main

    opened by shubham83183 6
  • TypeError: ms_deform_attn_forward(): incompatible function arguments.

    Hi,

    Cheers on the wonderful work. I am trying to run just the evaluation with 'python3 src/track.py with reid'.

    I am getting the following error:

    TypeError: ms_deform_attn_forward(): incompatible function arguments. The following argument types are supported: 1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, arg4: at::Tensor, arg5: int) -> at::Tensor Invoked with: tensor([[[[-9.0617e+00, 2.9002e+00, -5.3017e+00, ..., -1.4381e+00, 3.8348e+00, 2.1320e-01], [ 1.0434e+00, -3.8773e-01, -3.6883e+00, ..., -2.8583e+00, -5.9393e-01, 6.8181e-01], [ 1.6065e+00, 7.8195e-01, -2.3155e+00, ..., -2.0958e+00, -1.9994e-01, -1.6163e+00], ..., [-2.0559e+00, 6.3167e-02, 4.4025e+00, ..., 1.9450e+00, -8.6947e-01, 1.3416e+00], [ 2.9230e+00, 1.6198e+00, 3.9162e+00, ..., -1.7625e+00, -6.7662e-01, -2.4316e+00], [-2.7931e+00, -1.3822e-01, -1.1136e+00, ..., 1.2329e-01, 3.1032e+00, -1.0232e+00]],

    ..... device='cuda:0'), 64

    opened by harkiratbehl 6
  • Some confusion about the paper

    Hi, thanks for your great work! I have a question about your paper: in the MOT17 experiment section, is the dataset you used for testing the MOT17 test set, or a part of the training set held out as a test set?

    opened by quxu91 6
  • Evaluate TrackFormer on MOT17 with the problem with numpy

    Hello, when I tried to evaluate TrackFormer on MOT17 with python src/track.py with reid tracker_cfg.public_detections=min_iou_0_5 obj_detect_checkpoint_file=models/mot17_deformable_multi_frame/checkpoint_epoch_50.pth, I got the following problem:

    INFO - main - TRACK SEQ: MOT17-02-DPM
    100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 600/600 [02:15<00:00,  4.41it/s]
    INFO - main - NUM TRACKS: 96 ReIDs: 13
    INFO - main - RUNTIME: 135.96 s
    ERROR - track - Failed after 0:08:13!
    Traceback (most recent calls WITHOUT Sacred internals):
      File "src/track.py", line 153, in main
        mot_accum = get_mot_accum(results, seq_loader)
      File "/media/HardDisk_new/wh/second_code/trackformer/src/trackformer/util/track_utils.py", line 397, in get_mot_accum
        distance)
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/motmetrics/mot.py", line 252, in update
        rids, cids = linear_sum_assignment(dists)
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/motmetrics/lap.py", line 73, in linear_sum_assignment
        rids, cids = solver(costs)
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/motmetrics/lap.py", line 288, in lsa_solve_lapjv
        from lap import lapjv
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/lap/__init__.py", line 25, in <module>
        from ._lapjv import (
      File "__init__.pxd", line 199, in init lap._lapjv
    ValueError: numpy.ndarray has the wrong size, try recompiling. Expected 80, got 88
    
    

    Using python src/track.py with reid, I got the same problem:

    INFO - main - TRACK SEQ: MOT17-02-DPM
    100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 600/600 [02:15<00:00,  4.44it/s]
    INFO - main - NUM TRACKS: 133 ReIDs: 25
    INFO - main - RUNTIME: 135.04 s
    ERROR - track - Failed after 0:07:17!
    Traceback (most recent calls WITHOUT Sacred internals):
      File "src/track.py", line 153, in main
        mot_accum = get_mot_accum(results, seq_loader)
      File "/media/HardDisk_new/wh/second_code/trackformer/src/trackformer/util/track_utils.py", line 397, in get_mot_accum
        distance)
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/motmetrics/mot.py", line 252, in update
        rids, cids = linear_sum_assignment(dists)
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/motmetrics/lap.py", line 73, in linear_sum_assignment
        rids, cids = solver(costs)
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/motmetrics/lap.py", line 288, in lsa_solve_lapjv
        from lap import lapjv
      File "/home/wh/anaconda3/envs/trackformer/lib/python3.7/site-packages/lap/__init__.py", line 25, in <module>
        from ._lapjv import (
      File "__init__.pxd", line 199, in init lap._lapjv
    ValueError: numpy.ndarray has the wrong size, try recompiling. Expected 80, got 88
    
    opened by a171232886 5
  • RuntimeError: Error compiling objects for extension

    Hello, after installing all the requirements, I met this error when compiling the module (full log in error.txt):

    /usr/include/c++/7/bits/basic_string.tcc:1067:16: error: cannot call member function ‘void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]’ without object
        __p->_M_set_sharable();
        ~~~~~~~~~^~
    /usr/include/c++/7/bits/basic_string.tcc:1067:16: error: cannot call member function ‘void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]’ without object
    ninja: build stopped: subcommand failed.

    I am not sure what I can do to address this issue, so I am asking for help from anyone who has solved this problem. Thank you! My system configuration is as follows: Ubuntu 18.04 LTS, CUDA 10.1, PyTorch 1.5.

    opened by Soyad-yao 5
  • How do you select the initial track queries from the object queries?

    Thank you for the wonderful work. I have read the paper and code, and have a question about track query initialization.

    How do you select the initial track queries from the object queries in the evaluation? In the paper, the following sentences are stated,

    Each valid object detection {b00, b10, . . . } with a classification score above σobject, i.e., output embedding not predicting the background class (crossed), initializes a new track query embedding.

    After reading this, I expected to add the object queries with non-zero class labels to the new track queries. However, when looking at the code, it seems to be extracting only those that match 0.

    new_det_keep = torch.logical_and(
       new_det_scores > self.detection_obj_score_thresh,
       result['labels'][-self.num_object_queries:] == 0)
    

    I believe what is written in the paper is correct, but this implementation is beyond my understanding, could you please tell me what is happening in the implementation? Or if I have extracted the wrong part of the implementation, please let me know the correct part.
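
    To illustrate my reading of that filter with toy numbers (my assumption, since the config uses a focal-loss classification head without an explicit background logit, is that class index 0 denotes the single person class rather than background):

    import torch

    num_object_queries = 5
    detection_obj_score_thresh = 0.4

    new_det_scores = torch.tensor([0.9, 0.1, 0.6, 0.3, 0.8])  # toy scores per object query
    labels = torch.tensor([0, 0, 0, 0, 0])                    # toy predicted class indices

    new_det_keep = torch.logical_and(
        new_det_scores > detection_obj_score_thresh,
        labels[-num_object_queries:] == 0)

    print(new_det_keep)  # tensor([ True, False,  True, False,  True])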

    opened by Tsunehiko 4
  • Why don't the track queries get updated for two_stage?

    https://github.com/timmeinhardt/trackformer/blob/d62d81023dbffb4a1820db39ce527b66df6d7b61/src/trackformer/models/deformable_transformer.py#L180-L230

    I am confused why the track queries don't get updated for the two-stage.

    Also, nice work by the way!

    opened by owen24819 4
  • Using the DEMO code

    This paper is very good! When I use the demo interface code, I always get errors like this:

    $ python src/track.py with \
        dataset_name=DEMO \
        data_root_dir=data/snakeboard \
        output_dir=data/snakeboard \
        write_images=pretty
    

    WARNING - root - Changed type of config entry "write_images" from bool to str WARNING - track - No observers have been added to this run

    In the end, I'll get something like this again: if (isinstance(colors[0], Sized) and len(colors[0]) == 2 IndexError: list index out of range

    I have created a DEMO folder and put the video I'm going to demonstrate into it, but it still reports this error. I guess it's because the video I put in hasn't been converted to COCO format?

    But for a demo, I should be able to input any video and get visual results rather than having to convert it to COCO format.

    I'm very confused. How can I solve this problem?

    opened by Zachein 4
  • Use of args.multi_frame_attention

    Hi @timmeinhardt , thanks so much for this great work!

    While trying to reproduce the results for MOTS20, I noticed some differences between your DeformableDETR and the DETR implementations.

    Could you explain the use of args.multi_frame_attention in the adjusted DeformableDETR? I'm wondering why it is not used in the DETR based model for mask tracking.

    Is multi frame attention not necessary to utilise track queries in the model? I read section 4.2 in the paper, but I'm still a bit confused.

    opened by tragians 3
  • error in ms_deformable_im2col_cuda

    I have installed the MultiScaleDeformableAttention package, but these two errors appear (while the model keeps training):

    error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
    error in ms_deformable_col2im_coord_cuda: no kernel image is available for execution on the device

    opened by hahapt 1
  • Error while train when get dataset

    Hi, when I train the model with python src/train.py with mot17 deformable multi_frame tracking output_dir=models/mot17_deformable_multi_frame, the following error occurs:

    Traceback (most recent call last):
      File "src/train.py", line 357, in <module>
        train(args)
      File "src/train.py", line 284, in train
        visualizers['train'], args)
      File "/home/ubuntu/track/trackformer/src/trackformer/engine.py", line 119, in train_one_epoch
        for i, (samples, targets) in enumerate(metric_logger.log_every(data_loader, epoch)):
      File "/home/ubuntu/track/trackformer/src/trackformer/util/misc.py", line 230, in log_every
        for obj in iterable:
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
        data = self._next_data()
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
        return self._process_data(data)
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
        data.reraise()
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    IndexError: Caught IndexError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/ubuntu/anaconda3/envs/trackformer/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/ubuntu/track/trackformer/src/trackformer/datasets/mot.py", line 49, in __getitem__
        img, target = self._getitem_from_id(idx, random_state, random_jitter=False)
      File "/home/ubuntu/track/trackformer/src/trackformer/datasets/coco.py", line 63, in _getitem_from_id
        img, target = self.prepare(img, target)
      File "/home/ubuntu/track/trackformer/src/trackformer/datasets/coco.py", line 220, in __call__
        masks = convert_coco_poly_to_mask(segmentations, h, w)
      File "/home/ubuntu/track/trackformer/src/trackformer/datasets/coco.py", line 177, in convert_coco_poly_to_mask
        rles = coco_mask.frPyObjects(polygons, height, width)
      File "pycocotools/_mask.pyx", line 292, in pycocotools._mask.frPyObjects
    IndexError: list index out of range

    I found that the images and annotations in the dataset are not paired correctly. But I generated the dataset with this command: python src/generate_coco_from_mot.py, so I don't know what's wrong.

    opened by pjy125175 3
  • What does valid ratio mean?

    Hello,

    In the Deformable Transformer, there is a variable called valid_ratios which is computed from the masks: valid_ratios = torch.stack([self.get_valid_ratio(m) for m in masks], 1). If the masks are None in my case, how am I supposed to calculate it?

    Also, what is the purpose of valid_ratios? I could not find anything about it in the TrackFormer or the original Deformable DETR paper.

    I would appreciate it if you could clarify this.
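
    For reference, my current understanding (a sketch based on the original Deformable DETR code, from memory, so please correct me if it is wrong): the feature maps are padded to the largest image in the batch, the mask marks padded pixels, and the valid ratio is the fraction of each spatial dimension that contains real image content, used to scale the reference points.

    import torch

    def get_valid_ratio(mask):
        # mask: (batch, H, W) bool, True where the feature map is padding
        _, H, W = mask.shape
        valid_h = torch.sum(~mask[:, :, 0], 1)   # number of un-padded rows per sample
        valid_w = torch.sum(~mask[:, 0, :], 1)   # number of un-padded columns per sample
        return torch.stack([valid_w.float() / W, valid_h.float() / H], -1)

    # With no padding (i.e. masks effectively None), every ratio is simply 1.0:
    mask = torch.zeros(2, 32, 48, dtype=torch.bool)
    print(get_valid_ratio(mask))  # tensor([[1., 1.], [1., 1.]])

    So if the masks are None, would a tensor of ones of shape (batch, num_levels, 2) be equivalent?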

    opened by mkhoshle 1
  • Ran into trouble with MOTS

    So my code runs for MOT, but I am getting the following error when I try to run MOTS.

    Command: python src/track.py with dataset_name=MOTS20-ALL obj_detect_checkpoint_file=models/mots20_train_masks/checkpoint.pth

    Error (raised in load_state_dict):
    RuntimeError: Error(s) in loading state_dict for DETRSegmTracking:
        size mismatch for class_embed.weight: copying a param with shape torch.Size([2, 256]) from checkpoint, the shape in current model is torch.Size([21, 256]).
        size mismatch for class_embed.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([21]).

    opened by harkiratbehl 0
  • Making compatible to recent pytorch and python versions

    Hi,

    I have made some minor changes in requirements.txt and INSTALL.md to make the installation work with recent PyTorch and Python versions. I have tested with PyTorch 1.12.1 and Python 3.9.

    Hope this will be helpful to others.

    opened by harkiratbehl 1
Owner
Tim Meinhardt
Ph.D. candidate at the Dynamic Vision and Learning Group, TU Munich