Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight

PCAN for Multiple Object Tracking and Segmentation

This is the official implementation of the paper PCAN for MOTS.

We also present a trailer consisting of method illustrations and tracking & segmentation visualizations. Our project website contains more information: vis.xyz/pub/pcan. The code is still being organized.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021, Spotlight
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single frame predictions for the segmentation mask itself. We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both Youtube-VIS and BDD100K datasets, and shows efficacy to both one-stage and two-stage segmentation frameworks.

Prototypical Cross-Attention Networks (PCAN)
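
To make the abstract's mechanism concrete, the sketch below distills a space-time memory into a handful of prototypes with a soft, EM-style clustering step and then lets the current frame cross-attend to those prototypes instead of the full memory. It is an illustrative simplification written for this README, not the code in this repository; the function names, the number of prototypes, and the clustering details are all assumptions.

```python
# Minimal sketch of the prototypical cross-attention idea, not the repo's implementation.
import torch
import torch.nn.functional as F


def distill_prototypes(memory, num_prototypes=16, em_iters=3):
    """Condense a space-time memory (N x C) into a few prototypes (K x C)
    with a soft, EM-style clustering (an assumed simplification)."""
    n, c = memory.shape
    # Initialize prototypes from random memory entries.
    idx = torch.randperm(n)[:num_prototypes]
    protos = memory[idx].clone()                               # (K, C)
    for _ in range(em_iters):
        # E-step: soft assignment of memory entries to prototypes.
        logits = memory @ protos.t() / c ** 0.5                # (N, K)
        assign = logits.softmax(dim=1)
        # M-step: prototypes become assignment-weighted means of the memory.
        protos = (assign.t() @ memory) / (assign.sum(dim=0, keepdim=True).t() + 1e-6)
        protos = F.normalize(protos, dim=1)
    return protos


def prototypical_cross_attention(query, prototypes):
    """Per-pixel queries (HW x C) read from the distilled prototypes instead of
    the full memory, keeping the cost linear in HW."""
    attn = (query @ prototypes.t() / query.shape[1] ** 0.5).softmax(dim=1)  # (HW, K)
    return attn @ prototypes                                   # (HW, C)


if __name__ == "__main__":
    memory = F.normalize(torch.randn(4096, 256), dim=1)  # features from past frames
    query = torch.randn(2048, 256)                        # features of the current frame
    protos = distill_prototypes(memory)
    out = prototypical_cross_attention(query, protos)
    print(out.shape)  # torch.Size([2048, 256])
```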

Main results

Results on BDD100K

| Detector | mMOTSA-val | mIDF1-val | ID Sw.-val | Scores-val | mMOTSA-test | mIDF1-test | ID Sw.-test | Scores-test | Config | Weights | Preds | Visuals |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| ResNet-50 | 28.1 | 45.4 | 874 | scores | 31.9 | 50.4 | 845 | scores | config | model \| MD5 | preds | visuals |

Installation

Please refer to INSTALL.md for installation instructions.

Usages

Please refer to GET_STARTED.md for dataset preparation and running instructions.

Citation

If you find PCAN useful in your research or refer to the provided baseline results, please star this repository and consider citing 📝 :

@inproceedings{pcan,
    author    = {Ke, Lei and Li, Xia and Danelljan, Martin and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
    booktitle = {Advances in Neural Information Processing Systems},
    title     = {Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation},
    year      = {2021}
}
Comments
  • TypeError: EMQuasiDenseFasterRCNN: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'

    Thanks for your excellent work. While I was running train.py, I ran into this error:

    Traceback (most recent call last):
      File "D:\ProgramData\Anaconda3\envs\PCAN\lib\site-packages\mmcv\utils\registry.py", line 179, in build_from_cfg
        return obj_cls(**args)
    TypeError: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'
    
    During handling of the above exception, another exception occurred:
        raise type(e)(f'{obj_cls.__name__}: {e}')
    TypeError: EMQuasiDenseFasterRCNN: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'
    

    What is a possible solution to this problem? (A hedged config sketch follows this comment.)

    opened by Zachein 13
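
    A note on the error above: mmcv's registry builds the detector by passing every key of the config's model dict (except type) to the class constructor, so this TypeError usually means the config being used does not define channels, proto_num, and stage_num for EMQuasiDenseFasterRCNN. Below is a hedged sketch of what the missing entries might look like; the values are assumptions and should be checked against the configs shipped with the repository.

```python
# Illustrative only: mmcv builds the model as EMQuasiDenseFasterRCNN(**model_cfg),
# so the three reported arguments must appear in the `model` dict of the config.
# The values below are assumptions, not verified settings.
model = dict(
    type='EMQuasiDenseFasterRCNN',
    channels=256,   # feature channels fed to the EM attention module (assumed)
    proto_num=10,   # number of prototypes (assumed)
    stage_num=3,    # number of EM iterations (assumed)
    # ... backbone / neck / rpn_head / roi_head / tracker as in the shipped configs ...
)
```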
  •  No such file or directory

    I trained the model after following GET_STARTED.md for preparation. However, the following problem arose: FileNotFoundError: [Errno 2] No such file or directory: 'data/bdd/images/10k/train/fee92217-63b3f87f.jpg'.

    I checked the JSON file and the dataset, and found that certain image names appear in the JSON file but the corresponding .jpg files are not in the folder. Please help me! (A hedged script for listing the missing files follows this comment.)

    opened by laybebe 6
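
    A possible way to narrow down the issue above: the traceback shows the loader looking under data/bdd/images/10k/, while the provided segmentation-tracking configs point at images/seg_track_20/, so a mismatched img_prefix or an incomplete download are both plausible causes. The hedged helper below lists every image that the COCO-format annotation file references but that is missing on disk; both paths are assumptions to adapt to your layout.

```python
# Hedged helper: list images referenced by the COCO-format annotation file
# but missing from the image folder. Paths are assumptions.
import json
import os

ann_file = 'data/bdd/labels/seg_track_train_cocoformat.json'  # assumed path
img_root = 'data/bdd/images/seg_track_20/train'               # assumed path

with open(ann_file) as f:
    anns = json.load(f)

missing = [img['file_name'] for img in anns['images']
           if not os.path.exists(os.path.join(img_root, img['file_name']))]
print(f'{len(missing)} of {len(anns["images"])} referenced images are missing')
print('\n'.join(missing[:20]))
```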
  • Why does this command cause the computer to crash

    python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/ --nproc 2

    or

    python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/ --nproc 1

    or

    python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/

    Then the mouse and keyboard stop responding.

    opened by Alxx999 6
  • No BDD100K format jsons generated after running convert_to_bdd.sh

    No BDD100K-format JSONs (for the MOTS challenge submission) are generated after running convert_to_bdd.sh. In the pcan-main/converted_results/seg_track directory, PNG masks are generated, but I cannot find any JSON files.

    my convert_to_bdd.sh: python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_result_pcan_test.pkl --task 'seg_track' --bdd-dir converted_results

    opened by f414158949 5
  • An error encountered while running test code

    1. When I run the following test command, I get the error shown below:

    python tools/test.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/latest.pth
    --out work_dirs/resnest/result.pkl --format-only --show

    error: Traceback (most recent call last): File "tools/test.py", line 164, in main() File "tools/test.py", line 153, in main dataset.format_results(outputs, **kwargs) File "/home/lin/anaconda3/envs/pcan/lib/python3.6/site-packages/mmdet/datasets/coco.py", line 350, in format_results assert isinstance(results, list), 'results must be a list' AssertionError: results must be a list

    My configuration file is as follows model = dict( type='EMQuasiDenseMaskRCNNRefine', pretrained=None, backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), neck=dict( type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256, num_outs=5), rpn_head=dict( type='RPNHead', in_channels=256, feat_channels=256, anchor_generator=dict( type='AnchorGenerator', scales=[8], ratios=[0.5, 1.0, 2.0], strides=[4, 8, 16, 32, 64]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[1.0, 1.0, 1.0, 1.0]), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), loss_bbox=dict( type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)), roi_head=dict( type='QuasiDenseSegRoIHeadRefine', bbox_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), bbox_head=dict( type='Shared2FCBBoxHead', in_channels=256, fc_out_channels=1024, roi_feat_size=7, num_classes=8, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.1, 0.1, 0.2, 0.2]), reg_class_agnostic=False, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)), track_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), track_head=dict( type='QuasiDenseEmbedHead', num_convs=4, num_fcs=1, embed_channels=256, norm_cfg=dict(type='GN', num_groups=32), loss_track=dict(type='MultiPosCrossEntropyLoss', loss_weight=0.25), loss_track_aux=dict( type='L2Loss', neg_pos_ub=3, pos_margin=0, neg_margin=0.3, hard_mining=True, loss_weight=1.0)), mask_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), mask_head=dict( type='FCNMaskHeadPlus', num_convs=4, in_channels=256, conv_out_channels=256, num_classes=8, loss_mask=dict( type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)), double_train=False, refine_head=dict( type='EMMatchHeadPlus', num_convs=4, in_channels=256, conv_kernel_size=3, conv_out_channels=256, upsample_method='deconv', upsample_ratio=2, num_classes=8, pos_proto_num=10, neg_proto_num=10, stage_num=6, conv_cfg=None, norm_cfg=None, mask_thr_binary=0.5, match_score_thr=0.5, with_mask_ref=False, with_mask_key=True, with_dilation=False, loss_mask=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0))), tracker=dict( type='QuasiDenseSegFeatEmbedTracker', init_score_thr=0.7, obj_score_thr=0.3, match_score_thr=0.5, memo_tracklet_frames=10, memo_backdrop_frames=1, memo_momentum=0.8, nms_conf_thr=0.5, nms_backdrop_iou_thr=0.3, nms_class_iou_thr=0.7, with_cats=True, match_metric='bisoftmax'), train_cfg=dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=-1, pos_weight=-1, debug=False), rpn_proposal=dict( nms_across_levels=False, nms_pre=2000, nms_post=1000, max_num=1000, nms_thr=0.7, min_bbox_size=0), rcnn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, match_low_quality=False, 
ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False, mask_size=28), embed=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.5, match_low_quality=False, ignore_iof_thr=-1), sampler=dict( type='CombinedSampler', num=256, pos_fraction=0.5, neg_pos_ub=3, add_gt_as_proposals=True, pos_sampler=dict(type='InstanceBalancedPosSampler'), neg_sampler=dict( type='IoUBalancedNegSampler', floor_thr=-1, floor_fraction=0, num_bins=3)))), test_cfg=dict( rpn=dict( nms_across_levels=False, nms_pre=1000, nms_post=1000, max_num=1000, nms_thr=0.7, min_bbox_size=0), rcnn=dict( score_thr=0.5, nms=dict(type='nms', iou_threshold=0.5), max_per_img=100, mask_thr_binary=0.5)), fixed=True) dataset_type = 'BDDVideoDataset' data_root = '' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict( type='SeqLoadAnnotations', with_bbox=True, with_ins_id=True, with_mask=True), dict(type='SeqResize', img_scale=(1296, 720), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=32), dict(type='SeqDefaultFormatBundle'), dict( type='SeqCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices', 'gt_masks'], ref_prefix='ref') ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1296, 720), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ] data = dict( samples_per_gpu=8, workers_per_gpu=2, train=[ dict( type='BDDVideoDataset', ann_file='/media/lin/文件/bdd/labels/seg_track_train_cocoformat.json', img_prefix='/media/lin/文件/bdd/images/seg_track_20/train', key_img_sampler=dict(interval=1), ref_img_sampler=dict(num_ref_imgs=1, scope=3, method='uniform'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict( type='SeqLoadAnnotations', with_bbox=True, with_ins_id=True, with_mask=True), dict(type='SeqResize', img_scale=(1296, 720), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=32), dict(type='SeqDefaultFormatBundle'), dict( type='SeqCollect', keys=[ 'img', 'gt_bboxes', 'gt_labels', 'gt_match_indices', 'gt_masks' ], ref_prefix='ref') ]) ], val=dict( type='BDDVideoDataset', ann_file='/media/lin/文件/bdd/labels/seg_track_val_cocoformat.json', img_prefix='/media/lin/文件/bdd/images/seg_track_20/val', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1296, 720), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ]), test=dict( type='BDDVideoDataset', ann_file='/media/lin/文件/bdd/labels/seg_track_test_cocoformat.json', img_prefix='/media/lin/文件/bdd/images/seg_track_20/test', pipeline=[ 
dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1296, 720), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ])) optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=None) lr_config = dict( policy='step', warmup='linear', warmup_iters=1000, warmup_ratio=0.001, step=[8, 11]) checkpoint_config = dict(interval=1) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) total_epochs = 12 dist_params = dict(backend='nccl') log_level = 'INFO' load_from = './ckpts/segtrack-fixed-new.pth' resume_from = None workflow = [('train', 1)] evaluation = dict(metric=['bbox', 'segm', 'segtrack'], interval=12) work_dir = './work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan' gpu_ids = range(0, 1)

    2. When I run the following test command, I get the error shown below:

    python tools/test.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/latest.pth --eval bbox segm segtrack

    Error Traceback (most recent call last): File "tools/test.py", line 164, in main() File "tools/test.py", line 160, in main print(dataset.evaluate(outputs, **eval_kwargs)) File "/home/lin/Desktop/pcan/pcan/datasets/coco_video_dataset.py", line 317, in evaluate class_average=mot_class_average) File "/home/lin/Desktop/pcan/pcan/core/evaluation/mots.py", line 30, in eval_mots preprocessResult(all_results, anns, cats_mapping) File "/home/lin/Desktop/pcan/pcan/core/evaluation/mot.py", line 48, in preprocessResult for i, bbox in enumerate(anns['annotations']): # enumerate, i is index, line is content KeyError: 'annotations'

    3. When I changed the backbone from ResNet to ResNeSt for training, without modifying any other parts, the validation accuracy after training was all 0.

    After changing the backbone network, what else needs to be changed? (A hedged backbone sketch follows this comment.)

    opened by fxooooooooo 5
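
    Regarding question 3 above: swapping the backbone in an mmdet-style config usually takes more than changing type='ResNet' to type='ResNeSt', because the new backbone has its own constructor arguments and its own ImageNet-pretrained weights; without the latter, training effectively starts from random backbone weights, which can explain near-zero validation scores after only 12 epochs. Below is a hedged sketch of what the backbone block might look like; every value is an assumption to verify against mmdetection's own ResNeSt configs.

```python
# Illustrative backbone swap for an mmdet-style config; values are assumptions
# and should be checked against mmdetection's ResNeSt configs for your version.
model = dict(
    pretrained='open-mmlab://resnest50',   # load ImageNet weights; otherwise the backbone starts from scratch
    backbone=dict(
        type='ResNeSt',
        depth=50,
        stem_channels=64,
        radix=2,
        reduction_factor=4,
        avg_down_stride=True,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    # the FPN in_channels stay [256, 512, 1024, 2048] for a depth-50 backbone
)
```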
  • cannot reproduce the results in README

    My bdd_results are as shown in this image: after running tools/test.py with the config and pretrained weights given in the README, I cannot reproduce the "Scores-val" results reported there. Did you also obtain those numbers on seg_track_val_cocoformat.json (which can be downloaded from the BDD100K website)? I am really confused about that.

    opened by jkd2021 4
  • Pretrained model

    Dear author, this error occurred when I used the model you provided:

    _pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.

    How should I solve this? What other models can I use? (A hedged loading check follows this comment.)

    opened by fxooooooooo 4
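
    A note on the UnpicklingError above: that exact message is what pickle.load raises when it is given a checkpoint written by torch.save (which stores tensor data through persistent ids), and a partially downloaded file can fail in a similar way. The hedged check below loads the checkpoint with torch.load instead and prints its top-level keys; the path is a placeholder. Comparing the file's MD5 against the value linked next to the weights in the results table is also worth doing.

```python
# Hedged check: open the checkpoint with torch.load, not pickle.load.
import torch

ckpt_path = 'ckpts/segtrack-fixed-new.pth'  # placeholder; point this at your downloaded file
ckpt = torch.load(ckpt_path, map_location='cpu')
print(type(ckpt), list(ckpt.keys())[:5] if isinstance(ckpt, dict) else None)
```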
  • Visualization Script

    Thank you for the great work!

    I actually have a custom dataset on which I would like to create MOTS visualizations. Additionally, I would like to perform performance evaluations (for the MOTS task) on my dataset.

    Can you let me know if a script is available to directly create visualizations on a set of images? Also, can you provide a brief description of how custom datasets need to be prepared, to run the MOTS scripts? Thanks in advance!

    opened by rahul1801 3
  • How should this error be resolved?

    creating index... index created! 2022-04-17 21:40:15,020 - pcan - INFO - Start running, host: lin@lin, work_dir: /home/lin/Desktop/pcan/work_dirs/4.17 2022-04-17 21:40:15,020 - pcan - INFO - workflow: [('train', 1)], max: 12 epochs /home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmdet/models/dense_heads/rpn_head.py:180: UserWarning: In rpn_proposal or test_cfg, nms_thr has been moved to a dict named nms as iou_threshold, max_num has been renamed as max_per_img, name of original arguments and the way to specify iou_threshold of NMS will be deprecated. 'In rpn_proposal or test_cfg, ' Traceback (most recent call last): File "tools/train.py", line 168, in main() File "tools/train.py", line 164, in main meta=meta) File "/home/lin/Desktop/pcan/pcan/apis/train.py", line 123, in train_model runner.run(data_loaders, cfg.workflow, cfg.total_epochs) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run epoch_runner(data_loaders[i], **kwargs) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train self.run_iter(data_batch, train_mode=True) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter **kwargs) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step return self.module.train_step(*inputs[0], **kwargs[0]) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 247, in train_step losses = self(**data) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/lin/Desktop/pcan/pcan/models/mot/quasi_dense_pcan.py", line 45, in forward return self.forward_train(img, img_metas, **kwargs) File "/home/lin/Desktop/pcan/pcan/models/mot/quasi_dense_pcan.py", line 86, in forward_train ref_gt_bboxes_ignore, ref_gt_masks, **kwargs) File "/home/lin/Desktop/pcan/pcan/models/roi_heads/quasi_dense_roi_head.py", line 173, in forward_train if mask_results['loss_dice'] is not None: KeyError: 'loss_dice'

    opened by fxooooooooo 3
  • Will a demo script for testing be added later?

    I tried to plug video_demo.py from the mmdetection demos into your network, and it shows the following problem: Traceback (most recent call last): File "demo/video_demo.py", line 60, in main() File "demo/video_demo.py", line 35, in main model = init_detector(args.config, args.checkpoint, device=args.device) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/apis/inference.py", line 39, in init_detector model = build_detector(config.model, test_cfg=config.get('test_cfg')) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 77, in build_detector return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg)) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 34, in build return build_from_cfg(cfg, registry, default_args) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/utils/registry.py", line 172, in build_from_cfg f'{obj_type} is not in the {registry.name} registry') KeyError: 'QuasiDenseMaskRCNN is not in the detector registry' (pcan) zhy@king:~/pcan$ python test_video.py demo/demo.mp4 configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py checkpoints/pcan_pretrained_model.pth --show configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py Traceback (most recent call last): File "test_video.py", line 61, in main() File "test_video.py", line 36, in main model = init_detector(args.config, args.checkpoint, device=args.device) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/apis/inference.py", line 39, in init_detector model = build_detector(config.model, test_cfg=config.get('test_cfg')) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 77, in build_detector return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg)) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 34, in build return build_from_cfg(cfg, registry, default_args) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/utils/registry.py", line 172, in build_from_cfg f'{obj_type} is not in the {registry.name} registry') KeyError: 'QuasiDenseMaskRCNN is not in the detector registry' Searching related issues, the problem seems to lie in setup.py. I hope you can clear up my confusion, thanks. (A hedged registry-import sketch follows this comment.)

    opened by heuyu980817 3
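
    A note on the registry error above: mmdet's init_detector only knows about detectors that have been registered by imported modules, and the mmdetection demo scripts never import this repository's models. Below is a hedged sketch of a workaround, assuming the package has been installed in the environment (for example with pip install -v -e ., as in INSTALL.md) and that importing it registers the custom detectors; the import line is an assumption about where that registration happens.

```python
# Hedged sketch: register this repository's custom detectors before building the model.
from mmdet.apis import init_detector

import pcan  # noqa: F401  -- assumption: importing the package registers its custom detectors

config = 'configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py'
checkpoint = 'checkpoints/pcan_pretrained_model.pth'
model = init_detector(config, checkpoint, device='cuda:0')
```

    The repository's own tools/test.py presumably works because it imports the package's models itself before building the detector.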
  • The time it takes to train the model?

    In your paper, I found "Our model is trained with initial learning rate 0.0025 on 4 GPUs using SGD, and executes with a speed of 15.0 FPS on ResNet-50." Could you tell me the type of GPU and how long it takes to train the PCAN model on the BDD100K segmentation-tracking dataset, both when using the initial model weights from the BDD100K MOT tracking set and when not using them? Thanks.

    opened by f414158949 2
  • Training on a custom dataset

    What do I need to do if I want to train on a custom dataset? I use CVAT to annotate in MOTS format, and I don't know what format the annotation file should be exported in. (A hedged annotation-layout sketch follows this comment.)

    opened by monster519 0
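
    On the custom-dataset question above: the BDD configs in this repository read a COCO-style video annotation file (the *_cocoformat.json files referenced in the configs), so annotations exported from CVAT would need to be converted into that layout. The sketch below shows the general shape such a file has under the coco-vid convention used by BDD100K-style codebases; the exact field names and required keys are assumptions to verify against a real seg_track_*_cocoformat.json.

```python
# Hedged sketch of a COCO-style video ("cocoformat") annotation layout.
# Field names follow the coco-vid convention and are assumptions to verify.
annotations_json = dict(
    categories=[dict(id=1, name='pedestrian'), dict(id=2, name='car')],  # example classes
    videos=[dict(id=1, name='my_video_001')],                            # hypothetical video name
    images=[dict(
        id=1, video_id=1, frame_id=0,
        file_name='my_video_001/frame_0000001.jpg',                      # hypothetical path
        width=1280, height=720)],
    annotations=[dict(
        id=1, image_id=1, category_id=2,
        instance_id=7,                      # links the same object across frames
        bbox=[100.0, 200.0, 50.0, 40.0], area=2000.0,
        segmentation=dict(size=[720, 1280], counts='...'),  # RLE mask; '...' is a placeholder
        iscrowd=0)],
)
```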
  • Where are the locations of prototype cross-attention in training?

    Hi, thank you for your great work. I'm now training the model, but I found that no prototypes are updated during training; they only come into play at test time. Does that mean the cross-attention only appears in training?

    opened by jkd2021 0
  • Confusion on the ablation study in the paper

    Hi, I really appreciate your great work on PCAN. I'm now reading the ablation-study part of the paper, and I feel really confused about Table 3 and Table 4:

    My questions are: 1. Does the "varying temporal memory length" in Table 3 refer to the "memo_tracklet_frames" setting in the tracker, or something else?

    2. Which part of the code does the "multi-layer prototypical feature fusion" in Table 4 correspond to? Does it refer to the "memo_banks" shown, or something else?

    I'd definitely appreciate it if somebody could help!

    opened by jkd2021 0
Owner
ETH VIS Group
Visual Intelligence and Systems Group at ETH Zürich