Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight

PCAN for Multiple Object Tracking and Segmentation

This is the official implementation of the paper PCAN for MOTS.

We also present a trailer consisting of method illustrations and tracking & segmentation visualizations. Our project website contains more information: vis.xyz/pub/pcan. The code is still being organized.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021, Spotlight
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single frame predictions for the segmentation mask itself. We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both Youtube-VIS and BDD100K datasets, and shows efficacy to both one-stage and two-stage segmentation frameworks.

Prototypical Cross-Attention Networks (PCAN)
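
To make the abstract's mechanism concrete, the sketch below distills a space-time memory into a handful of prototypes with a soft, EM-style clustering step and then lets the current frame cross-attend to those prototypes instead of the full memory. It is an illustrative simplification written for this README, not the code in this repository; the function names, the number of prototypes, and the clustering details are all assumptions.

```python
# Minimal sketch of the prototypical cross-attention idea, not the repo's implementation.
import torch
import torch.nn.functional as F


def distill_prototypes(memory, num_prototypes=16, em_iters=3):
    """Condense a space-time memory (N x C) into a few prototypes (K x C)
    with a soft, EM-style clustering (an assumed simplification)."""
    n, c = memory.shape
    # Initialize prototypes from random memory entries.
    idx = torch.randperm(n)[:num_prototypes]
    protos = memory[idx].clone()                               # (K, C)
    for _ in range(em_iters):
        # E-step: soft assignment of memory entries to prototypes.
        logits = memory @ protos.t() / c ** 0.5                # (N, K)
        assign = logits.softmax(dim=1)
        # M-step: prototypes become assignment-weighted means of the memory.
        protos = (assign.t() @ memory) / (assign.sum(dim=0, keepdim=True).t() + 1e-6)
        protos = F.normalize(protos, dim=1)
    return protos


def prototypical_cross_attention(query, prototypes):
    """Per-pixel queries (HW x C) read from the distilled prototypes instead of
    the full memory, keeping the cost linear in HW."""
    attn = (query @ prototypes.t() / query.shape[1] ** 0.5).softmax(dim=1)  # (HW, K)
    return attn @ prototypes                                   # (HW, C)


if __name__ == "__main__":
    memory = F.normalize(torch.randn(4096, 256), dim=1)  # features from past frames
    query = torch.randn(2048, 256)                        # features of the current frame
    protos = distill_prototypes(memory)
    out = prototypical_cross_attention(query, protos)
    print(out.shape)  # torch.Size([2048, 256])
```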

Main results

Results on BDD100K

| Detector | mMOTSA-val | mIDF1-val | ID Sw.-val | Scores-val | mMOTSA-test | mIDF1-test | ID Sw.-test | Scores-test | Config | Weights | Preds | Visuals |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| ResNet-50 | 28.1 | 45.4 | 874 | scores | 31.9 | 50.4 | 845 | scores | config | model \| MD5 | preds | visuals |

Installation

Please refer to INSTALL.md for installation instructions.

Usages

Please refer to GET_STARTED.md for dataset preparation and running instructions.

Citation

If you find PCAN useful in your research or refer to the provided baseline results, please star this repository and consider citing 📝 :

@inproceedings{pcan,
    author    = {Ke, Lei and Li, Xia and Danelljan, Martin and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
    booktitle = {Advances in Neural Information Processing Systems},
    title     = {Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation},
    year      = {2021}
}
Comments
  • TypeError: EMQuasiDenseFasterRCNN: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'

    Thanks for your excellent work. While I was running train.py, I ran into this error:

    Traceback (most recent call last):
      File "D:\ProgramData\Anaconda3\envs\PCAN\lib\site-packages\mmcv\utils\registry.py", line 179, in build_from_cfg
        return obj_cls(**args)
    TypeError: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'
    
    During handling of the above exception, another exception occurred:
        raise type(e)(f'{obj_cls.__name__}: {e}')
    TypeError: EMQuasiDenseFasterRCNN: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'
    

    What is a possible solution to this problem? (A hedged config sketch follows this comment.)

    opened by Zachein 13
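
    A note on the error above: mmcv's registry builds the detector by passing every key of the config's model dict (except type) to the class constructor, so this TypeError usually means the config being used does not define channels, proto_num, and stage_num for EMQuasiDenseFasterRCNN. Below is a hedged sketch of what the missing entries might look like; the values are assumptions and should be checked against the configs shipped with the repository.

```python
# Illustrative only: mmcv builds the model as EMQuasiDenseFasterRCNN(**model_cfg),
# so the three reported arguments must appear in the `model` dict of the config.
# The values below are assumptions, not verified settings.
model = dict(
    type='EMQuasiDenseFasterRCNN',
    channels=256,   # feature channels fed to the EM attention module (assumed)
    proto_num=10,   # number of prototypes (assumed)
    stage_num=3,    # number of EM iterations (assumed)
    # ... backbone / neck / rpn_head / roi_head / tracker as in the shipped configs ...
)
```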
  •  No such file or directory

    I trained the model after following GET_STARTED.md for preparation. However, the following problem arose: FileNotFoundError: [Errno 2] No such file or directory: 'data/bdd/images/10k/train/fee92217-63b3f87f.jpg'.

    I checked the JSON file and the dataset, and found that certain image names appear in the JSON file but the corresponding .jpg files are not in the folder. Please help me! (A hedged script for listing the missing files follows this comment.)

    opened by laybebe 6
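
    A possible way to narrow down the issue above: the traceback shows the loader looking under data/bdd/images/10k/, while the provided segmentation-tracking configs point at images/seg_track_20/, so a mismatched img_prefix or an incomplete download are both plausible causes. The hedged helper below lists every image that the COCO-format annotation file references but that is missing on disk; both paths are assumptions to adapt to your layout.

```python
# Hedged helper: list images referenced by the COCO-format annotation file
# but missing from the image folder. Paths are assumptions.
import json
import os

ann_file = 'data/bdd/labels/seg_track_train_cocoformat.json'  # assumed path
img_root = 'data/bdd/images/seg_track_20/train'               # assumed path

with open(ann_file) as f:
    anns = json.load(f)

missing = [img['file_name'] for img in anns['images']
           if not os.path.exists(os.path.join(img_root, img['file_name']))]
print(f'{len(missing)} of {len(anns["images"])} referenced images are missing')
print('\n'.join(missing[:20]))
```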
  • Why does this command cause the computer to crash

    python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/ --nproc 2

    or

    python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/ --nproc 1

    or

    python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/

    Then the mouse and keyboard stop responding.

    opened by Alxx999 6
  • No BDD100K format jsons generated after running convert_to_bdd.sh

    No BDD100K-format JSONs (for the MOTS challenge submission) are generated after running convert_to_bdd.sh. In the pcan-main/converted_results/seg_track directory, PNG masks are generated, but I cannot find any JSON files.

    my convert_to_bdd.sh: python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_result_pcan_test.pkl --task 'seg_track' --bdd-dir converted_results

    opened by f414158949 5
  • An error encountered while running test code

    1. When I run the following test command, I get the error shown below:

    python tools/test.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/latest.pth
    --out work_dirs/resnest/result.pkl --format-only --show

    error: Traceback (most recent call last): File "tools/test.py", line 164, in main() File "tools/test.py", line 153, in main dataset.format_results(outputs, **kwargs) File "/home/lin/anaconda3/envs/pcan/lib/python3.6/site-packages/mmdet/datasets/coco.py", line 350, in format_results assert isinstance(results, list), 'results must be a list' AssertionError: results must be a list

    My configuration file is as follows model = dict( type='EMQuasiDenseMaskRCNNRefine', pretrained=None, backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), neck=dict( type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256, num_outs=5), rpn_head=dict( type='RPNHead', in_channels=256, feat_channels=256, anchor_generator=dict( type='AnchorGenerator', scales=[8], ratios=[0.5, 1.0, 2.0], strides=[4, 8, 16, 32, 64]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[1.0, 1.0, 1.0, 1.0]), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), loss_bbox=dict( type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)), roi_head=dict( type='QuasiDenseSegRoIHeadRefine', bbox_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), bbox_head=dict( type='Shared2FCBBoxHead', in_channels=256, fc_out_channels=1024, roi_feat_size=7, num_classes=8, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.1, 0.1, 0.2, 0.2]), reg_class_agnostic=False, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)), track_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), track_head=dict( type='QuasiDenseEmbedHead', num_convs=4, num_fcs=1, embed_channels=256, norm_cfg=dict(type='GN', num_groups=32), loss_track=dict(type='MultiPosCrossEntropyLoss', loss_weight=0.25), loss_track_aux=dict( type='L2Loss', neg_pos_ub=3, pos_margin=0, neg_margin=0.3, hard_mining=True, loss_weight=1.0)), mask_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), mask_head=dict( type='FCNMaskHeadPlus', num_convs=4, in_channels=256, conv_out_channels=256, num_classes=8, loss_mask=dict( type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)), double_train=False, refine_head=dict( type='EMMatchHeadPlus', num_convs=4, in_channels=256, conv_kernel_size=3, conv_out_channels=256, upsample_method='deconv', upsample_ratio=2, num_classes=8, pos_proto_num=10, neg_proto_num=10, stage_num=6, conv_cfg=None, norm_cfg=None, mask_thr_binary=0.5, match_score_thr=0.5, with_mask_ref=False, with_mask_key=True, with_dilation=False, loss_mask=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0))), tracker=dict( type='QuasiDenseSegFeatEmbedTracker', init_score_thr=0.7, obj_score_thr=0.3, match_score_thr=0.5, memo_tracklet_frames=10, memo_backdrop_frames=1, memo_momentum=0.8, nms_conf_thr=0.5, nms_backdrop_iou_thr=0.3, nms_class_iou_thr=0.7, with_cats=True, match_metric='bisoftmax'), train_cfg=dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=-1, pos_weight=-1, debug=False), rpn_proposal=dict( nms_across_levels=False, nms_pre=2000, nms_post=1000, max_num=1000, nms_thr=0.7, min_bbox_size=0), rcnn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, match_low_quality=False, 
ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False, mask_size=28), embed=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.5, match_low_quality=False, ignore_iof_thr=-1), sampler=dict( type='CombinedSampler', num=256, pos_fraction=0.5, neg_pos_ub=3, add_gt_as_proposals=True, pos_sampler=dict(type='InstanceBalancedPosSampler'), neg_sampler=dict( type='IoUBalancedNegSampler', floor_thr=-1, floor_fraction=0, num_bins=3)))), test_cfg=dict( rpn=dict( nms_across_levels=False, nms_pre=1000, nms_post=1000, max_num=1000, nms_thr=0.7, min_bbox_size=0), rcnn=dict( score_thr=0.5, nms=dict(type='nms', iou_threshold=0.5), max_per_img=100, mask_thr_binary=0.5)), fixed=True) dataset_type = 'BDDVideoDataset' data_root = '' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict( type='SeqLoadAnnotations', with_bbox=True, with_ins_id=True, with_mask=True), dict(type='SeqResize', img_scale=(1296, 720), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=32), dict(type='SeqDefaultFormatBundle'), dict( type='SeqCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices', 'gt_masks'], ref_prefix='ref') ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1296, 720), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ] data = dict( samples_per_gpu=8, workers_per_gpu=2, train=[ dict( type='BDDVideoDataset', ann_file='/media/lin/文件/bdd/labels/seg_track_train_cocoformat.json', img_prefix='/media/lin/文件/bdd/images/seg_track_20/train', key_img_sampler=dict(interval=1), ref_img_sampler=dict(num_ref_imgs=1, scope=3, method='uniform'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict( type='SeqLoadAnnotations', with_bbox=True, with_ins_id=True, with_mask=True), dict(type='SeqResize', img_scale=(1296, 720), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=32), dict(type='SeqDefaultFormatBundle'), dict( type='SeqCollect', keys=[ 'img', 'gt_bboxes', 'gt_labels', 'gt_match_indices', 'gt_masks' ], ref_prefix='ref') ]) ], val=dict( type='BDDVideoDataset', ann_file='/media/lin/文件/bdd/labels/seg_track_val_cocoformat.json', img_prefix='/media/lin/文件/bdd/images/seg_track_20/val', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1296, 720), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ]), test=dict( type='BDDVideoDataset', ann_file='/media/lin/文件/bdd/labels/seg_track_test_cocoformat.json', img_prefix='/media/lin/文件/bdd/images/seg_track_20/test', pipeline=[ 
dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1296, 720), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ])) optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=None) lr_config = dict( policy='step', warmup='linear', warmup_iters=1000, warmup_ratio=0.001, step=[8, 11]) checkpoint_config = dict(interval=1) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) total_epochs = 12 dist_params = dict(backend='nccl') log_level = 'INFO' load_from = './ckpts/segtrack-fixed-new.pth' resume_from = None workflow = [('train', 1)] evaluation = dict(metric=['bbox', 'segm', 'segtrack'], interval=12) work_dir = './work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan' gpu_ids = range(0, 1)

    2. When I run the following test command, I get the error shown below:

    python tools/test.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/latest.pth --eval bbox segm segtrack

    Error Traceback (most recent call last): File "tools/test.py", line 164, in main() File "tools/test.py", line 160, in main print(dataset.evaluate(outputs, **eval_kwargs)) File "/home/lin/Desktop/pcan/pcan/datasets/coco_video_dataset.py", line 317, in evaluate class_average=mot_class_average) File "/home/lin/Desktop/pcan/pcan/core/evaluation/mots.py", line 30, in eval_mots preprocessResult(all_results, anns, cats_mapping) File "/home/lin/Desktop/pcan/pcan/core/evaluation/mot.py", line 48, in preprocessResult for i, bbox in enumerate(anns['annotations']): # enumerate, i is index, line is content KeyError: 'annotations'

    3. When I changed the backbone from ResNet to ResNeSt for training, without modifying any other parts, the validation accuracy after training was all 0.

    After changing the backbone network, what else needs to be changed? (A hedged backbone sketch follows this comment.)

    opened by fxooooooooo 5
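
    Regarding question 3 above: swapping the backbone in an mmdet-style config usually takes more than changing type='ResNet' to type='ResNeSt', because the new backbone has its own constructor arguments and its own ImageNet-pretrained weights; without the latter, training effectively starts from random backbone weights, which can explain near-zero validation scores after only 12 epochs. Below is a hedged sketch of what the backbone block might look like; every value is an assumption to verify against mmdetection's own ResNeSt configs.

```python
# Illustrative backbone swap for an mmdet-style config; values are assumptions
# and should be checked against mmdetection's ResNeSt configs for your version.
model = dict(
    pretrained='open-mmlab://resnest50',   # load ImageNet weights; otherwise the backbone starts from scratch
    backbone=dict(
        type='ResNeSt',
        depth=50,
        stem_channels=64,
        radix=2,
        reduction_factor=4,
        avg_down_stride=True,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    # the FPN in_channels stay [256, 512, 1024, 2048] for a depth-50 backbone
)
```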
  • cannot reproduce the results in README

    My bdd_results are as shown in this image: after running tools/test.py with the config and pretrained weights given in the README, I cannot reproduce the "Scores-val" results reported there. Did you also obtain those numbers on seg_track_val_cocoformat.json (which can be downloaded from the BDD100K website)? I am really confused about that.

    opened by jkd2021 4
  • Pretrained model

    Dear author, this error occurred when I used the model you provided:

    _pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.

    How should I solve this? What other models can I use? (A hedged loading check follows this comment.)

    opened by fxooooooooo 4
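
    A note on the UnpicklingError above: that exact message is what pickle.load raises when it is given a checkpoint written by torch.save (which stores tensor data through persistent ids), and a partially downloaded file can fail in a similar way. The hedged check below loads the checkpoint with torch.load instead and prints its top-level keys; the path is a placeholder. Comparing the file's MD5 against the value linked next to the weights in the results table is also worth doing.

```python
# Hedged check: open the checkpoint with torch.load, not pickle.load.
import torch

ckpt_path = 'ckpts/segtrack-fixed-new.pth'  # placeholder; point this at your downloaded file
ckpt = torch.load(ckpt_path, map_location='cpu')
print(type(ckpt), list(ckpt.keys())[:5] if isinstance(ckpt, dict) else None)
```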
  • Visualization Script

    Thank you for the great work!

    I actually have a custom dataset on which I would like to create MOTS visualizations. Additionally, I would like to perform performance evaluations (for the MOTS task) on my dataset.

    Can you let me know if a script is available to directly create visualizations on a set of images? Also, can you provide a brief description of how custom datasets need to be prepared, to run the MOTS scripts? Thanks in advance!

    opened by rahul1801 3
  • How should this error be resolved?

    creating index... index created! 2022-04-17 21:40:15,020 - pcan - INFO - Start running, host: lin@lin, work_dir: /home/lin/Desktop/pcan/work_dirs/4.17 2022-04-17 21:40:15,020 - pcan - INFO - workflow: [('train', 1)], max: 12 epochs /home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmdet/models/dense_heads/rpn_head.py:180: UserWarning: In rpn_proposal or test_cfg, nms_thr has been moved to a dict named nms as iou_threshold, max_num has been renamed as max_per_img, name of original arguments and the way to specify iou_threshold of NMS will be deprecated. 'In rpn_proposal or test_cfg, ' Traceback (most recent call last): File "tools/train.py", line 168, in main() File "tools/train.py", line 164, in main meta=meta) File "/home/lin/Desktop/pcan/pcan/apis/train.py", line 123, in train_model runner.run(data_loaders, cfg.workflow, cfg.total_epochs) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run epoch_runner(data_loaders[i], **kwargs) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train self.run_iter(data_batch, train_mode=True) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter **kwargs) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step return self.module.train_step(*inputs[0], **kwargs[0]) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 247, in train_step losses = self(**data) File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/lin/Desktop/pcan/pcan/models/mot/quasi_dense_pcan.py", line 45, in forward return self.forward_train(img, img_metas, **kwargs) File "/home/lin/Desktop/pcan/pcan/models/mot/quasi_dense_pcan.py", line 86, in forward_train ref_gt_bboxes_ignore, ref_gt_masks, **kwargs) File "/home/lin/Desktop/pcan/pcan/models/roi_heads/quasi_dense_roi_head.py", line 173, in forward_train if mask_results['loss_dice'] is not None: KeyError: 'loss_dice'

    opened by fxooooooooo 3
  • Will a demo script for testing be added later?

    I tried to plug video_demo.py from the mmdetection demos into your network, and it shows the following problem: Traceback (most recent call last): File "demo/video_demo.py", line 60, in main() File "demo/video_demo.py", line 35, in main model = init_detector(args.config, args.checkpoint, device=args.device) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/apis/inference.py", line 39, in init_detector model = build_detector(config.model, test_cfg=config.get('test_cfg')) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 77, in build_detector return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg)) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 34, in build return build_from_cfg(cfg, registry, default_args) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/utils/registry.py", line 172, in build_from_cfg f'{obj_type} is not in the {registry.name} registry') KeyError: 'QuasiDenseMaskRCNN is not in the detector registry' (pcan) zhy@king:~/pcan$ python test_video.py demo/demo.mp4 configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py checkpoints/pcan_pretrained_model.pth --show configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py Traceback (most recent call last): File "test_video.py", line 61, in main() File "test_video.py", line 36, in main model = init_detector(args.config, args.checkpoint, device=args.device) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/apis/inference.py", line 39, in init_detector model = build_detector(config.model, test_cfg=config.get('test_cfg')) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 77, in build_detector return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg)) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 34, in build return build_from_cfg(cfg, registry, default_args) File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/utils/registry.py", line 172, in build_from_cfg f'{obj_type} is not in the {registry.name} registry') KeyError: 'QuasiDenseMaskRCNN is not in the detector registry' Searching related issues, the problem seems to lie in setup.py. I hope you can clear up my confusion, thanks. (A hedged registry-import sketch follows this comment.)

    opened by heuyu980817 3
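
    A note on the registry error above: mmdet's init_detector only knows about detectors that have been registered by imported modules, and the mmdetection demo scripts never import this repository's models. Below is a hedged sketch of a workaround, assuming the package has been installed in the environment (for example with pip install -v -e ., as in INSTALL.md) and that importing it registers the custom detectors; the import line is an assumption about where that registration happens.

```python
# Hedged sketch: register this repository's custom detectors before building the model.
from mmdet.apis import init_detector

import pcan  # noqa: F401  -- assumption: importing the package registers its custom detectors

config = 'configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py'
checkpoint = 'checkpoints/pcan_pretrained_model.pth'
model = init_detector(config, checkpoint, device='cuda:0')
```

    The repository's own tools/test.py presumably works because it imports the package's models itself before building the detector.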
  • The time it takes to train the model?

    In your paper, I found "Our model is trained with initial learning rate 0.0025 on 4 GPUs using SGD, and executes with a speed of 15.0 FPS on ResNet-50." Could you tell me the type of GPU and how long it takes to train the PCAN model on the BDD100K segmentation-tracking dataset, both when using the initial model weights from the BDD100K MOT tracking set and when not using them? Thanks.

    opened by f414158949 2
  • Training on a custom dataset

    What do I need to do if I want to train on a custom dataset? I use CVAT to annotate in MOTS format, and I don't know what format the annotation file should be exported in. (A hedged annotation-layout sketch follows this comment.)

    opened by monster519 0
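
    On the custom-dataset question above: the BDD configs in this repository read a COCO-style video annotation file (the *_cocoformat.json files referenced in the configs), so annotations exported from CVAT would need to be converted into that layout. The sketch below shows the general shape such a file has under the coco-vid convention used by BDD100K-style codebases; the exact field names and required keys are assumptions to verify against a real seg_track_*_cocoformat.json.

```python
# Hedged sketch of a COCO-style video ("cocoformat") annotation layout.
# Field names follow the coco-vid convention and are assumptions to verify.
annotations_json = dict(
    categories=[dict(id=1, name='pedestrian'), dict(id=2, name='car')],  # example classes
    videos=[dict(id=1, name='my_video_001')],                            # hypothetical video name
    images=[dict(
        id=1, video_id=1, frame_id=0,
        file_name='my_video_001/frame_0000001.jpg',                      # hypothetical path
        width=1280, height=720)],
    annotations=[dict(
        id=1, image_id=1, category_id=2,
        instance_id=7,                      # links the same object across frames
        bbox=[100.0, 200.0, 50.0, 40.0], area=2000.0,
        segmentation=dict(size=[720, 1280], counts='...'),  # RLE mask; '...' is a placeholder
        iscrowd=0)],
)
```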
  • Where are the locations of prototype cross-attention in training?

    Hi, thank you for your great work. I'm now training the model, but I found that no prototypes are updated during training; they only come into play at test time. Does that mean the cross-attention only appears in training?

    opened by jkd2021 0
  • Confusion on the ablation study in the paper

    Hi, I really appreciate your great work on PCAN. I'm now reading the ablation-study part of the paper, and I feel really confused about Table 3 and Table 4:

    My questions are: 1. Does the "varying temporal memory length" in Table 3 refer to the "memo_tracklet_frames" setting in the tracker, or something else?

    2. Which part of the code does the "multi-layer prototypical feature fusion" in Table 4 correspond to? Does it refer to the "memo_banks" shown, or something else?

    I'd definitely appreciate it if somebody could help!

    opened by jkd2021 0
Owner
ETH VIS Group
Visual Intelligence and Systems Group at ETH Zürich