OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), and Video Instance Segmentation (VIS) within a unified framework.

Overview


English | 简体中文

Documentation: https://mmtracking.readthedocs.io/

Introduction

MMTracking is an open source video perception toolbox based on PyTorch. It is a part of the OpenMMLab project.

The master branch works with PyTorch 1.3+.

Major features

  • The First Unified Video Perception Platform

    We are the first open source toolbox that unifies versatile video perception tasks, including video object detection, multiple object tracking, single object tracking and video instance segmentation.

  • Modular Design

    We decompose the video perception framework into different components, so that one can easily construct a customized method by combining different modules.

  • Simple, Fast and Strong

    Simple: MMTracking interoperates with other OpenMMLab projects. It is built upon MMDetection, so any detector can be used simply by modifying the configs (see the config sketch at the end of this list).

    Fast: All operations run on GPUs. The training and inference speeds are faster than or comparable to other implementations.

    Strong: We reproduce state-of-the-art models and some of them even outperform the official implementations.
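
    As a rough illustration of the config-driven design (a minimal sketch only; the field values are borrowed from the SELSA example quoted in the issues below and are not a canonical config), swapping the detector amounts to editing the nested detector dict:

        # Hypothetical excerpt of an MMTracking config: the video perception method
        # wraps an MMDetection detector, so changing the detector is a config edit,
        # not a code change.
        model = dict(
            type='SELSA',                    # video perception method
            detector=dict(                   # any MMDetection-style detector config
                type='FasterRCNN',
                backbone=dict(type='ResNet', depth=50, num_stages=4, out_indices=(3, )),
                neck=dict(
                    type='ChannelMapper', in_channels=[2048], out_channels=512,
                    kernel_size=3),
                # rpn_head, roi_head, train_cfg and test_cfg follow the usual
                # MMDetection layout.
            ))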

License

This project is released under the Apache 2.0 license.

Changelog

v0.8.0 was released on 03/10/2021. Please refer to changelog.md for details and release history.

Benchmark and model zoo

Results and models are available in the model zoo.

Supported methods of video object detection:

Supported methods of multiple object tracking:

Supported methods of single object tracking:

Supported methods of video instance segmentation:

Installation

Please refer to install.md for install instructions.

Getting Started

Please see dataset.md and quick_run.md for the basic usage of MMTracking. We also provide usage tutorials, such as learning about configs, detailed descriptions of the VID, MOT and SOT configs, customizing datasets, customizing data pipelines, customizing VID, MOT and SOT models, customizing runtime settings, and useful tools.
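
A minimal, hedged sketch of programmatic inference (the API names follow the demo scripts shipped with MMTracking 0.x; the config path, checkpoint and video file are placeholders taken from or modeled on examples elsewhere in this page):

    # Run a MOT model frame by frame on a video with the high-level Python API.
    import mmcv
    from mmtrack.apis import inference_mot, init_model

    config_file = 'configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_8e_mot20-public-half.py'
    # Pass a checkpoint path instead of None to load trained weights.
    model = init_model(config_file, checkpoint=None, device='cuda:0')

    video = mmcv.VideoReader('demo/demo.mp4')
    for frame_id, img in enumerate(video):
        result = inference_mot(model, img, frame_id=frame_id)
        # `result` holds per-frame detection and track boxes; the exact keys
        # depend on the MMTracking version.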

Contributing

We appreciate all contributions to improve MMTracking. Please refer to CONTRIBUTING.md for the contributing guideline.

Acknowledgement

MMTracking is an open source project that welcomes any contribution and feedback. We hope the toolbox and benchmark can serve the growing research community by providing a flexible and standardized toolkit to reimplement existing methods and develop new video perception methods.

Citation

If you find this project useful in your research, please consider citing:

@misc{mmtrack2020,
    title={{MMTracking: OpenMMLab} video perception toolbox and benchmark},
    author={MMTracking Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmtracking}},
    year={2020}
}

Projects in OpenMMLab

  • MMCV: OpenMMLab foundational library for computer vision.
  • MIM: MIM Installs OpenMMLab Packages.
  • MMClassification: OpenMMLab image classification toolbox and benchmark.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMEditing: OpenMMLab image and video editing toolbox.
  • MMOCR: OpenMMLab text detection, recognition and understanding toolbox.
  • MMGeneration: OpenMMLab Generative Model toolbox and benchmark.
  • MMFlow: OpenMMLab optical flow toolbox and benchmark.
Comments
  • Parameters and variables setting in DFF model

    Parameters and variables setting in DFF model

    In "/mmtracking/mmtrack/models/motion/flownet_simple.py," the init parameters "flow_img_norm_std=[255.0, 255.0, 255.0]" and "flow_img_norm_mean=[0.411, 0.432, 0.450]" . What's the meaning of these parameters? I'm using a type of data with 10 channels, how should I set these parameters?

    Also, in "prepare_imgs" method, "img_metas[0]['img_norm_cfg']['mean']" and "img_metas[0]['img_norm_cfg']['std']" are both initialized with 0.Is it necessary to reassign the value while training or testing? If necessary, how and what value should I assign to these variables?

    opened by yan811 13
  • training MOT dataset

    training MOT dataset

    Hello everyone, and thank you for your answers in advance. I am new here, so forgive me if I can't explain myself well. I am trying to train on the MOT dataset, but I have a problem with PyTorch: torch.distributed.launch is giving me an error. I need to change to torchrun (transitioning from torch.distributed.launch to torchrun), but I couldn't modify the train.py script. Can you please help me with that? Thanks again.
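
    A hedged sketch of the change that is usually needed (torchrun passes the local rank through the LOCAL_RANK environment variable instead of the --local_rank argument that torch.distributed.launch injects; the variable names below are illustrative, not the exact train.py code):

        import argparse
        import os

        parser = argparse.ArgumentParser(description='Train a model')
        # Kept for backward compatibility with torch.distributed.launch.
        parser.add_argument('--local_rank', type=int, default=0)
        args = parser.parse_args()

        # torchrun exports LOCAL_RANK; fall back to the CLI argument if it is absent.
        if 'LOCAL_RANK' not in os.environ:
            os.environ['LOCAL_RANK'] = str(args.local_rank)
        local_rank = int(os.environ['LOCAL_RANK'])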

    opened by mehmetcanmitil 9
  • When training a VID model with multiple GPUs, validation hangs on the last few images (no update even after waiting an hour).

    When training a VID model with multiple GPUs, validation hangs on the last few images (no update even after waiting an hour).

    My log is as follows:

    2021-12-26 16:04:19,060 - mmtrack - INFO - Environment info:

    sys.platform: linux Python: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0] CUDA available: True GPU 0,1,2: GeForce RTX 2080 Ti CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 10.0, V10.0.130 GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 PyTorch: 1.5.0 PyTorch compiling details: PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.1
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
    • CuDNN 7.6.3
    • Magma 2.5.2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

    TorchVision: 0.6.0a0+82fd1c8 OpenCV: 4.5.4 MMCV: 1.4.1 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMTracking: 0.8.0+

    2021-12-26 16:04:19,061 - mmtrack - INFO - Distributed training: True 2021-12-26 16:04:19,761 - mmtrack - INFO - Config: model = dict( detector=dict( type='FasterRCNN', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(3, ), strides=(1, 2, 2, 1), dilations=(1, 1, 1, 2), frozen_stages=1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), neck=dict( type='ChannelMapper', in_channels=[2048], out_channels=512, kernel_size=3), rpn_head=dict( type='RPNHead', in_channels=512, feat_channels=512, anchor_generator=dict( type='AnchorGenerator', scales=[4, 8, 16, 32], ratios=[0.5, 1.0, 2.0], strides=[16]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[1.0, 1.0, 1.0, 1.0]), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), loss_bbox=dict( type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)), roi_head=dict( type='SelsaRoIHead', bbox_roi_extractor=dict( type='TemporalRoIAlign', roi_layer=dict( type='RoIAlign', output_size=7, sampling_ratio=2), out_channels=512, featmap_strides=[16], num_most_similar_points=2, num_temporal_attention_blocks=4), bbox_head=dict( type='SelsaBBoxHead', in_channels=512, fc_out_channels=1024, roi_feat_size=7, num_classes=30, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.2, 0.2, 0.2, 0.2]), reg_class_agnostic=False, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0), num_shared_fcs=3, aggregator=dict( type='SelsaAggregator', in_channels=1024, num_attention_blocks=16))), train_cfg=dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=0, pos_weight=-1, debug=False), rpn_proposal=dict( nms_pre=6000, max_per_img=600, nms=dict(type='nms', iou_threshold=0.7), min_bbox_size=0), rcnn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False)), test_cfg=dict( rpn=dict( nms_pre=6000, max_per_img=300, nms=dict(type='nms', iou_threshold=0.7), min_bbox_size=0), rcnn=dict( score_thr=0.0001, nms=dict(type='nms', iou_threshold=0.5), max_per_img=100))), type='SELSA') dataset_type = 'ImagenetVIDDataset' data_root = 'data/FALD_VID/' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict(type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ] test_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.0), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], 
to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img'], meta_keys=('num_left_ref_imgs', 'frame_stride')), dict(type='ConcatVideoReferences'), dict(type='MultiImagesToTensor', ref_prefix='ref'), dict(type='ToList') ] data = dict( samples_per_gpu=1, workers_per_gpu=2, train=dict( type='ImagenetVIDDataset', ann_file= 'data/FALD_VID/COCOVIDannotations/imagenet_vid_train_every10frames.json', img_prefix='data/FALD_VID/Data/VID', ref_img_sampler=dict( num_ref_imgs=2, frame_range=9, filter_key_img=False, method='bilateral_uniform'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict(type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ]), val=dict( type='ImagenetVIDDataset', ann_file='data/FALD_VID/annotations/imagenet_vid_val.json', img_prefix='data/FALD_VID/Data/VID', ref_img_sampler=dict( num_ref_imgs=14, frame_range=[-7, 7], method='test_with_adaptive_stride'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.0), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img'], meta_keys=('num_left_ref_imgs', 'frame_stride')), dict(type='ConcatVideoReferences'), dict(type='MultiImagesToTensor', ref_prefix='ref'), dict(type='ToList') ], test_mode=True), test=dict( type='ImagenetVIDDataset', ann_file='data/FALD_VID/annotations/imagenet_vid_val.json', img_prefix='data/FALD_VID/Data/VID', ref_img_sampler=dict( num_ref_imgs=14, frame_range=[-7, 7], method='test_with_adaptive_stride'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.0), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img'], meta_keys=('num_left_ref_imgs', 'frame_stride')), dict(type='ConcatVideoReferences'), dict(type='MultiImagesToTensor', ref_prefix='ref'), dict(type='ToList') ], test_mode=True)) optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)) checkpoint_config = dict(interval=1) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) dist_params = dict(backend='nccl') log_level = 'INFO' load_from = None resume_from = None workflow = [('train', 1)] lr_config = dict( policy='step', warmup='linear', warmup_iters=500, warmup_ratio=0.3333333333333333, step=[2, 5]) total_epochs = 4 evaluation = dict(metric=['bbox'], interval=4) work_dir = './work_dirs/20211226_001_try3/' gpu_ids = range(0, 1)

    2021-12-26 16:04:24,438 - mmtrack - INFO - Set random seed to 2034425034, deterministic: False 2021-12-26 16:04:25,201 - mmtrack - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2021-12-26 16:04:25,466 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,467 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,468 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,470 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,471 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,472 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,473 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,475 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,477 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,479 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,481 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,482 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,484 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,490 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,496 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,500 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,523 - mmtrack - INFO - initialize ChannelMapper with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'} 2021-12-26 16:04:25,583 - mmtrack - INFO - initialize RPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01} 2021-12-26 16:04:25,637 - mmtrack - INFO - initialize SelsaBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'distribution': 'uniform', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}] Name of parameter - Initialization information

    detector.backbone.conv1.weight - torch.Size([64, 3, 7, 7]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.conv1.weight - torch.Size([64, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.conv2.weight - torch.Size([64, 64, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.bn2.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.bn2.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.conv3.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.bn3.weight - torch.Size([256]): ConstantInit: val=0, bias=0

    detector.backbone.layer1.0.bn3.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.downsample.0.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.downsample.1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.downsample.1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.conv1.weight - torch.Size([64, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.1.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.conv2.weight - torch.Size([64, 64, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.1.bn2.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.bn2.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.conv3.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.1.bn3.weight - torch.Size([256]): ConstantInit: val=0, bias=0

    detector.backbone.layer1.1.bn3.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.conv1.weight - torch.Size([64, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.2.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.conv2.weight - torch.Size([64, 64, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.2.bn2.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.bn2.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.conv3.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.2.bn3.weight - torch.Size([256]): ConstantInit: val=0, bias=0

    detector.backbone.layer1.2.bn3.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.conv1.weight - torch.Size([128, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.0.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.downsample.0.weight - torch.Size([512, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.downsample.1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.downsample.1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.conv1.weight - torch.Size([128, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.1.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.1.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.1.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.1.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.conv1.weight - torch.Size([128, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.2.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.2.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.2.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.2.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.conv1.weight - torch.Size([128, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.3.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.3.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.3.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.3.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.conv1.weight - torch.Size([256, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.0.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.downsample.0.weight - torch.Size([1024, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.downsample.1.weight - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.downsample.1.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.1.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.1.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.1.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.1.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.2.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.2.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.2.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.2.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.3.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.3.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.3.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.3.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.4.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.4.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.4.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.4.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.5.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.5.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.5.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.5.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.conv1.weight - torch.Size([512, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.bn1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.bn1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.conv2.weight - torch.Size([512, 512, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.bn2.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.bn2.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.conv3.weight - torch.Size([2048, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.bn3.weight - torch.Size([2048]): ConstantInit: val=0, bias=0

    detector.backbone.layer4.0.bn3.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.downsample.0.weight - torch.Size([2048, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.downsample.1.weight - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.downsample.1.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.conv1.weight - torch.Size([512, 2048, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.1.bn1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.bn1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.conv2.weight - torch.Size([512, 512, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.1.bn2.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.bn2.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.conv3.weight - torch.Size([2048, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.1.bn3.weight - torch.Size([2048]): ConstantInit: val=0, bias=0

    detector.backbone.layer4.1.bn3.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.conv1.weight - torch.Size([512, 2048, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.2.bn1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.bn1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.conv2.weight - torch.Size([512, 512, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.2.bn2.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.bn2.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.conv3.weight - torch.Size([2048, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.2.bn3.weight - torch.Size([2048]): ConstantInit: val=0, bias=0

    detector.backbone.layer4.2.bn3.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.neck.convs.0.conv.weight - torch.Size([512, 2048, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.neck.convs.0.conv.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.rpn_head.rpn_conv.weight - torch.Size([512, 512, 3, 3]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_conv.bias - torch.Size([512]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_cls.weight - torch.Size([12, 512, 1, 1]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_cls.bias - torch.Size([12]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_reg.weight - torch.Size([48, 512, 1, 1]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_reg.bias - torch.Size([48]): NormalInit: mean=0, std=0.01, bias=0

    detector.roi_head.bbox_roi_extractor.embed_network.conv.weight - torch.Size([512, 512, 3, 3]): Initialized by user-defined init_weights in ConvModule

    detector.roi_head.bbox_roi_extractor.embed_network.conv.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.fc_cls.weight - torch.Size([31, 1024]): NormalInit: mean=0, std=0.01, bias=0

    detector.roi_head.bbox_head.fc_cls.bias - torch.Size([31]): NormalInit: mean=0, std=0.01, bias=0

    detector.roi_head.bbox_head.fc_reg.weight - torch.Size([120, 1024]): NormalInit: mean=0, std=0.001, bias=0

    detector.roi_head.bbox_head.fc_reg.bias - torch.Size([120]): NormalInit: mean=0, std=0.001, bias=0

    detector.roi_head.bbox_head.shared_fcs.0.weight - torch.Size([1024, 25088]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.0.bias - torch.Size([1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.1.weight - torch.Size([1024, 1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.1.bias - torch.Size([1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.2.weight - torch.Size([1024, 1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.2.bias - torch.Size([1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.aggregator.0.fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA
    2021-12-26 16:04:28,460 - mmtrack - INFO - Start running, host: [email protected], work_dir: /data/yangjiahui/VIDProject/mmtracking/work_dirs/20211226_001_try3 2021-12-26 16:04:28,461 - mmtrack - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) CheckpointHook
    (NORMAL ) DistEvalHook
    (VERY_LOW ) TextLoggerHook

    before_train_epoch: (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) DistSamplerSeedHook
    (NORMAL ) DistEvalHook
    (LOW ) IterTimerHook
    (VERY_LOW ) TextLoggerHook

    before_train_iter: (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) DistEvalHook
    (LOW ) IterTimerHook

    after_train_iter: (ABOVE_NORMAL) OptimizerHook
    (NORMAL ) CheckpointHook
    (NORMAL ) DistEvalHook
    (LOW ) IterTimerHook
    (VERY_LOW ) TextLoggerHook

    after_train_epoch: (NORMAL ) CheckpointHook
    (NORMAL ) DistEvalHook
    (VERY_LOW ) TextLoggerHook

    before_val_epoch: (NORMAL ) DistSamplerSeedHook
    (LOW ) IterTimerHook
    (VERY_LOW ) TextLoggerHook

    before_val_iter: (LOW ) IterTimerHook

    after_val_iter: (LOW ) IterTimerHook

    after_val_epoch: (VERY_LOW ) TextLoggerHook

    after_run: (VERY_LOW ) TextLoggerHook

    2021-12-26 16:04:28,461 - mmtrack - INFO - workflow: [('train', 1)], max: 4 epochs 2021-12-26 16:04:28,461 - mmtrack - INFO - Checkpoints will be saved to /data/yangjiahui/VIDProject/mmtracking/work_dirs/20211226_001_try3 by HardDiskBackend. 2021-12-26 16:05:00,501 - mmtrack - INFO - Saving checkpoint at 1 epochs 2021-12-26 16:05:32,658 - mmtrack - INFO - Saving checkpoint at 2 epochs 2021-12-26 16:06:04,769 - mmtrack - INFO - Saving checkpoint at 3 epochs 2021-12-26 16:06:37,068 - mmtrack - INFO - Saving checkpoint at 4 epochs

    opened by FarranYang 9
  • Many errors when training ReID of Tractor model on MOT17.

    Many errors when training ReID of Tractor model on MOT17.

    I ran successfully the official tractor repo, but I cannot run this repo. Same command (default training command of Reid model), but different errors on different days.

    [error screenshots attached]

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug A clear and concise description of what the bug is.

    Reproduction

    1. What command or script did you run?
    A placeholder for the command.
    
    2. Did you make any modifications to the code or config? Did you understand what you have modified?
    3. What dataset did you use and what task did you run?

    Environment

    1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here.
    2. You may add additional information that may be helpful for locating the problem, such as
      • How you installed PyTorch [e.g., pip, conda, source]
      • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

    Error traceback If applicable, paste the error traceback here.

    A placeholder for traceback.
    

    Bug fix If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

    opened by sjtuytc 9
  • Person ID changes when viewed from a different camera using Tracktor

    Person ID changes when viewed from a different camera using Tracktor

    I was testing the Tracktor model on some videos and found that when the camera changes, the ID assigned to the tracked object also changes. For example, in the last few seconds of the inference video below (obtained by running the Tracktor code), the runners are assigned new IDs. What could be the possible reason? Is it because the same person is being viewed from a different camera angle, or could it be that I need to retrain the re-id model? The dataset used was MOT20 and the configuration was tracktor_faster-rcnn_r50_fpn_8e_mot20-public-half.

    Input video https://drive.google.com/file/d/1IVxcL3a5jUH3huJuyVzgDepIpBE62H3F/view?usp=sharing inference video https://drive.google.com/file/d/1Rcl3nrdTQznyPO4GQLm7_juYzsSZtLK4/view?usp=sharing

    opened by sparshgarg23 8
  • ReID training

    ReID training

    Thanks for your error report and we appreciate it a lot.

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug A clear and concise description of what the bug is.

    Reproduction

    1. What command or script did you run?
    python3 ./tools/train.py configs/reid/resnet50_b32x8_MOT17.py --work-dir work_dirs/resnet50_b32x8_MOT17
    
    2. I did not make any modifications to the code except the dataset path
    3. I'm running ReID training on the MOT dataset

    Environment

    1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here. sys.platform: linux Python: 3.8.11 (default, Jul 3 2021, 17:53:42) [GCC 7.5.0] CUDA available: True GPU 0: TITAN Xp CUDA_HOME: None GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 PyTorch: 1.7.1+cu101 PyTorch compiling details: PyTorch built with:
    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.1
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
    • CuDNN 7.6.3
    • Magma 2.5.2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

    TorchVision: 0.8.2+cu101 OpenCV: 4.5.3 MMCV: 1.3.11 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMTracking: 0.6.0+4d78b77

    2. You may add additional information that may be helpful for locating the problem, such as
      • How you installed PyTorch [e.g., pip, conda, source]
      • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

    Error traceback If applicable, paste the error traceback here.

    sys.platform: linux
    Python: 3.8.11 (default, Jul  3 2021, 17:53:42) [GCC 7.5.0]
    CUDA available: True
    GPU 0: TITAN Xp
    CUDA_HOME: None
    GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
    PyTorch: 1.7.1+cu101
    PyTorch compiling details: PyTorch built with:
      - GCC 7.3
      - C++ Version: 201402
      - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 10.1
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
      - CuDNN 7.6.3
      - Magma 2.5.2
      - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 
    
    TorchVision: 0.8.2+cu101
    OpenCV: 4.5.3
    MMCV: 1.3.11
    MMCV Compiler: GCC 7.3
    MMCV CUDA Compiler: 10.1
    MMTracking: 0.6.0+4d78b77
    ------------------------------------------------------------
    
    2021-08-17 11:24:25,348 - mmtrack - INFO - Distributed training: False
    2021-08-17 11:24:26,303 - mmtrack - INFO - Config:
    dataset_type = 'ReIDDataset'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadMultiImagesFromFile', to_float32=True),
        dict(
            type='SeqResize',
            img_scale=(128, 256),
            share_params=False,
            keep_ratio=False,
            bbox_clip_border=False,
            override=False),
        dict(
            type='SeqRandomFlip',
            share_params=False,
            flip_ratio=0.5,
            direction='horizontal'),
        dict(
            type='SeqNormalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='VideoCollect', keys=['img', 'gt_label']),
        dict(type='ReIDFormatBundle')
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='Resize', img_scale=(128, 256), keep_ratio=False),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='ImageToTensor', keys=['img']),
        dict(type='Collect', keys=['img'], meta_keys=[])
    ]
    data_root = '/projects/datasets/MOT/MOT17/'
    data = dict(
        samples_per_gpu=2,
        workers_per_gpu=2,
        train=dict(
            type='ReIDDataset',
            triplet_sampler=dict(num_ids=8, ins_per_id=4),
            data_prefix='/projects/datasets/MOT/MOT17/reid/imgs',
            ann_file='/projects/datasets/MOT/MOT17/reid/meta/train_80.txt',
            pipeline=[
                dict(type='LoadMultiImagesFromFile', to_float32=True),
                dict(
                    type='SeqResize',
                    img_scale=(128, 256),
                    share_params=False,
                    keep_ratio=False,
                    bbox_clip_border=False,
                    override=False),
                dict(
                    type='SeqRandomFlip',
                    share_params=False,
                    flip_ratio=0.5,
                    direction='horizontal'),
                dict(
                    type='SeqNormalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='VideoCollect', keys=['img', 'gt_label']),
                dict(type='ReIDFormatBundle')
            ]),
        val=dict(
            type='ReIDDataset',
            triplet_sampler=None,
            data_prefix='/projects/datasets/MOT/MOT17/reid/imgs',
            ann_file='/projects/datasets/MOT/MOT17/reid/meta/val_20.txt',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='Resize', img_scale=(128, 256), keep_ratio=False),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'], meta_keys=[])
            ]),
        test=dict(
            type='ReIDDataset',
            triplet_sampler=None,
            data_prefix='/projects/datasets/MOT/MOT17/reid/imgs',
            ann_file='/projects/datasets/MOT/MOT17/reid/meta/val_20.txt',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='Resize', img_scale=(128, 256), keep_ratio=False),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'], meta_keys=[])
            ]))
    evaluation = dict(interval=1, metric='mAP')
    optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(grad_clip=None)
    checkpoint_config = dict(interval=1)
    log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    USE_MMCLS = True
    model = dict(
        type='BaseReID',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(3, ),
            style='pytorch'),
        neck=dict(type='GlobalAveragePooling', kernel_size=(8, 4), stride=1),
        head=dict(
            type='LinearReIDHead',
            num_fcs=1,
            in_channels=2048,
            fc_channels=1024,
            out_channels=128,
            num_classes=378,
            loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
            loss_pairwise=dict(type='TripletLoss', margin=0.3, loss_weight=1.0),
            norm_cfg=dict(type='BN1d'),
            act_cfg=dict(type='ReLU')),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=
            'https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth'
        ))
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=1000,
        warmup_ratio=0.001,
        step=[5])
    total_epochs = 6
    work_dir = 'work_dirs/resnet50_b32x8_MOT17'
    gpu_ids = range(0, 1)
    
    2021-08-17 11:24:26,638 - mmtrack - INFO - initialize BaseReID with init_cfg {'type': 'Pretrained', 'checkpoint': 'https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth'}
    2021-08-17 11:24:26,638 - mmcv - INFO - load model from: https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth
    2021-08-17 11:24:26,638 - mmcv - INFO - Use load_from_http loader
    2021-08-17 11:24:26,844 - mmcv - WARNING - The model and loaded state dict do not match exactly
    
    unexpected key in source state_dict: head.fc.weight, head.fc.bias
    
    missing keys in source state_dict: head.fcs.0.fc.weight, head.fcs.0.fc.bias, head.fcs.0.bn.weight, head.fcs.0.bn.bias, head.fcs.0.bn.running_mean, head.fcs.0.bn.running_var, head.fc_out.weight, head.fc_out.bias, head.bn.weight, head.bn.bias, head.bn.running_mean, head.bn.running_var, head.classifier.weight, head.classifier.bias
    
    2021-08-17 11:24:33,803 - mmtrack - INFO - Start running, host: qljx17@gpu3, work_dir: /home2/qljx17/Open-MMLab/mmtracking/work_dirs/resnet50_b32x8_MOT17
    2021-08-17 11:24:33,803 - mmtrack - INFO - Hooks will be executed in the following order:
    before_run:
    (VERY_HIGH   ) StepLrUpdaterHook                  
    (NORMAL      ) CheckpointHook                     
    (NORMAL      ) EvalHook                           
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_train_epoch:
    (VERY_HIGH   ) StepLrUpdaterHook                  
    (NORMAL      ) EvalHook                           
    (LOW         ) IterTimerHook                      
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_train_iter:
    (VERY_HIGH   ) StepLrUpdaterHook                  
    (NORMAL      ) EvalHook                           
    (LOW         ) IterTimerHook                      
     -------------------- 
    after_train_iter:
    (ABOVE_NORMAL) OptimizerHook                      
    (NORMAL      ) CheckpointHook                     
    (NORMAL      ) EvalHook                           
    (LOW         ) IterTimerHook                      
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    after_train_epoch:
    (NORMAL      ) CheckpointHook                     
    (NORMAL      ) EvalHook                           
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_val_epoch:
    (LOW         ) IterTimerHook                      
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_val_iter:
    (LOW         ) IterTimerHook                      
     -------------------- 
    after_val_iter:
    (LOW         ) IterTimerHook                      
     -------------------- 
    after_val_epoch:
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    2021-08-17 11:24:33,803 - mmtrack - INFO - workflow: [('train', 1)], max: 6 epochs
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [44,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [45,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [46,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [47,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    Traceback (most recent call last):
      File "./tools/train.py", line 174, in <module>
        main()
      File "./tools/train.py", line 163, in main
        train_model(
      File "/home2/qljx17/Open-MMLab/mmtracking/mmtrack/apis/train.py", line 136, in train_model
        runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
        self.run_iter(data_batch, train_mode=True, **kwargs)
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
        outputs = self.model.train_step(data_batch, self.optimizer,
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
        return self.module.train_step(*inputs[0], **kwargs[0])
      File "/home2/qljx17/Open-MMLab/mmclassification/mmcls/models/classifiers/base.py", line 146, in train_step
        loss, log_vars = self._parse_losses(losses)
      File "/home2/qljx17/Open-MMLab/mmclassification/mmcls/models/classifiers/base.py", line 97, in _parse_losses
        log_vars[loss_name] = loss_value.mean()
    RuntimeError: CUDA error: device-side assert triggered
    terminate called after throwing an instance of 'c10::Error'
      what():  CUDA error: device-side assert triggered
    Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
    frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fc1479138b2 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libc10.so)
    frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7fc147b65952 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
    frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7fc1478feb7d in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libc10.so)
    frame #3: <unknown function> + 0x5fd7a2 (0x7fc1920fb7a2 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
    frame #4: <unknown function> + 0x5fd856 (0x7fc1920fb856 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
    frame #5: python3() [0x534ce6]
    frame #6: python3() [0x51c5d9]
    frame #7: python3() [0x52cb15]
    frame #8: python3() [0x52cb15]
    frame #9: python3() [0x500a2e]
    frame #10: python3() [0x57d905]
    frame #11: python3() [0x57d8bb]
    frame #12: python3() [0x57d8bb]
    frame #13: python3() [0x57d8bb]
    frame #14: python3() [0x57d8bb]
    frame #15: python3() [0x57d8bb]
    frame #16: python3() [0x57d8bb]
    frame #17: python3() [0x5f25e6]
    <omitting python frames>
    frame #23: __libc_start_main + 0xf3 (0x7fc1a2ef10b3 in /lib/x86_64-linux-gnu/libc.so.6)
    
    /var/spool/slurmd/job128755/slurm_script: line 21: 3941330 Aborted                 (core dumped) python3 ./tools/train.py configs/reid/resnet50_b32x8_MOT17.py --work-dir work_dirs/resnet50_b32x8_MOT17
    

    Bug fix From the error above, I assume the cause is the number of classes. In the default config, num_classes is set to 378, which is taken from train_80.txt, hence the error appears. However, when I set num_classes to 512, which is the number of samples in the imgs folder, I am able to run training without any error. Is there something I missed, or could the number of classes be the main problem here?
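    As an editorial note: the device-side assert above is what CrossEntropyLoss raises on the GPU when a gt_label falls outside [0, num_classes), so a num_classes smaller than the largest identity label in train_80.txt would explain the crash. The snippet below is only an illustrative check (it assumes the ReID meta file stores one "<img_path> <label>" pair per line); num_classes should then be at least max(label) + 1.

    # Illustrative check only: count the distinct identity labels actually present.
    # Assumes one "<img_path> <label>" pair per line in the ReID meta file.
    ann_file = '/projects/datasets/MOT/MOT17/reid/meta/train_80.txt'
    with open(ann_file) as f:
        labels = {int(line.rsplit(maxsplit=1)[1]) for line in f if line.strip()}
    print(f'{len(labels)} distinct ids, max label = {max(labels)}')
    # If max(labels) + 1 exceeds 378, raising model.head.num_classes accordingly
    # (for example via --cfg-options model.head.num_classes=... on recent versions)
    # should avoid the device-side assert.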

    opened by yonafalinie 8
  • What is the difference between load_from and pretrain?

    What is the difference between load_from and pretrain?

    Hello~ Thanks a lot for your awesome work, and I appreciate your effort! However, I have a problem I hope you can help me solve. When I use the default config at configs/det/faster-rcnn_r50_fpn_4e_mot17-half.py to train the Faster R-CNN detector with MMTracking, I get NaN losses. But when I move the downloaded state dict, a Faster R-CNN pretrained on the COCO dataset, from the 'load_from' entry to the 'pretrain' entry of the detector, the NaN losses disappear. I wonder how this happens? What is the difference between 'load_from' and 'pretrain', since neither of them seems to strictly load parameters? Thanks a lot again!

    I checked again and found that the 'pretrain' entry for the detector seems NOT to load the pretrained state dict as I expected; training starts directly from randomly initialized parameters. So how can I use the pretrained Faster R-CNN weights?
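    As an editorial note: load_from restores a checkpoint into the whole composed tracking model, so the state-dict keys must already match the wrapper (for example be prefixed with detector.), whereas the detector's pretrain entry (init_cfg of type Pretrained in newer configs) initializes only the detector sub-module against a plain detector checkpoint such as the COCO one. A minimal config sketch of the two options, with hypothetical checkpoint paths:

    # Option 1: initialize only the detector sub-module from a plain COCO checkpoint.
    model = dict(
        detector=dict(
            init_cfg=dict(
                type='Pretrained',
                checkpoint='path/to/faster_rcnn_coco.pth')))  # hypothetical path

    # Option 2: load a checkpoint whose keys already match the whole tracking model.
    load_from = 'path/to/whole_tracking_model.pth'  # hypothetical path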

    opened by gsygsy96 8
  • Problem met when running MOT demo

    Problem met when running MOT demo

    Hi, I met a problem when running the MOT demo. It fails with "IndexError: tensors used as indices must be long, byte or bool tensors", followed by a ResourceWarning about implicitly cleaning up a temporary directory. Here is my error log.

    Error Log

    Traceback (most recent call last):
      File "demo/demo_mot.py", line 94, in main()
      File "demo/demo_mot.py", line 70, in main result = inference_mot(model, img, frame_id=i)
      File "/cluster/home/it_stu12/main/gjj/mmtracking/mmtrack/apis/inference.py", line 81, in inference_mot data = collate([data], samples_per_gpu=1)
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in collate for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 77, in collate return [collate(samples, samples_per_gpu) for samples in transposed]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 77, in return [collate(samples, samples_per_gpu) for samples in transposed]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in collate for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 80, in key: collate([d[key] for d in batch], samples_per_gpu)
    IndexError: tensors used as indices must be long, byte or bool tensors
    /cluster/home/it_stu12/.conda/envs/gjj/lib/python3.7/tempfile.py:798: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpd1jtqm1n'>
      _warnings.warn(warn_message, ResourceWarning)

    Environment

    No CUDA runtime is found, using CUDA_HOME='/cluster/apps/cuda/10.1'
    sys.platform: linux
    Python: 3.7.10 (default, Jun 4 2021, 14:48:32) [GCC 7.5.0]
    CUDA available: False
    GCC: gcc (GCC) 5.4.0
    PyTorch: 1.6.0
    PyTorch compiling details: PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

    TorchVision: 0.7.0
    OpenCV: 4.1.0
    MMCV: 1.4.2
    MMCV Compiler: GCC 5.4
    MMCV CUDA Compiler: not available
    MMTracking: 0.8.0+603d6fe

    Some Other Problems

    • The documentation of MMTracking 0.8 says the MMCV version should be mmcv-full>=1.3.8, <1.4.0. But when I installed mmcv-full 1.3.9, it told me "mmcv-full is too old, please install mmcv-full >=1.3.16, <=1.5.0". Which one should I believe?
    • The Chinese documentation of MMTracking 0.8 gives the demo command python demo/demo_mot.py configs/mot/deepsort/sort_faster-rcnn_fpn_4e_mot17-private.py --input demo/demo.mp4 --output mot.mp4, but there is no demo_mot.py in the demo folder, only demo_mot_vis.py. Maybe the Chinese doc should be updated?

    Thank you so much!

    opened by AndrewGuo0930 7
  • time estimation log export

    time estimation log export

    I found this library very helpful. Great work. I have to ask: is it possible to keep exporting a log (time + weight of the tracked object) while the video is running, either from a live camera or a recorded video? Please guide me on how to export that log using this library.

    opened by Tortoise17 7
  • pickle file has only det_bboxes

    pickle file has only det_bboxes

    Hello, I have tested on my custom dataset for VID and saved the results to a .pkl file. However, the pickle file seems to contain only the det_bboxes and not the det_labels. Is there any way to add det_labels too? Any tips would be helpful!
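    As an editorial note: in the MMDetection-style result format, det_bboxes is stored as one (N, 5) array per class, so the class label is implicit in the list index and can be reconstructed offline. A minimal sketch, assuming that format and that results.pkl is the file written by tools/test.py --out:

    import pickle

    import numpy as np

    with open('results.pkl', 'rb') as f:   # hypothetical output of tools/test.py --out
        results = pickle.load(f)

    for bbox_result in results['det_bboxes']:   # one per-class list per frame
        labels = np.concatenate([
            np.full(len(b), cls_id, dtype=np.int64)
            for cls_id, b in enumerate(bbox_result)
        ])
        bboxes = np.concatenate(bbox_result, axis=0)
        # bboxes[i] (x1, y1, x2, y2, score) now pairs with labels[i]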

    opened by godwinrayanc 6
  • How to select which classes of detector outputs are fed into the ReID model?

    How to select which classes of detector outputs are fed into the ReID model?

    I have trained a detector in MMDetection with multiple classes. If I want to feed only the "person" class outputs from the detector into the ReID model during inference, can I do that via the config or any other method?

    Also, if I have to feed an MMDetection pretrained model into the tracker, what config changes have to be made?

    Thank you in advance
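    As an editorial note: to my knowledge there is no single config switch for this in MMTracking 0.x, so a common workaround is to filter the detector output by class index before it reaches the ReID/association step, typically inside the MOT model's simple_test between the detector and the tracker. The helper below is a hypothetical sketch (keep_class and person_class_id are illustrative names only):

    import torch

    def keep_class(det_bboxes: torch.Tensor, det_labels: torch.Tensor,
                   person_class_id: int = 0):
        """Keep only detections of one class ((N, 5) bboxes paired with (N,) labels).

        person_class_id is an assumption; use the index of 'person' in your
        detector's CLASSES tuple.
        """
        mask = det_labels == person_class_id
        return det_bboxes[mask], det_labels[mask]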

    opened by Balakumaran-kandula 6
  • I want to train MaskTrack R-CNN, but it raises KeyError: "YouTubeVISDataset: 'image_id'"

    I want to train MaskTrack R-CNN, but it raises KeyError: "YouTubeVISDataset: 'image_id'"

    Hello! I want to train MaskTrack R-CNN on the official YouTube-VIS dataset, but it raises KeyError: "YouTubeVISDataset: 'image_id'". Here is my data tree.

    Traceback (most recent call last):
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
        return obj_cls(**args)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/youtube_vis_dataset.py", line 44, in __init__
        super().__init__(*args, **kwargs)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py", line 46, in __init__
        super().__init__(*args, **kwargs)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmdet/datasets/custom.py", line 97, in __init__
        self.data_infos = self.load_annotations(local_path)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py", line 61, in load_annotations
        data_infos = self.load_video_anns(ann_file)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py", line 73, in load_video_anns
        self.coco = CocoVID(ann_file)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/parsers/coco_video_parser.py", line 22, in __init__
        super(CocoVID, self).__init__(annotation_file=annotation_file)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmdet/datasets/api_wrappers/coco_api.py", line 23, in __init__
        super().__init__(annotation_file=annotation_file)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/pycocotools/coco.py", line 86, in __init__
        self.createIndex()
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/parsers/coco_video_parser.py", line 57, in createIndex
        imgToAnns[ann['image_id']].append(ann)
    KeyError: 'image_id'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "tools/train.py", line 213, in <module>
        main()
      File "tools/train.py", line 188, in main
        datasets = [build_dataset(cfg.data.train)]
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmdet/datasets/builder.py", line 82, in build_dataset
        dataset = build_from_cfg(cfg, DATASETS, default_args)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    KeyError: "YouTubeVISDataset: 'image_id'"
    

    Thank you!
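    As an editorial note on the likely cause: the KeyError comes from CocoVID.createIndex, which expects every annotation entry to carry an image_id (plus video and instance information), whereas the raw official YouTube-VIS json stores annotations per video without an image_id. MMTracking therefore expects a converted, CocoVID-style annotation file (see the dataset preparation docs and the conversion scripts under tools/convert_datasets). The dictionary below is a minimal illustrative sketch of that structure; all values are made up.

    cocovid_ann = dict(
        videos=[dict(id=1, name='003234408d')],
        images=[
            dict(id=1, video_id=1, frame_id=0, file_name='003234408d/00000.jpg',
                 height=720, width=1280),
        ],
        annotations=[
            dict(id=1, image_id=1, video_id=1, instance_id=1, category_id=1,
                 bbox=[100, 150, 40, 80], area=3200, iscrowd=0,
                 segmentation=[[100, 150, 140, 150, 140, 230, 100, 230]]),
        ],
        categories=[dict(id=1, name='person')],
    )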

    opened by eatbreakfast111 2
  • IndexError: boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 9

    IndexError: boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 9

    Thanks for your error report and we appreciate it a lot.

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug I was trying to run the QDTrack model for MOT17 in dev-1.x, but it always fails with this error.

    Reproduction

    1. What command or script did you run?
    srun -p bigdata_s2 --quotatype=auto --gres=gpu:1 python tools/train.py configs/mot/qdtrack/qdtrack_faster-rcnn_r50_fpn_8xb2-4e_mot17halftrain_test-mot17halfval.py
    
    2. Did you make any modifications on the code or config? Did you understand what you have modified? No
    3. What dataset did you use and what task did you run? MOT17

    Environment

    1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here. Got error:
    Traceback (most recent call last):
      File "mmtrack/utils/collect_env.py", line 2, in <module>
        from mmcv.utils import collect_env as collect_base_env
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/__init__.py", line 3, in <module>
        from .arraymisc import *
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/arraymisc/__init__.py", line 2, in <module>
        from .quantization import dequantize, quantize
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/arraymisc/quantization.py", line 2, in <module>
        from typing import Union
      File "/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/utils/typing.py", line 3, in <module>
        from typing import Dict, List, Optional, Tuple, Union
    ImportError: cannot import name 'Dict' from partially initialized module 'typing' (most likely due to a circular import) (/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/utils/typing.py)
    

    However, I can successfully run the SOT model. Python version is 3.8, PyTorch is 1.7.1.

    2. You may add other information that may be helpful for locating the problem, such as
      • How you installed PyTorch [e.g., pip, conda, source]
      • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

    Error traceback If applicable, paste the error traceback here.

    Traceback (most recent call last):
      File "tools/train.py", line 119, in <module>
        main()
      File "tools/train.py", line 115, in main
        runner.train()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1684, in train
        model = self.train_loop.run()  # type: ignore
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 90, in run
        self.run_epoch()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 105, in run_epoch
        for idx, data_batch in enumerate(self.dataloader):
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
        data = self._next_data()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
        return self._process_data(data)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
        data.reraise()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
        raise self.exc_type(msg)
    IndexError: Caught IndexError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
        data = fetcher.fetch(index)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 378, in __getitem__
        data = self.prepare_data(idx)
      File "/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/datasets/base_video_dataset.py", line 387, in prepare_data
        return self.pipeline(final_data_info)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 55, in __call__
        data = t(data)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 12, in __call__
        return self.transform(results)
      File "/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/datasets/transforms/formatting.py", line 237, in transform
        key_anns[key_valid_idx])
    IndexError: boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 9
    

    Bug fix If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated! This might help: in mmtracking/configs/base/datasets/mot_challenge.py, simply commenting out the "TransformBroadcaster" block that wraps mmdet.RandomCrop makes training work, as in the pipeline below.

    # data pipeline
    train_pipeline = [
        dict(
            type='TransformBroadcaster',
            share_random_params=True,
            transforms=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadTrackAnnotations', with_instance_id=True),
                dict(
                    type='mmdet.RandomResize',
                    scale=(1088, 1088),
                    ratio_range=(0.8, 1.2),
                    keep_ratio=True,
                    clip_object_border=False),
                dict(type='mmdet.PhotoMetricDistortion')
            ]),
        # dict(
        #     type='TransformBroadcaster',
        #     share_random_params=False,
        #     transforms=[
        #         dict(
        #             type='mmdet.RandomCrop',
        #             crop_size=(1088, 1088),
        #             bbox_clip_border=False)
        #     ]),
        dict(
            type='TransformBroadcaster',
            share_random_params=True,
            transforms=[
                dict(type='mmdet.RandomFlip', prob=0.5),
            ]),
        dict(type='PackTrackInputs', ref_prefix='ref', num_key_frames=1)
    ]
    
    opened by ouyanglinke 1
  • TypeError: forward_train() missing 4 required positional arguments: 'ref_img', 'ref_img_metas', 'ref_gt_bboxes', and 'ref_gt_labels'

    TypeError: forward_train() missing 4 required positional arguments: 'ref_img', 'ref_img_metas', 'ref_gt_bboxes', and 'ref_gt_labels'

    Hello, I want to train MaskTrack R-CNN with a COCO-format dataset. So I changed the dataset base of masktrack_rcnn_r50_fpn_12e_youtubevis2019.py to '../../_base_/datasets/coco_instance.py' and set num_classes=6.

    By the way, I also changed /home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py to CLASSES = ('aircraft', 'buildings', 'electrical', 'person', 'tree', 'wire') and set load_as_video=False.

    And my env:

    ------------------------------------------------------------
    sys.platform: linux
    Python: 3.8.15 (default, Nov 24 2022, 15:19:38) [GCC 11.2.0]
    CUDA available: True
    GPU 0: NVIDIA GeForce RTX 3090 Ti
    CUDA_HOME: None
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    PyTorch: 1.12.1
    PyTorch compiling details: PyTorch built with:
      - GCC 9.3
      - C++ Version: 201402
      - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - LAPACK is enabled (usually provided by MKL)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 11.3
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
      - CuDNN 8.3.2  (built against CUDA 11.5)
      - Magma 2.5.2
      - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 
    
    TorchVision: 0.13.1
    OpenCV: 4.6.0
    MMCV: 1.7.0
    MMCV Compiler: GCC 9.3
    MMCV CUDA Compiler: 11.3
    MMTracking: 0.14.0+
    

    This my config:

    2022-12-16 10:09:31,585 - mmtrack - INFO - Distributed training: False
    2022-12-16 10:09:32,054 - mmtrack - INFO - Config:
    model = dict(
        detector=dict(
            type='MaskRCNN',
            backbone=dict(
                type='ResNet',
                depth=50,
                num_stages=4,
                out_indices=(0, 1, 2, 3),
                frozen_stages=1,
                norm_cfg=dict(type='BN', requires_grad=True),
                norm_eval=True,
                style='pytorch',
                init_cfg=dict(
                    type='Pretrained', checkpoint='torchvision://resnet50')),
            neck=dict(
                type='FPN',
                in_channels=[256, 512, 1024, 2048],
                out_channels=256,
                num_outs=5),
            rpn_head=dict(
                type='RPNHead',
                in_channels=256,
                feat_channels=256,
                anchor_generator=dict(
                    type='AnchorGenerator',
                    scales=[8],
                    ratios=[0.5, 1.0, 2.0],
                    strides=[4, 8, 16, 32, 64]),
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0.0, 0.0, 0.0, 0.0],
                    target_stds=[1.0, 1.0, 1.0, 1.0]),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
                loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
            roi_head=dict(
                type='StandardRoIHead',
                bbox_roi_extractor=dict(
                    type='SingleRoIExtractor',
                    roi_layer=dict(
                        type='RoIAlign', output_size=7, sampling_ratio=0),
                    out_channels=256,
                    featmap_strides=[4, 8, 16, 32]),
                bbox_head=dict(
                    type='Shared2FCBBoxHead',
                    in_channels=256,
                    fc_out_channels=1024,
                    roi_feat_size=7,
                    num_classes=6,
                    bbox_coder=dict(
                        type='DeltaXYWHBBoxCoder',
                        target_means=[0.0, 0.0, 0.0, 0.0],
                        target_stds=[0.1, 0.1, 0.2, 0.2]),
                    reg_class_agnostic=False,
                    loss_cls=dict(
                        type='CrossEntropyLoss',
                        use_sigmoid=False,
                        loss_weight=1.0),
                    loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
                mask_roi_extractor=dict(
                    type='SingleRoIExtractor',
                    roi_layer=dict(
                        type='RoIAlign', output_size=14, sampling_ratio=0),
                    out_channels=256,
                    featmap_strides=[4, 8, 16, 32]),
                mask_head=dict(
                    type='FCNMaskHead',
                    num_convs=4,
                    in_channels=256,
                    conv_out_channels=256,
                    num_classes=6,
                    loss_mask=dict(
                        type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))),
            train_cfg=dict(
                rpn=dict(
                    assigner=dict(
                        type='MaxIoUAssigner',
                        pos_iou_thr=0.7,
                        neg_iou_thr=0.3,
                        min_pos_iou=0.3,
                        match_low_quality=True,
                        ignore_iof_thr=-1),
                    sampler=dict(
                        type='RandomSampler',
                        num=64,
                        pos_fraction=0.5,
                        neg_pos_ub=-1,
                        add_gt_as_proposals=False),
                    allowed_border=-1,
                    pos_weight=-1,
                    debug=False),
                rpn_proposal=dict(
                    nms_pre=200,
                    max_per_img=200,
                    nms=dict(type='nms', iou_threshold=0.7),
                    min_bbox_size=0),
                rcnn=dict(
                    assigner=dict(
                        type='MaxIoUAssigner',
                        pos_iou_thr=0.5,
                        neg_iou_thr=0.5,
                        min_pos_iou=0.5,
                        match_low_quality=True,
                        ignore_iof_thr=-1),
                    sampler=dict(
                        type='RandomSampler',
                        num=128,
                        pos_fraction=0.25,
                        neg_pos_ub=-1,
                        add_gt_as_proposals=True),
                    mask_size=28,
                    pos_weight=-1,
                    debug=False)),
            test_cfg=dict(
                rpn=dict(
                    nms_pre=200,
                    max_per_img=200,
                    nms=dict(type='nms', iou_threshold=0.7),
                    min_bbox_size=0),
                rcnn=dict(
                    score_thr=0.01,
                    nms=dict(type='nms', iou_threshold=0.5),
                    max_per_img=100,
                    mask_thr_binary=0.5)),
            init_cfg=dict(
                type='Pretrained',
                checkpoint=
                'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth'
            )),
        type='MaskTrackRCNN',
        track_head=dict(
            type='RoITrackHead',
            roi_extractor=dict(
                type='SingleRoIExtractor',
                roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            embed_head=dict(
                type='RoIEmbedHead',
                num_fcs=2,
                roi_feat_size=7,
                in_channels=256,
                fc_out_channels=1024),
            train_cfg=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=True,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=128,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)),
        tracker=dict(
            type='MaskTrackRCNNTracker',
            match_weights=dict(det_score=1.0, iou=2.0, det_label=10.0),
            num_frames_retain=20))
    dataset_type = 'CocoDataset'
    data_root = '/home/music/Downloads/mmtracking-master/data/coco/'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
        dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1333, 800),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=6,
        workers_per_gpu=2,
        train=dict(
            type='CocoDataset',
            ann_file=
            '/home/music/Downloads/mmtracking-master/data/coco/annotations/train.json',
            img_prefix=
            '/home/music/Downloads/mmtracking-master/data/coco/train2023/',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
                dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.5),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(
                    type='Collect',
                    keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
            ]),
        val=dict(
            type='CocoDataset',
            ann_file=
            '/home/music/Downloads/mmtracking-master/data/coco/annotations/val.json',
            img_prefix='/home/music/Downloads/mmtracking-master/data/coco/val2023/',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1333, 800),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(
            type='CocoDataset',
            ann_file=
            '/home/music/Downloads/mmtracking-master/data/coco/annotations/val.json',
            img_prefix='/home/music/Downloads/mmtracking-master/data/coco/val2023/',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1333, 800),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]))
    evaluation = dict(metric=['bbox', 'segm'], classwise=True)
    optimizer = dict(type='SGD', lr=0.00125, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
    checkpoint_config = dict(interval=1)
    log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    opencv_num_threads = 0
    mp_start_method = 'fork'
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.3333333333333333,
        step=[8, 11])
    total_epochs = 12
    work_dir = 'work_dir/masktrack_coco'
    gpu_ids = [0]
    

    Best wish! Thank you!
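    As an editorial note on the likely cause: MaskTrackRCNN.forward_train requires reference-frame inputs (ref_img, ref_img_metas, ref_gt_bboxes, ref_gt_labels), but the plain CocoDataset pipeline above only produces img/gt_* keys, hence the TypeError. Those ref_* inputs are generated by a video-style dataset with a reference-image sampler and the Seq* transforms. The block below is a minimal sketch modelled on the YouTube-VIS configs; the annotation path is hypothetical and the annotations must be in CocoVID format.

    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadMultiImagesFromFile'),
        dict(type='SeqLoadAnnotations', with_bbox=True, with_mask=True, with_track=True),
        dict(type='SeqResize', img_scale=(640, 360), share_params=True, keep_ratio=True),
        dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
        dict(type='SeqNormalize', **img_norm_cfg),
        dict(type='SeqPad', size_divisor=32),
        dict(type='VideoCollect',
             keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_instance_ids']),
        dict(type='SeqDefaultFormatBundle', ref_prefix='ref')
    ]
    data = dict(
        train=dict(
            type='CocoVideoDataset',   # a video dataset, not plain CocoDataset
            ann_file='path/to/cocovid_train.json',   # hypothetical converted annotation file
            ref_img_sampler=dict(
                num_ref_imgs=1, frame_range=100, filter_key_img=True, method='uniform'),
            pipeline=train_pipeline))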

    opened by lijoe123 7
Releases(v1.0.0rc1)
  • v1.0.0rc1(Oct 11, 2022)

    MMTracking 1.0.0rc1 is the second version of MMTracking 1.x, a part of the OpenMMLab 2.0 projects.

    Built upon the new training engine, MMTracking 1.x unifies the interfaces of datasets, models, evaluation, and visualization.

    There are some BC-breaking changes; please check the migration tutorial for more details.

    We also support more methods in MMTracking 1.x, such as StrongSORT for MOT, Mask2Former for VIS, and PrDiMP for SOT.

    Source code(tar.gz)
    Source code(zip)
  • v0.14.0(Sep 19, 2022)

    Highlights

    • Introduce the 1.0.0rc0 version of MMTracking (#725)

    New Features

    • Support OC-SORT method for MOT (#545)

    • Support multi-class tracking in ByteTrack (#548)

    • Support DanceTrack dataset for MOT (#543)

    • Support TAO dataset for QDTrack (#585)

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0rc0(Aug 31, 2022)

    We recommend using MMTracking v1.0.0rc1 instead, since v1.0.0rc0 has some bugs related to the minimum required version of mmdet.

    Source code(tar.gz)
    Source code(zip)
  • v0.13.0(Apr 29, 2022)

    Highlights

    • Support tracking colab tutorial (#511)

    New Features

    • Refactor the training datasets of SiamRPN++ (#496), (#518)

    • Support loading data from ceph for SOT datasets (#494)

    • Support loading data from ceph for MOT challenge dataset (#517)

    • Support evaluation metric for VIS task (#501)

    Bug Fixes

    • Fix a bug in the LaSOT datasets and update the pretrained models of STARK (#483), (#503)

    • Fix a bug in the format_results function of VIS task (#504)

    Source code(tar.gz)
    Source code(zip)
  • v0.12.0(Apr 1, 2022)

  • v0.11.0(Mar 4, 2022)

  • v0.10.0(Feb 10, 2022)

  • v0.9.0(Jan 6, 2022)

    Highlights

    • Support arXiv 2021 manuscript 'ByteTrack: Multi-Object Tracking by Associating Every Detection Box' (#385), (#383), (#372)
    • Support ICCV 2019 paper 'Video Instance Segmentation' (#304), (#303), (#298), (#292)

    New Features

    • Support CrowdHuman dataset for MOT (#366)
    • Support VOT2018 dataset for SOT (#305)
    • Support YouTube-VIS dataset for VIS (#290)

    Bug Fixes

    • Fix two significant bugs in SOT and provide new SOT pretrained models (#349)

    Improvements

    • Refactor LaSOT, TrackingNet dataset and support GOT-10K datasets (#296)
    • Support persistent workers (#348)
    Source code(tar.gz)
    Source code(zip)
  • v0.8.0(Oct 3, 2021)

    New Features

    • Support OTB100 dataset in SOT (#271)
    • Support TrackingNet dataset in SOT (#268)
    • Support UAV123 dataset in SOT (#260)

    Bug Fixes

    • Fix a bug in mot_param_search.py (#270)

    Improvements

    • Use PyTorch sphinx theme (#274)
    • Use pycocotools instead of mmpycocotools (#263)
    Source code(tar.gz)
    Source code(zip)
  • v0.7.0(Sep 3, 2021)

    Highlights

    • Release code of AAAI 2021 paper 'Temporal ROI Align for Video Object Recognition' (#247)
    • Refactor English documentations (#243)
    • Add Chinese documentations (#248), (#250)

    New Features

    • Support fp16 training and testing (#230)
    • Release model using ResNeXt-101 as backbone for all VID methods (#254)
    • Support the results of Tracktor on MOT15, MOT16 and MOT20 datasets (#217)
    • Support visualization for single gpu test (#216)

    Bug Fixes

    • Fix a bug in MOTP evaluation (#235)
    • Fix two bugs in reid training and testing (#249)

    Improvements

    • Refactor anchor in SiameseRPN++ (#229)
    • Unify model initialization (#235)
    • Refactor unittest (#231)
    Source code(tar.gz)
    Source code(zip)
  • v0.6.0(Jul 30, 2021)

    Highlights

    • Fix training bugs of all three tasks (#219), (#221)

    New Features

    • Support error visualization for mot task (#212)

    Bug Fixes

    • Fix a bug in SOT demo (#213)

    Improvements

    • Use MMCV registry (#220)
    • Add README.md for reid training (#210)
    • Modify dict keys of the outputs of SOT (#223)
    • Add Chinese docs including install.md, quick_run.md, model_zoo.md, dataset.md (#205), (#214)
    Source code(tar.gz)
    Source code(zip)
  • v0.5.3(Jul 2, 2021)

  • v0.5.2(Jun 3, 2021)

  • v0.5.1(Feb 1, 2021)

  • v0.5.0(Jan 5, 2021)

    Highlights

    • MMTracking is released! It is the first open source toolbox that unifies versatile video perception tasks, including single object tracking, multiple object tracking, and video object detection.

    New Features

    Source code(tar.gz)
    Source code(zip)