[CVPR2022] DSL: Dense Learning based Semi-Supervised Object Detection

DSL is the first work to apply an anchor-free detector to Semi-Supervised Object Detection (SSOD).

This code is built on mmdetection and is intended for research use only.

Instructions

Install dependencies

pytorch>=1.8.0
cuda 10.2
python>=3.8
mmcv-full 1.3.10

Download ImageNet pre-trained models

Download resnet50_rla_2283.pth (Google) or resnet50_rla_2283.pth (Baidu, extraction code: 5lf1) for later DSL training.
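If you want a quick sanity check of the downloaded checkpoint before training, something like the following works (a minimal sketch; it only assumes the file is an ordinary PyTorch checkpoint, possibly wrapping the weights in a "state_dict" entry):

# Quick sanity check of the ImageNet pre-trained weights.
# Assumption: a standard PyTorch checkpoint, possibly wrapped in "state_dict".
import torch

ckpt = torch.load("resnet50_rla_2283.pth", map_location="cpu")
state_dict = ckpt["state_dict"] if isinstance(ckpt, dict) and "state_dict" in ckpt else ckpt
print(f"{len(state_dict)} parameter tensors")
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))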

Training

For dynamically labeling the unlabeled images, the original COCO and VOC datasets are converted to DSL-style datasets in which annotations are stored in separate json files, one annotation file per image. In addition, this implementation differs slightly from the original paper: we cleaned the code, merged some data flows to speed up training, applied PatchShuffle to the labeled images as well, and removed MetaNet, also for training speed. The final performance is similar to that reported in the paper.
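The practical consequence of the per-image annotation layout is that a single unlabeled image's pseudo-labels can be rewritten during training without touching a monolithic COCO json. A purely illustrative sketch of the idea (the real file layout and field names are produced by the conversion scripts below and may differ):

# Illustration only: one small annotation file per image, so one image's
# pseudo-labels can be updated in isolation. Paths/field names are hypothetical.
import json
from pathlib import Path

anno_dir = Path("data/semicoco_example_annos")  # hypothetical directory
anno_dir.mkdir(parents=True, exist_ok=True)

example = {
    "filename": "000000391895.jpg",              # illustrative field names
    "bboxes": [[359.2, 146.2, 471.6, 359.7]],    # one box per list entry
    "labels": [3],
    "scores": [0.87],                            # pseudo-label confidence
}
with open(anno_dir / "000000391895.json", "w") as f:
    json.dump(example, f)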

Clone this project & Create data root dir

cd ${project_root_dir}
git clone https://github.com/chenbinghui1/DSL.git
mkdir data
mkdir ori_data

#resulting format
#${project_root_dir}
#      - ori_data
#      - data
#      - DSL
#        - configs
#        - ...

For COCO Partially Labeled Data protocol

1. Download coco dataset and unzip it

mkdir ori_data/coco
cd ori_data/coco

wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/unlabeled2017.zip

unzip annotations_trainval2017.zip -d .
unzip -q train2017.zip -d .
unzip -q val2017.zip -d .
unzip -q unlabeled2017.zip -d .

# resulting format
# ori_data/coco
#   - train2017
#     - xxx.jpg
#   - val2017
#     - xxx.jpg
#   - unlabeled2017
#     - xxx.jpg
#   - annotations
#     - xxx.json
#     - ...

2. Convert coco to semicoco dataset

Use (tools/coco_convert2_semicoco_json.py) to generate the DSL-style coco data dir, i.e., semicoco/, which is the layout expected by the unlabeled-data training and pseudo-label update code.

cd ${project_root_dir}/DSL
python3 tools/coco_convert2_semicoco_json.py --input ${project_root_dir}/ori_data/coco --output ${project_root_dir}/data/semicoco

You will obtain the ${project_root_dir}/data/semicoco/ dir.
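A quick way to sanity-check the conversion is to count the images in the converted directory (the images/full, unlabel_images/full and mmdet_category_info.json paths are the ones referenced later in this guide):

# Sanity-check the converted semicoco dir after running coco_convert2_semicoco_json.py.
from pathlib import Path

root = Path("data/semicoco")  # i.e. ${project_root_dir}/data/semicoco
for sub in ["images/full", "unlabel_images/full"]:
    d = root / sub
    n = len(list(d.glob("*.jpg"))) if d.is_dir() else 0
    print(f"{sub}: {n} images")
print("category info present:", (root / "mmdet_category_info.json").is_file())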

3. Prepare partially labeled data

Use (data_list/coco_semi/prepare_dta.py) to generate the partially labeled data list files. Here we take 10% labeled data as an example.

cd data_list/coco_semi/
python3 prepare_dta.py --percent 10 --root ${project_root_dir}/ori_data/coco --seed 2

You will obtain:
(data_list/coco_semi/semi_supervised/instances_train2017.${seed}@${percent}.json)
(data_list/coco_semi/semi_supervised/instances_train2017.${seed}@${percent}-unlabel.json)
(data_list/coco_semi/semi_supervised/instances_train2017.json)
(data_list/coco_semi/semi_supervised/instances_val2017.json)

The files above are only used as image lists.
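Assuming the generated list files keep the standard COCO "images" layout (they are derived from instances_train2017.json), you can quickly check the split sizes, for example (a minimal sketch; adjust seed/percent to what you generated and use whichever unlabeled-list filename was actually produced):

# Print the number of images in the labeled and unlabeled splits.
import json

seed, percent = 2, 10
for name in [f"instances_train2017.{seed}@{percent}.json",
             f"instances_train2017.{seed}@{percent}-unlabel.json"]:
    with open(f"data_list/coco_semi/semi_supervised/{name}") as f:
        data = json.load(f)
    print(name, "->", len(data["images"]), "images")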

4. Train supervised baseline model

Train the base model via (demo/model_train/baseline_coco.sh); the configs are in (configs/fcos_semi/). Before running this script, please change the corresponding file paths in both the script and the config files.

cd ${project_root_dir}/DSL
./demo/model_train/baseline_coco.sh
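The fields that usually have to be edited are the dataset paths and the work dir. As a hedged illustration of what to look for in configs/fcos_semi/*.py (this follows common mmdetection config conventions; the exact field names in this repo's configs may differ):

# Illustrative mmdetection-style config fragment (field names may not match
# the repo's configs exactly): point these paths at your own directories.
data_root = '/path/to/project_root_dir/data/semicoco/'

data = dict(
    train=dict(
        ann_file='data_list/coco_semi/semi_supervised/instances_train2017.2@10.json',
        img_prefix=data_root + 'images/full/',
    ),
    val=dict(
        ann_file='data_list/coco_semi/semi_supervised/instances_val2017.json',
        img_prefix=data_root + 'images/full/',  # adjust to where your val images live
    ),
)
work_dir = 'workdir_coco/baseline_10percent'  # hypothetical work-dir name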

5. Generate initial pseudo-labels for unlabeled images (1/2)

Generate the initial pseudo-labels for unlabeled images via (tools/inference_unlabeled_coco_data.sh): please change the corresponding list file path of unlabeled data in the config file, and the model path in tools/inference_unlabeled_coco_data.sh.

./tools/inference_unlabeled_coco_data.sh

Then you will obtain (workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json) which contains the pseudo-labels.
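This *.bbox.json file is in the standard COCO detection-results format (a list of records with image_id, category_id, bbox in [x, y, w, h] and score), which is what step 6 filters by score. A small sketch for inspecting it (the path below is the placeholder from above):

# Inspect the raw pseudo-labels (COCO results format) produced on the unlabeled images.
import json

with open("workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json") as f:  # placeholder path
    results = json.load(f)

kept = [r for r in results if r["score"] >= 0.1]  # same spirit as --thres 0.1 in step 6
print(f"{len(results)} raw detections, {len(kept)} with score >= 0.1")
if kept:
    print(kept[0])  # e.g. {'image_id': ..., 'category_id': ..., 'bbox': [x, y, w, h], 'score': ...}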

6. Generate initial pseudo-labels for unlabeled images (2/2)

Use (tools/generate_unlabel_annos_coco.py) to convert the produced (epoch_xxx.pth-unlabeled.bbox.json) above to DSL-style annotations

python3 tools/generate_unlabel_annos_coco.py \
          --input_path workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json \
          --input_list data_list/coco_semi/semi_supervised/instances_train2017.${seed}@${percent}-unlabeled.json \
          --cat_info ${project_root_dir}/data/semicoco/mmdet_category_info.json \
          --thres 0.1

You will obtain (workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json_thres0.1_annos/) dir which contains the DSL-style annotations.

7. DSL Training

Use (demo/model_train/unlabel_train.sh) to train the semi-supervised algorithm. Before training, please change the corresponding paths in the config file and the shell script.

./demo/model_train/unlabel_train.sh

For COCO Fully Labeled Data protocol

The overall steps are similar to those in the Partially Labeled Data guideline above. The additional step is to download and organize the new unlabeled data.

1. Organize the new images

Put all the jpg images into the generated DSL-style semicoco data dir, i.e., semicoco/unlabel_images/full/xx.jpg:

cd ${project_root_dir}
cp ori_data/coco/unlabeled2017/* data/semicoco/unlabel_images/full/

2. Download the corresponding files

Download (STAC_JSON.tar) and extract it; then copy (coco/annotations/instances_unlabeled2017.json) to the (data_list/coco_semi/semi_supervised/) dir:

cd ${project_root_dir}/ori_data
wget https://storage.cloud.google.com/gresearch/ssl_detection/STAC_JSON.tar
tar -xf STAC_JSON.tar

# resulting files
# coco/annotations/instances_unlabeled2017.json
# coco/annotations/semi_supervised/instances_unlabeledtrainval20class.json
# voc/VOCdevkit/VOC2007/instances_diff_test.json
# voc/VOCdevkit/VOC2007/instances_diff_trainval.json
# voc/VOCdevkit/VOC2007/instances_test.json
# voc/VOCdevkit/VOC2007/instances_trainval.json
# voc/VOCdevkit/VOC2012/instances_diff_trainval.json
# voc/VOCdevkit/VOC2012/instances_trainval.json

cp coco/annotations/instances_unlabeled2017.json ${project_root_dir}/DSL/data_list/coco_semi/semi_supervised/

3. Train following steps 4-7 of the Partially Labeled Data protocol

Change the corresponding paths before training.

For VOC dataset

1. Download VOC data

Download the VOC datasets into ori_data/ and untar them; you will obtain (VOCdevkit/):

cd ${project_root_dir}/ori_data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar

# resulting format
# ori_data/
#   - VOCdevkit
#     - VOC2007
#       - Annotations
#       - JPEGImages
#       - ...
#     - VOC2012
#       - Annotations
#       - JPEGImages
#       - ...

2. Convert voc to semivoc dataset

Use (tools/voc_convert2_semivoc_json.py) to generate the DSL-style voc data dir, i.e., semivoc/, which is the layout expected by the unlabeled-data training and pseudo-label update code.

cd ${project_root_dir}/DSL
python3 tools/voc_convert2_semivoc_json.py --input ${project_root_dir}/ori_data/VOCdevkit --output ${project_root_dir}/data/semivoc

Then use (tools/dataset_converters/pascal_voc.py) to convert the original voc list files to COCO-style files for evaluating VOC performance under the COCO 'bbox' metric.

python3 tools/dataset_converters/pascal_voc.py ${project_root_dir}/ori_data/VOCdevkit -o data_list/voc_semi/ --out-format coco

You will obtain the COCO-style list files in data_list/voc_semi/. These files are only used as val files; please refer to (configs/fcos_semi/voc/xx.py).
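As a quick check that the conversion produced valid COCO-style lists, you can load them and confirm they contain the 20 VOC classes (a minimal sketch; the exact file names written by pascal_voc.py may differ):

# Check the converted COCO-style VOC list files in data_list/voc_semi/.
import json
from pathlib import Path

for path in sorted(Path("data_list/voc_semi").glob("*.json")):
    with open(path) as f:
        data = json.load(f)
    if "categories" in data and "images" in data:
        print(path.name, "->", len(data["images"]), "images,",
              len(data["categories"]), "categories")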

3. Combine with coco20class images

Copy (instances_unlabeledtrainval20class.json) to the (data_list/voc_semi/) dir, and then run (data_list/voc_semi/combine_coco20class_voc12.py) to produce the additional unlabeled set from the 20 COCO classes.

cp ${project_root_dir}/ori_data/coco/annotations/semi_supervised/instances_unlabeledtrainval20class.json data_list/voc_semi/
cd data_list/voc_semi
python3 combine_coco20class_voc12.py \
                --cocojson instances_unlabeledtrainval20class.json \
                --vocjson voc12_trainval.json \
                --cocoimage_path ${project_root_dir}/data/semicoco/images/full \
                --outtxt_path ${project_root_dir}/data/semivoc/unlabel_prepared_annos/Industry/ \
                --outimage_path ${project_root_dir}/data/semivoc/unlabel_images/full
cd ../..

You will obtain the corresponding list file (.json): (voc12_trainval_coco20class.json). The corresponding coco20class images will be copied to (${project_root_dir}/data/semivoc/unlabel_images/full/), and the list file (.txt) will be generated at (${project_root_dir}/data/semivoc/unlabel_prepared_annos/Industry/voc12_trainval_coco20class.txt).
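If you want to verify the combined unlabeled set, the generated .txt list can be checked against the copied images (a hedged sketch; it assumes one image reference per line in the .txt file, which is an assumption about the script's output format):

# Verify that the images referenced in the combined unlabeled list were copied.
# Assumption: one image reference per line in the generated .txt list file.
from pathlib import Path

root = Path("data/semivoc")
list_file = root / "unlabel_prepared_annos/Industry/voc12_trainval_coco20class.txt"
image_dir = root / "unlabel_images/full"

entries = [ln.strip() for ln in list_file.read_text().splitlines() if ln.strip()]
missing = [e for e in entries if not (image_dir / Path(e).name).exists()]
print(f"{len(entries)} list entries, {len(missing)} missing images")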

4. Train following steps 4-7 of the Partially Labeled Data protocol

Please change the corresponding paths before training, and refer to configs/fcos_semi/voc/xx.py.

Testing

Please refer to (tools/semi_dist_test.sh).

./tools/semi_dist_test.sh

Acknowledgement

Issues
  • Paper release

    http://www4.comp.polyu.edu.hk/~cslzhang/paper/DSL_cvpr22.pdf was linked from Google Scholar but unfortunately returns 404 now. Would be keen to read the paper in better rendering than Google cache :)

    opened by vadimkantorov 6
  • About the baseline

    Hello, and thank you for this very meaningful work. I have some questions about the baseline: under the 10% labeled data setting, the 90% unlabeled data has to be paired with the 10% labeled data, so if all of the unlabeled data is used in one epoch, the labeled data has to be repeated 9 times to make up that epoch. For the baseline results in the paper, does one epoch also use the labeled data 9 times, or only once? (Assuming the semi-supervised run and the baseline run use the same number of iterations.)

    opened by lzhhha 5
  • How is the scale invariance implemented?

    Thank you very much for open-sourcing your code! There are a few places I don't quite understand, so I would like to ask:

    1. Regarding the Adaptive Threshold, the implementation in your code seems slightly different from the paper; could you briefly explain the difference?
    2. Regarding Scale Invariant Learning, I don't understand how this part is implemented. Do you rescale the unlabeled images and load them together? How is a batch organized? From the code I can only see how the labeled and unlabeled images are organized, not how the rescaled images are loaded and organized. What is flatten_As_labels used for?
    3. Regarding Ignore: my understanding is that proposals/anchor boxes that would otherwise be assigned as background under the high-quality pseudo-labels are assigned an ignore label through the ignore gt boxes, so that their loss is not computed. But in your logic, if a proposal/anchor box is assigned a background label by either an ignore gt box or a gt box, i.e. flatten_ig_labels - self.num_classes = 0 and flatten_labels - self.num_classes = 0, its sample-wise weight is set to 0. Isn't that unreasonable?
    opened by Zhangjiacheng144 4
  • train debug

    I get the following error when training on my own dataset:

    Traceback (most recent call last):
      File "./tools/train.py", line 202, in main()
      File "./tools/train.py", line 190, in main train_detector(
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/apis/train.py", line 218, in train_detector runner.run(data_loaders, cfg.workflow)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 344, in run epoch_runner(data_loaders[i], **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 265, in train self.run_iter(data_batch, train_mode=True, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 155, in run_iter outputs = self.model.train_step(data_batch, self.optimizer,
      File "/usr/local/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step output = self.module.train_step(*inputs[0], **kwargs[0])
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/detectors/base.py", line 237, in train_step losses = self(**data)
      File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 97, in new_func return old_func(*args, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/detectors/base.py", line 171, in forward return self.forward_train(img, img_metas, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/detectors/single_stage.py", line 82, in forward_train losses = self.bbox_head.forward_train(x, img_metas, gt_bboxes,
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/dense_heads/base_dense_head.py", line 54, in forward_train losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
      File "/usr/local/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 185, in new_func return old_func(*args, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/dense_heads/fcos_head.py", line 309, in loss loss_cls = self.loss_cls(
      File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/losses/focal_loss.py", line 170, in forward loss_cls = self.loss_weight * calculate_loss_func(
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/losses/focal_loss.py", line 85, in sigmoid_focal_loss loss = _sigmoid_focal_loss(pred.contiguous(), target, gamma, alpha, None,
      File "/usr/local/lib/python3.8/site-packages/mmcv/ops/focal_loss.py", line 39, in forward assert input.size(0) == target.size(0)
    AssertionError

    With the batch size set to 8 and the input resolution set to 512x512, I debugged and found that, starting from line 186 of semi_epoch_based_runner.py, data_batch['img_metas'], data_batch['gt_bboxes'] and data_batch['gt_labels'] each get one extra element appended, while data_batch['img'] is concatenated with an image tensor of batch-1 images. As a result, the model input tensor becomes (15, 3, 512, 512) while the label information corresponds to 9 images, which triggers the AssertionError when computing the loss. Could you tell me whether I have misunderstood the code or whether this is indeed a bug?

    opened by weyoung0 4
  • The inference of VOC data

    Hello, I have a problem when running inference on the VOC data. When I run ./tools/inference_unlabeled_coco_data.sh, I don't get any inference results in the specified folder. Is there a script for running inference and generating pseudo-labels for VOC data? Thanks!

    opened by wuhandashuaibi 2
  • Where is the "Adaptive Filtering Strategy" source code?

    Hi, Thank you for your great work!

    In your paper, you mention that you apply an AF strategy to improve the quality of the pseudo-labels, but I couldn't find the code for this part. Could you release it?

    opened by TaeHoon-Jin 1
  • Question about semi-supervised and supervised performance

    Thank you for this great work; DSL also performs very well on my own dataset! During the experiments I noticed one phenomenon: a model trained with 50% of the annotations already matches the metrics of the model trained with 100% of the annotations. How should this be explained? My previous understanding was that semi-supervised training could never surpass fully supervised training, otherwise what is the value of the extra annotated boxes in the full annotation? It is hard for me to understand how a model trained with partially incorrect labels can reach the fully supervised performance. I hope you can clarify this. Thanks!

    opened by weyoung0 4
  • Question about dsl

    Must the supervised model be trained in advance? Can I use DSL to train a model with both labeled and unlabeled samples from the beginning (just with a pretrained backbone)?

    opened by heiyuxiaokai 1
  • The code of MetaNet Part?

    Hi, thank you for your great work! I have a question about the code: in your paper you mention that you apply a MetaNet to improve the quality of the pseudo-labels, but I didn't find the code for this part. I noticed in the README that you removed it; could you release this part so that I can check its effectiveness?

    opened by Zhangjiacheng144 1