Weakly-supervised object detection.

Overview

Wetectron

Wetectron is a software system that implements state-of-the-art weakly-supervised object detection algorithms.

Project CVPR'20, ECCV'20 | Paper CVPR'20, ECCV'20

Installation

Check INSTALL.md for installation instructions.

Partial labels

The simulated partial labels (points and scribbles) of COCO can be found at Google Drive or Dropbox.

Please check tools/vis_partial_labels.ipynb for a visualization example.

Model zoo

Check MODEL_ZOO.md for detailed instructions.

Getting started

Check GETTING_STARTED for detailed instructions.

New dataset

If you want to run on your own dataset or use other pre-computed proposals (e.g., Edge Boxes), please check USE_YOUR_OWN_DATA for some tips.
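
Pre-computed proposals are commonly packaged as one file per split that maps each image to an (N, 4) array of boxes. The sketch below is only an illustration under that assumption (the file name and dictionary layout here are hypothetical); see USE_YOUR_OWN_DATA for the exact format wetectron expects.

    import pickle
    import numpy as np

    # hypothetical layout: image id -> (N, 4) array of xyxy proposal boxes
    proposals = {
        "000005.jpg": np.array([[10, 20, 110, 220], [30, 40, 200, 180]], dtype=np.float32),
        "000007.jpg": np.array([[5, 5, 60, 90]], dtype=np.float32),
    }
    with open("edgeboxes_trainval.pkl", "wb") as f:
        pickle.dump(proposals, f)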

Misc

Please also check the documentation of maskrcnn-benchmark for things like abstractions and troubleshooting. If your issue is not covered there, feel free to open a new one.

Todo:

  1. Sequential back-prop and ResNet models.

Citations

Please consider citing the following papers in your publications if they help your research.

@inproceedings{ren-cvpr2020,
  title = {Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection},
  author = {Zhongzheng Ren and Zhiding Yu and Xiaodong Yang and Ming-Yu Liu and Yong Jae Lee and Alexander G. Schwing and Jan Kautz},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}
}

@inproceedings{ren-eccv2020,
  title = {UFO$^2$: A Unified Framework towards Omni-supervised Object Detection},
  author = {Zhongzheng Ren and Zhiding Yu and Xiaodong Yang and Ming-Yu Liu and Alexander G. Schwing and Jan Kautz},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year = {2020}
}

License

This code is released under the Nvidia Source Code License.

This project is built upon maskrcnn-benchmark, which is released under MIT License.

Comments
  • When could we expect to see code

    Hey,

Super keen to see the code for this. It's very interesting and I've started trying to reimplement it; however, the paper doesn't include all the details and says to reference the code. Any idea when we can expect to see it (so I don't keep coming back every day to check if it's updated)?

    Cheers

    opened by bradezard131 24
  • Reproducing MIST function and results

I'm struggling to reproduce the "MIST w/o Reg" results from Table 5. From my understanding, this should be the same network structure as OICR but using the MIST algorithm (Algorithm 1) rather than the typical top-1 box. I have a working implementation of the OICR network structure; it achieves ~41% with OICR and ~44% with PCL using the dilated VGG-16 backbone.

    I have tried using the original OICR and PCL hyperparameters (LR=1e-3, WD=5e-4, BS=2 or 4, 5 scales) as well as the new ones in the appendix (LR=1e-2, WD=1e-4, BS=8, 6 scales) and have been unable to break 25% with p=15%. My implementation of this function is included below:

    import torch
    from math import ceil
    from torchvision import ops

    @torch.no_grad()
    def mist_label(preds, rois, label, reg_preds=None, p=0.15, tau=0.2):
        # preds: (R, C) proposal scores; rois: (R, 4) xyxy boxes; label: (C,) image-level labels
        preds = (preds if preds.shape[-1] == label.shape[-1] else preds[:,1:]).clone()  # remove background class if present
        keep_count = int(ceil(p * preds.size(0)))
        all_overlaps = ops.box_iou(rois, rois)  # more efficient to compute all overlaps once and index in than to compute overlaps each time
        klasses = label.nonzero(as_tuple=True)[0]
        gt_labels = -torch.ones((preds.size(0),), dtype=torch.long, device=preds.device)
        gt_weights = -torch.ones((preds.size(0),), dtype=preds.dtype, device=preds.device)

        for c in klasses:
            cls_prob_tmp = preds[:,c]
            sort_idx = cls_prob_tmp.argsort(descending=True)[:keep_count]  # top p percent of proposals

            # add the top-scoring proposal
            keep_idxs = [sort_idx[0].item()]

            # add the rest, greedily skipping anything overlapping an already-kept box
            for idx in sort_idx[1:]:
                if (all_overlaps[idx, keep_idxs] < tau).all():
                    keep_idxs.append(idx.item())

            # add them to the GT set unless they were already selected with a higher score
            is_higher_scoring_class = cls_prob_tmp[keep_idxs] > gt_weights[keep_idxs]
            keep_idxs = torch.tensor(keep_idxs)[is_higher_scoring_class]
            gt_labels[keep_idxs] = c + 1
            gt_weights[keep_idxs] = cls_prob_tmp[keep_idxs]

        kept = gt_labels > 0
        gt_boxes, gt_labels, gt_weights = rois[kept], gt_labels[kept], gt_weights[kept]

        # Adjust boxes with regression, if available (boxes interpreted as (x, y, w, h) here)
        if reg_preds is not None:
            box_tfms = reg_preds[kept, gt_labels]
            gt_boxes[:,:2] += gt_boxes[:,2:] * box_tfms[:,:2]  # G_x = P_x + P_w * t_x
            gt_boxes[:,2:] *= box_tfms[:,2:].exp()             # G_w = P_w * e^{t_w}

        return gt_boxes, gt_labels, gt_weights
    
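    A quick smoke test with random inputs (illustrative only, not from the paper or the repo; it assumes xyxy `rois` and per-proposal class scores, and skips the regression branch):

    torch.manual_seed(0)
    R, C = 100, 20
    rois = torch.rand(R, 4) * 100
    rois[:, 2:] += rois[:, :2]   # ensure x2 > x1 and y2 > y1
    preds = torch.rand(R, C).softmax(dim=-1)
    label = torch.zeros(C)
    label[[3, 7]] = 1            # image-level labels: classes 3 and 7 present
    gt_boxes, gt_labels, gt_weights = mist_label(preds, rois, label)
    print(gt_boxes.shape, gt_labels.unique(), gt_weights.min())
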
    opened by bradezard131 18
  • WARNING:root:NaN or Inf found in input tensor.

    Training from scratch with "V_16_voc07.yaml" with a batch size of 1 on one GeForce 1080 Ti GPU, the logger gave this warning after 6000 iterations. I have no idea what went wrong; could you give me a clue?
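
    (A general PyTorch debugging aid for warnings like this, not something wetectron enables by default: anomaly detection raises at the first op that produces a NaN/Inf, which helps localize the source.)

    import torch
    torch.autograd.set_detect_anomaly(True)  # slows training; enable only while debugging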

    opened by liaorongfan 14
  • (oicrbaseline+mist wi reg) setup map50 result 41.05

    Thanks for your reply.

    As our computing resources are limited, there is no way to use 8 GPUs. To match the performance in the paper, I set up my experiments as follows (OICR baseline + MIST with reg): removed the CDB module, BATCH_SIZE=2, base lr = 0.0025 (0.01/4), torch = 1.5.0, cuda = 9.2. The result on voc_2007_test is mAP50 = 41.5, still a big gap from the 51.4 reported in the paper. Do you have any suggestions for further improvement?
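
    For context, the base LR above follows the usual linear scaling heuristic: scale the reference learning rate by the batch-size ratio.

    reference_lr, reference_batch = 0.01, 8            # the paper's 8-GPU setting
    my_batch = 2
    my_lr = reference_lr * my_batch / reference_batch  # 0.0025, as used above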

    opened by liz6688 10
  • nan in training

    Thanks for your code. I get a NaN during training. Can you give some suggestions to solve this problem? I set IMS_PER_BATCH = 1 and ITER_SIZE = 1, and kept the other parameters unchanged.

    opened by liz6688 9
  • About training time / Slow training schedule

    Hi, thanks for your great work. I am trying to reproduce your results but ran into a long-training-time issue: on 8 M40 GPUs, the run is estimated to finish after more than 2 days 15 hours, and I'm not sure whether something went wrong.
    Could you provide your training log and detailed environment?
    I train this repo with the command python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file "configs/voc/V_16_voc07.yaml" OUTPUT_DIR ./experiment_voc, using pytorch==1.5.1 and cudatoolkit==10.1.243.
    Hope you can help me figure out the issue; thanks in advance.

    question 
    opened by AlphaGoMK 8
  • CUDA device-side assert, image scores passed to BCE loss is nan during early training

    Thanks for your amazing work! I get RuntimeError: CUDA error: device-side assert triggered around ~200 steps into training. The error always occurs, even after rerunning the program multiple times or setting a higher epsilon as suggested in #22 (I've tried 1e-8 and 1e-6).

    This is the command I use for training. I have tried pytorch 1.6 and 1.7 with cuda 10.1.

    CUDA_LAUNCH_BLOCKING=1 python tools/train_net.py --config-file "configs/voc/V_16_voc07.yaml" --use-tensorboard  \
    OUTPUT_DIR output \
    SOLVER.IMS_PER_BATCH 1 \
    SOLVER.ITER_SIZE 8 \
    DB.METHOD none
    

    Here is the log:

    2020-12-08 00:13:05,575 wetectron.trainer INFO: eta: 1 day, 5:33:33  iter: 180  loss: 0.4550 (0.6005)  loss_img: 0.2575 (0.2831)  loss_ref_cls0: 0.0003 (0.0011)  loss_ref_reg0: 0.0000 (0.0002)  loss_ref_cls1: 0.1219 (0.1517)  loss_ref_reg1: 0.0277 (0.0274)  loss_ref_cls2: 0.0612 (0.1122)  loss_ref_reg2: 0.0119 (0.0247)  acc_img: 0.0000 (0.2319)  acc_ref0: 0.0000 (0.0690)  acc_ref1: 0.0000 (0.2546)  acc_ref2: 0.0000 (0.2713)  time: 0.4167 (0.4437)  data: 0.0097 (0.0117)  lr: 0.004100  max mem: 4047
    tensor([0.0061, 0.0272, 0.0212, 0.0003, 0.0143, 0.0203, 0.0304, 0.2264, 0.0059,
            0.2383, 0.0125, 0.0261, 0.0525, 0.1852, 0.0306, 0.0003, 0.0211, 0.0074,
            0.0092, 0.0183, 0.0403], device='cuda:0', grad_fn=<ClampBackward>)
    [... 18 similar image-score tensors omitted: over the following iterations the scores progressively saturate at the clamp bounds (1.0000e-08 and 1.0000e+00), e.g. ...]
    tensor([1.0000e-08, 1.0000e-08, 1.0000e+00, 1.0000e-08, 1.0000e-08, 1.0000e-08,
            1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08,
            1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08,
            1.0000e-08, 1.0000e-08, 1.0000e-08], device='cuda:0',
           grad_fn=<ClampBackward>)
    2020-12-08 00:13:14,823 wetectron.trainer INFO: eta: 1 day, 5:40:51  iter: 200  loss: 5.6391 (16.8278)  loss_img: 0.9144 (0.5902)  loss_ref_cls0: 0.0000 (0.0021)  loss_ref_reg0: 0.0000 (0.0008)  loss_ref_cls1: 0.0425 (0.1522)  loss_ref_reg1: 0.0017 (0.0262)  loss_ref_cls2: 0.0000 (14.2085)  loss_ref_reg2: 0.0000 (1.8478)  acc_img: 0.0000 (0.2213)  acc_ref0: 0.0000 (0.0704)  acc_ref1: 0.0000 (0.2392)  acc_ref2: 0.0000 (0.2625)  time: 0.4264 (0.4456)  data: 0.0115 (0.0117)  lr: 0.004167  max mem: 4047
    tensor([1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08,
            1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08,
            1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e-08, 1.0000e+00, 1.0000e+00,
            1.0000e-08, 1.0000e-08, 1.0000e-08], device='cuda:0',
           grad_fn=<ClampBackward>)
    [... 7 more saturated tensors omitted ...]
    tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],
           device='cuda:0', grad_fn=<ClampBackward>)
    /opt/conda/conda-bld/pytorch_1603729006826/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [0,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
    [... the same assertion repeats for threads [1,0,0] through [20,0,0] ...]
    Traceback (most recent call last):
      File "tools/train_net.py", line 301, in <module>
        main()
      File "tools/train_net.py", line 280, in main
        use_tensorboard=args.use_tensorboard
      File "tools/train_net.py", line 92, in train
        meters
      File "/home/unnc/Desktop/sota/wetectron/wetectron/engine/trainer.py", line 94, in do_train
        loss_dict, metrics = model(images, targets, rois)
      File "/home/unnc/anaconda3/envs/wetectron1.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/unnc/anaconda3/envs/wetectron1.7/lib/python3.7/site-packages/apex/amp/_initialize.py", line 197, in new_fwd
        **applier(kwargs, input_caster))
      File "/home/unnc/Desktop/sota/wetectron/wetectron/modeling/detector/generalized_rcnn.py", line 61, in forward
        x, result, detector_losses, accuracy = self.roi_heads(features, proposals, targets, model_cdb)
      File "/home/unnc/anaconda3/envs/wetectron1.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/unnc/Desktop/sota/wetectron/wetectron/modeling/roi_heads/weak_head/weak_head.py", line 106, in forward
        loss_img, accuracy_img = self.loss_evaluator([cls_score], [det_score], ref_scores, ref_bbox_preds, proposals, targets)
      File "/home/unnc/Desktop/sota/wetectron/wetectron/modeling/roi_heads/weak_head/loss.py", line 254, in __call__
        return_loss_dict['loss_img'] += F.binary_cross_entropy(img_score_per_im, labels_per_im.clamp(0, 1))
      File "/home/unnc/anaconda3/envs/wetectron1.7/lib/python3.7/site-packages/torch/nn/functional.py", line 2526, in binary_cross_entropy
        input, target, weight, reduction_enum)
    RuntimeError: CUDA error: device-side assert triggered
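
    For what it's worth, F.binary_cross_entropy asserts that its input lies in [0, 1]; the log above shows the image scores saturating at the clamp bounds and then turning NaN, at which point the kernel assertion fires. A hedged sketch of the kind of guard that avoids the device-side assert (an illustration, not necessarily the repo's actual fix):

    import torch
    import torch.nn.functional as F

    def safe_image_bce(img_score, labels, eps=1e-6):
        # replace NaNs and keep scores strictly inside (0, 1) before the loss
        img_score = torch.where(torch.isnan(img_score), torch.full_like(img_score, eps), img_score)
        img_score = img_score.clamp(eps, 1 - eps)
        return F.binary_cross_entropy(img_score, labels.clamp(0, 1))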
    
    
    opened by xiaaoo-zz 8
  • results are different between roi pooling and roi align

    Thanks for sharing such wonderful work! I used your code to train OICR-VGG16: the mAP is 43% with RoI pooling, but only 23% with RoI align. The only difference between the two configs is POOLER_METHOD. Have you met this problem in your experiments?

    bug 
    opened by UcanSee 8
  • About results:MIST+cdb,multi scale and flip testing 2007 test mAP50 is 52.2.

    Thanks for sharing this wonderful work! We tried to retrain your model (config: 8 GPUs, base lr 0.01, multi-scale training, etc.). We tested all intermediate models as well as the final model; the top mAP is 52.2 and we don't know the reason. Hoping for your reply, thanks.

    opened by hu5tao 8
  •  r50-c4 config

    Thanks for sharing such wonderful work! I recently wanted to run experiments on COCO with an R50-C4 backbone, but I don't know how to set up the configuration. Can you share the config you used? Looking forward to your reply!

    question 
    opened by UcanSee 7
  • About setting of WSDDN

    we followed the setting in detectron2.

    Originally posted by @jason718 in https://github.com/NVlabs/wetectron/issues/46#issuecomment-812829986

    Fine... This is really good work: the MIST algorithm works well when I transfer the design to my codebase with slight modifications. But it's really strange that the performance drops so much when using fewer than 8 GPUs with the same batch-size settings, and in my experiments the performance is seriously influenced by the random seed. I compared this codebase with others and did not find any serious problem, so I guess it may be influenced by the large base_lr or the initialization strategy. I believe wetectron will be a popular framework for WSOD if you could refine the details to make it more stable~

    By the way, could you please offer a config for vanilla WSDDN? I have tried several times and the performance is not good. Thanks a lot~

    opened by suilin0432 5
  • WARNING:root:NaN or Inf found in input tensor.

    I am training a pretrained ResNet-34 for hand and object pose, and its training and validation losses swing between values like -4, -6, 5, 7, ... and sometimes 150+, 190+:

    100%|██████████| 12328/12328 [1:19:17<00:00, 2.59it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    100%|██████████| 12328/12328 [49:31<00:00, 4.15it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    Epoch : 0 finished. Training Loss: nan. Validation Loss: nan
    100%|██████████| 12328/12328 [1:20:41<00:00, 2.55it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    100%|██████████| 12328/12328 [49:48<00:00, 4.13it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    Epoch : 1 finished. Training Loss: nan. Validation Loss: nan
    100%|██████████| 12328/12328 [1:20:53<00:00, 2.54it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    100%|██████████| 12328/12328 [56:47<00:00, 3.62it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    Epoch : 2 finished. Training Loss: nan. Validation Loss: nan
    100%|██████████| 12328/12328 [1:19:47<00:00, 2.58it/s]
    WARNING:root:NaN or Inf found in input tensor.
    WARNING:root:NaN or Inf found in input tensor.
    90%|█████████ | 11156/12328 [44:38<04:41, 4.16it/s]

    opened by iammuhammad41 0
  • Could you please provide the evaluation result file predictions.pth of coco14?

    I want to get the evaluation result file predictions.pth of coco14 with the command python -m torch.distributed.launch --nproc_per_node=8 tools/test_net.py --config-file configs/coco/V_16_coco14.yaml TEST.IMS_PER_BATCH 8 OUTPUT_DIR output/ MODEL.WEIGHT coco14_vgg16.pth. The evaluation result file alone is enough for me, but I have difficulties running the code. If you have predictions.pth for the COCO val set, could you please provide it? Thank you very much.

    opened by THUeeY 0
  • Question about training with own data

    I followed the instructions to install the model, and trained it on voc2007 for 30,000 iterations with IMS_PER_BATCH: 2; the result seems normal:

    mAP: 0.3279
    aeroplane: 0.5760, bicycle: 0.5919, bird: 0.2792, boat: 0.1606, bottle: 0.2195, bus: 0.4927, car: 0.7099, cat: 0.0822, chair: 0.1200, cow: 0.3114, diningtable: 0.0533, dog: 0.1096, horse: 0.0607, motorbike: 0.6474, person: 0.3175, pottedplant: 0.2002, sheep: 0.3854, sofa: 0.2155, train: 0.4436, tvmonitor: 0.5824

    It did not reach the best point, but at least it proves the setup works properly. However, when I train the network on my own data, the mAP is 0.0005%. My dataset has only 2 classes and about 12,000 images in total. One class exists in every image, and that class cannot be recognized at all: its AP is 0.

    I modified some of the training settings in the config file:

    ROI_BOX_HEAD:
      NUM_CLASSES: 3

    Since I only have one GPU, I ran into the OOM problem; following the comments, I changed the solver settings to:

    SOLVER:
      IMS_PER_BATCH: 1
      BASE_LR: 0.0025
      WEIGHT_DECAY: 0.0001
      WARMUP_ITERS: 200
      STEPS: (0, 30000, 40000)
      MAX_ITER: 60000
      CHECKPOINT_PERIOD: 1000

    I changed CLASSES in datasets/voc.py to the classes of my dataset:

    CLASSES = (
        "__background__ ",
        "trafficePolice",
        "pedestrian",
    )

    The spelling is copied from the annotation files. I also removed the lower() in name = obj.find("name").text.lower().strip() on line 136 of voc.py, because the unrecognized class name contains an upper-case letter. Apart from pointing to the locations of the region proposals and the dataset, I did not make any other changes to the code. The overall training process seems normal: the data is fed into the model, the loss keeps decreasing, and every loss term has a value. Could you give me some advice?

    opened by ghZHM 3
  • Sequential backprop impl sketch

    Should something like below work for wrapping ResNet's last layer (Neck)? (https://gist.github.com/vadimkantorov/67fe785ed0bf31727af29a3584b87be1)

    import torch
    import torch.nn as nn

    class SequentialBackprop(nn.Module):
        """Trades compute for memory: the forward runs detached, and the backward
        recomputes the wrapped module mini-batch by mini-batch (checkpoint-style)."""
        def __init__(self, module, batch_size=1):
            super().__init__()
            self.module = module
            self.batch_size = batch_size

        def forward(self, x):
            y = self.module(x.detach())  # activations inside `module` are not kept for backward
            return self.Function.apply(x, y, self.batch_size, self.module)

        class Function(torch.autograd.Function):
            @staticmethod
            def forward(ctx, x, y, batch_size, module):
                ctx.save_for_backward(x)
                ctx.batch_size = batch_size
                ctx.module = module
                return y

            @staticmethod
            def backward(ctx, grad_output):
                (x,) = ctx.saved_tensors
                grads = []
                # recompute the forward in small chunks so only one chunk's
                # activations are alive at a time
                for x_mini, g_mini in zip(x.split(ctx.batch_size), grad_output.split(ctx.batch_size)):
                    with torch.enable_grad():
                        x_mini = x_mini.detach().requires_grad_()
                        x_mini.retain_grad()
                        y_mini = ctx.module(x_mini)
                    torch.autograd.backward(y_mini, g_mini)  # also accumulates param grads
                    grads.append(x_mini.grad)
                return torch.cat(grads), None, None, None

    if __name__ == '__main__':
        backbone = nn.Linear(3, 6)
        neck = nn.Linear(6, 12)
        head = nn.Linear(12, 1)

        model = nn.Sequential(backbone, SequentialBackprop(neck, batch_size=16), head)

        print('before', neck.weight.grad)

        x = torch.rand(512, 3)
        model(x).sum().backward()
        print('after', neck.weight.grad)
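
    A quick equivalence check (my addition, not part of the gist): for a sum loss, the chunked backward should reproduce the full-batch gradients.

    import copy

    torch.manual_seed(0)
    backbone, neck, head = nn.Linear(3, 6), nn.Linear(6, 12), nn.Linear(12, 1)
    ref = nn.Sequential(copy.deepcopy(backbone), copy.deepcopy(neck), copy.deepcopy(head))
    chunked = nn.Sequential(backbone, SequentialBackprop(neck, batch_size=16), head)

    x = torch.rand(512, 3)
    ref(x).sum().backward()
    chunked(x).sum().backward()
    print(torch.allclose(neck.weight.grad, ref[1].weight.grad, atol=1e-6))  # expect True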
    
    opened by vadimkantorov 1
  • More questions about ROI heads and pseudo-label generation

    1. In what part of the code do you handle this (quoting the paper)? "In practice, conflicts happen when we force ŷ(·, r) to be a one-hot vector, since the same region can be chosen as positive for different ground-truth classes, especially in the early stages of training. Our solution is to use for pseudo-label r̂ the class that has the higher predicted score s(c, r̂)."

    2. What scores are used to generate supervision for the student branches? It seems to me that you normalize the scores across classes. Is that true? https://github.com/NVlabs/wetectron/blob/44e6fa95aee07d6722a62af56f016a3ae99bd8a6/wetectron/modeling/roi_heads/weak_head/loss.py#L257

    Thanks!

    opened by vadimkantorov 0
  • Question about ROI heads

    1. Do I understand correctly that in practice, for the existing configs, only ROIWeakRegHead ends up being used (so box_head isn't used)?

    2. Do I understand correctly that ROI-pooling is done only once and that the student models reuse the computed ROI features?

    3. Do I understand correctly that only a single Pooler scale is used at both train and test time?

    4. How do you handle the background class in the WSDDN computation? Do you take background scores into account in the softmax over classes in the classification branch? (A sketch of the standard background-free WSDDN scoring follows below.)
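
    For reference, a minimal sketch of the original WSDDN scoring (Bilen & Vedaldi, CVPR 2016), which has no background class: softmax over classes in the classification branch, softmax over proposals in the detection branch, an elementwise product, and a sum over proposals for the image-level score. This is generic WSDDN, not necessarily wetectron's exact head:

    import torch
    import torch.nn.functional as F

    def wsddn_scores(cls_logits, det_logits):
        # cls_logits, det_logits: (num_proposals, num_classes)
        cls = F.softmax(cls_logits, dim=1)  # per proposal, over classes
        det = F.softmax(det_logits, dim=0)  # per class, over proposals
        proposal_scores = cls * det                            # (R, C)
        image_scores = proposal_scores.sum(dim=0).clamp(0, 1)  # (C,), for image-level BCE
        return proposal_scores, image_scores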

    Thank you!

    opened by vadimkantorov 0