Code release for Hu et al., Learning to Segment Every Thing, CVPR 2018.

Overview

Learning to Segment Every Thing

This repository contains the code for the following paper:

  • R. Hu, P. Dollár, K. He, T. Darrell, R. Girshick. Learning to Segment Every Thing. In CVPR, 2018. (PDF)
@inproceedings{hu2018learning,
  title={Learning to Segment Every Thing},
  author={Hu, Ronghang and Dollár, Piotr and He, Kaiming and Darrell, Trevor and Girshick, Ross},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Project Page: http://ronghanghu.com/seg_every_thing

Note: this repository is built upon the Detectron codebase for object detection and segmentation (https://github.com/facebookresearch/Detectron), based on Detectron commit 3c4c7f67d37eeb4ab15a87034003980a1d259c94. Please see README_DETECTRON.md for details.

Installation

The installation procedure follows Detectron.

Please find installation instructions for Caffe2 and Detectron in INSTALL.md.

Note: all the experiments below run on 8 GPUs on a single machine. If you have fewer than 8 GPUs available, please modify the yaml config files according to the linear scaling rule. For example, if you only have 4 GPUs, set NUM_GPUS to 4, scale SOLVER.BASE_LR down by 0.5x, and multiply SOLVER.STEPS and SOLVER.MAX_ITER by 2x.
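
As a concrete sketch, these overrides can also be passed on the tools/train_net.py command line instead of editing the yaml files (Detectron merges trailing KEY VALUE pairs into the config). The schedule numbers below are illustrative only; check each yaml for its actual 8-GPU values before halving the base LR and doubling the steps and total iterations:

python2 tools/train_net.py \
    --cfg configs/bbox2mask_coco/voc2nonvoc/eval_e2e/e2e_baseline.yaml \
    NUM_GPUS 4 \
    SOLVER.BASE_LR 0.01 \
    SOLVER.STEPS "[0, 240000, 320000]" \
    SOLVER.MAX_ITER 360000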

Part 1: Controlled Experiments on the COCO dataset

In this work, we explore our approach in two settings. First, we use the COCO dataset to simulate the partially supervised instance segmentation task as a means of establishing quantitative results on a dataset with high-quality annotations and evaluation metrics. Specifically, we split the full set of COCO categories into a subset with mask annotations and a complementary subset for which the system has access to only bounding box annotations. Because the COCO dataset involves only a small number (80) of semantically well-separated classes, quantitative evaluation is precise and reliable.

In our experiments, we split COCO into either

  • VOC Split: 20 PASCAL-VOC classes vs. 60 non-PASCAL-VOC classes. We experiment with 1) VOC -> non-VOC, where set A={VOC}, and 2) non-VOC -> VOC, where set A={non-VOC}.
  • Random Splits: the 80 COCO classes randomly partitioned into two subsets A and B.

and experiment with two training setups:

  • Stage-wise training, where a Faster R-CNN detector is first trained and kept frozen, and the mask branch (including the weight transfer function) is then added.
  • End-to-end training, where the RPN, the box head, the mask head, and the weight transfer function are trained together.

Please refer to Section 4 of our paper for details on the COCO experiments.

COCO Installation: To run the COCO experiments, first download the COCO dataset and install it according to the dataset guide.
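
For orientation, here is a sketch of the data layout the COCO experiments expect (the dataset guide is authoritative; the symlink and file names below follow Detectron's standard conventions and should be verified against it):

lib/datasets/data/coco/
    coco_train2014/    # symlink to the train2014 image directory
    coco_val2014/      # symlink to the val2014 image directory
    annotations/       # instances_train2014.json, instances_val2014.json,
                       # instances_minival2014.json, instances_valminusminival2014.json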

Evaluation

The following experiments correspond to the results in Section 4.2 and Table 2 of our paper.

To run the experiments:

  1. Split the COCO dataset into VOC / non-VOC classes:
    python2 lib/datasets/bbox2mask_dataset_processing/coco/split_coco_dataset_voc_nonvoc.py
  2. Set the training split using the SPLIT variable:
  • To train on VOC -> non-VOC, where set A={VOC}, use export SPLIT=voc2nonvoc.
  • To train on non-VOC -> VOC, where set A={non-VOC}, use export SPLIT=nonvoc2voc.

Then use tools/train_net.py to run the following yaml config files for each experiment, with either the ResNet-50-FPN or the ResNet-101-FPN backbone.

Please follow the instructions in GETTING_STARTED.md to train with the config files. The training scripts automatically test the trained models and print the bbox and mask APs on the VOC ('coco_split_voc_2014_minival') and non-VOC ('coco_split_nonvoc_2014_minival') splits.
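
For example, following the GETTING_STARTED.md conventions, a typical launch for the MaskX R-CNN config below (config 2 under ResNet-50-FPN; the OUTPUT_DIR path is an arbitrary example) looks like:

export SPLIT=voc2nonvoc
python2 tools/train_net.py \
    --multi-gpu-testing \
    --cfg configs/bbox2mask_coco/${SPLIT}/eval_e2e/e2e_clsbox_2_layer_mlp_nograd.yaml \
    OUTPUT_DIR /tmp/seg_every_thing/${SPLIT}_e2e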

Using ResNet-50-FPN backbone:

  1. Class-agnostic (baseline): configs/bbox2mask_coco/${SPLIT}/eval_e2e/e2e_baseline.yaml
  2. MaskX R-CNN (ours, transfer+MLP): configs/bbox2mask_coco/${SPLIT}/eval_e2e/e2e_clsbox_2_layer_mlp_nograd.yaml
  3. Fully-supervised (oracle): configs/bbox2mask_coco/oracle/e2e_mask_rcnn_R-50-FPN_1x.yaml

Using ResNet-101-FPN backbone:

  1. Class-agnostic (baseline): configs/bbox2mask_coco/${SPLIT}/eval_e2e_R101/e2e_baseline.yaml
  2. MaskX R-CNN (ours, transfer+MLP): configs/bbox2mask_coco/${SPLIT}/eval_e2e_R101/e2e_clsbox_2_layer_mlp_nograd.yaml
  3. Fully-supervised (oracle): configs/bbox2mask_coco/oracle/e2e_mask_rcnn_R-101-FPN_1x.yaml

Ablation Study

This section runs ablation studies on the VOC Split (20 PASCAL-VOC classes vs. 60 non-PASCAL-VOC classes) using the ResNet-50-FPN backbone. The results correspond to Section 4.1 and Table 1 of our paper.

To run the experiments:

  1. (If you haven't done so in the above section) Split the COCO dataset into VOC / non-VOC classes:
    python2 lib/datasets/bbox2mask_dataset_processing/coco/split_coco_dataset_voc_nonvoc.py
  2. For Study 1, 2, 3 and 5, download the pre-trained Faster R-CNN model with ResNet-50-FPN by running
    bash lib/datasets/data/trained_models/fetch_coco_faster_rcnn_model.sh
    (Alternatively, you can train it yourself using configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_1x.yaml and copy it to lib/datasets/data/trained_models/28594643_model_final.pkl.)
  3. For Study 1, add the GloVe and random embeddings of the COCO class names to the Faster R-CNN weights with
    python2 lib/datasets/bbox2mask_dataset_processing/coco/add_embeddings_to_weights.py
  4. Set the training split using the SPLIT variable:
  • To train on VOC -> non-VOC, where set A={VOC}, use export SPLIT=voc2nonvoc.
  • To train on non-VOC -> VOC, where set A={non-VOC}, use export SPLIT=nonvoc2voc.

Then use tools/train_net.py to run the following yaml config files for each experiment.

Study 1: Ablation on the input to the weight transfer function (Table 1a)

  • transfer w/ randn: configs/bbox2mask_coco/${SPLIT}/ablation_input/randn_2_layer.yaml
  • transfer w/ GloVe: configs/bbox2mask_coco/${SPLIT}/ablation_input/glove_2_layer.yaml
  • transfer w/ cls: configs/bbox2mask_coco/${SPLIT}/ablation_input/cls_2_layer.yaml
  • transfer w/ box: configs/bbox2mask_coco/${SPLIT}/ablation_input/box_2_layer.yaml
  • transfer w/ cls+box: configs/bbox2mask_coco/${SPLIT}/eval_sw/clsbox_2_layer.yaml
  • class-agnostic (baseline): configs/bbox2mask_coco/${SPLIT}/eval_sw/baseline.yaml
  • fully supervised (oracle): configs/bbox2mask_coco/oracle/mask_rcnn_frozen_features_R-50-FPN_1x.yaml

Study 2: Ablation on the structure of the weight transfer function (Table 1b)

  • transfer w/ 1-layer, none: configs/bbox2mask_coco/${SPLIT}/ablation_structure/clsbox_1_layer.yaml
  • transfer w/ 2-layer, ReLU: configs/bbox2mask_coco/${SPLIT}/ablation_structure/relu/clsbox_2_layer_relu.yaml
  • transfer w/ 2-layer, LeakyReLU: same as 'transfer w/ cls+box' in Study 1
  • transfer w/ 3-layer, ReLU: configs/bbox2mask_coco/${SPLIT}/ablation_structure/relu/clsbox_3_layer_relu.yaml
  • transfer w/ 3-layer, LeakyReLU: configs/bbox2mask_coco/${SPLIT}/ablation_structure/clsbox_3_layer.yaml

Study 3: Impact of the MLP mask branch (Table 1c)

  • class-agnostic: same as 'class-agnostic (baseline)' in Study 1
  • class-agnostic+MLP: configs/bbox2mask_coco/${SPLIT}/ablation_mlp/baseline_mlp.yaml
  • transfer: same as 'transfer w/ cls+box' in Study 1
  • transfer+MLP: configs/bbox2mask_coco/${SPLIT}/ablation_mlp/clsbox_2_layer_mlp.yaml

Study 4: Ablation on the training strategy (Table 1d)

  • class-agnostic + sw: same as 'class-agnostic (baseline)' in Study 1
  • transfer + sw: same as 'transfer w/ cls+box' in Study 1
  • class-agnostic + e2e: configs/bbox2mask_coco/${SPLIT}/eval_e2e/e2e_baseline.yaml
  • transfer + e2e: configs/bbox2mask_coco/${SPLIT}/ablation_e2e_stopgrad/e2e_clsbox_2_layer.yaml
  • transfer + e2e + stopgrad: configs/bbox2mask_coco/${SPLIT}/ablation_e2e_stopgrad/e2e_clsbox_2_layer_nograd.yaml

Study 5: Comparison of random A/B splits (Figure 3)

Note: this ablation study takes a HUGE amount of computation power. It consists of 50 training experiments (5 trials * 5 set-A class counts (20/30/40/50/60) * 2 settings (ours/baseline)), and each training experiment takes approximately 9 hours to complete on 8 GPUs.

Before running Study 5:

  1. Split the COCO dataset into random class splits (this may take a while):
    python2 lib/datasets/bbox2mask_dataset_processing/coco/split_coco_dataset_randsplits.py
  2. Set the training split using the SPLIT variable (e.g. export SPLIT=E1_A20B60). The split name has the format E%d_A%dB%d; for example, E1_A20B60 is trial No. 1 with 20 random classes in set A and 60 random classes in set B. There are 5 trials (E1 to E5), each with 20/30/40/50/60 random classes in set A (A20B60 to A60B20), yielding 25 splits altogether, from E1_A20B60 to E5_A60B20.

Then use tools/train_net.py to run the following yaml config files for each experiment.

  • class-agnostic (baseline): configs/bbox2mask_coco/randsplits/eval_sw/${SPLIT}_baseline.yaml
  • transfer w/ cls+box, 2-layer, LeakyReLU: configs/bbox2mask_coco/randsplits/eval_sw/${SPLIT}_clsbox_2_layer.yaml
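
A minimal sketch of scripting all 50 runs sequentially (the OUTPUT_DIR paths are arbitrary examples):

for TRIAL in E1 E2 E3 E4 E5; do
  for AB in A20B60 A30B50 A40B40 A50B30 A60B20; do
    export SPLIT=${TRIAL}_${AB}
    # baseline and transfer runs for this split
    python2 tools/train_net.py --multi-gpu-testing \
        --cfg configs/bbox2mask_coco/randsplits/eval_sw/${SPLIT}_baseline.yaml \
        OUTPUT_DIR /tmp/randsplits/${SPLIT}_baseline
    python2 tools/train_net.py --multi-gpu-testing \
        --cfg configs/bbox2mask_coco/randsplits/eval_sw/${SPLIT}_clsbox_2_layer.yaml \
        OUTPUT_DIR /tmp/randsplits/${SPLIT}_clsbox_2_layer
  done
done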

Part 2: Large-scale Instance Segmentation on the Visual Genome dataset

In our second setting, we train a large-scale instance segmentation model on 3000 categories using the Visual Genome (VG) dataset. Here, set A (w/ mask data) consists of the 80 COCO classes, while set B (w/o mask data, bbox only) consists of the remaining Visual Genome classes that are not in COCO.

Please refer to Section 5 of our paper for details on the Visual Genome experiments.

Inference

To run inference, download the pre-trained final model weights by running:
bash lib/datasets/data/trained_models/fetch_vg3k_final_model.sh
(Alternatively, you may train these weights yourself following the training section below.)

Then, use tools/infer_simple.py for prediction. Note: due to the large number of classes and the model loading overhead, prediction on the first image can take a while.

Using ResNet-50-FPN backbone:

python2 tools/infer_simple.py \
    --cfg configs/bbox2mask_vg/eval_sw/runtest_clsbox_2_layer_mlp_nograd.yaml \
    --output-dir /tmp/detectron-visualizations-vg3k \
    --image-ext jpg \
    --thresh 0.5 --use-vg3k \
    --wts lib/datasets/data/trained_models/33241332_model_final_coco2vg3k_seg.pkl \
    demo_vg3k

Using ResNet-101-FPN backbone:

python2 tools/infer_simple.py \
    --cfg configs/bbox2mask_vg/eval_sw_R101/runtest_clsbox_2_layer_mlp_nograd_R101.yaml \
    --output-dir /tmp/detectron-visualizations-vg3k-R101 \
    --image-ext jpg \
    --thresh 0.5 --use-vg3k \
    --wts lib/datasets/data/trained_models/33219850_model_final_coco2vg3k_seg.pkl \
    demo_vg3k

Training

Visual Genome Installation: To run the Visual Genome experiments, first download the Visual Genome dataset and install it according to the dataset guide. Then download the converted Visual Genome json dataset files (in COCO format) by running:
bash lib/datasets/data/vg3k_bbox2mask/fetch_vg3k_json.sh
(Alternatively, you may build the COCO-format json dataset files yourself using the scripts in lib/datasets/bbox2mask_dataset_processing/vg/.)

Here, we adopt the stage-wise training strategy mentioned in Section 5 of our paper. First, in Stage 1, a Faster R-CNN detector is trained on all 3k Visual Genome classes (set A+B). Then, in Stage 2, the mask branch (with the weight transfer function) is added and trained on the mask data of the 80 COCO classes (set A). Finally, the mask branch is applied to all 3k Visual Genome classes (set A+B).

Before training on the mask data of the 80 COCO classes (set A) in Stage 2, a "surgery" is done to convert the 3k VG detection weights to 80 COCO detection weights, so that the mask branch only predicts mask outputs of the 80 COCO classes (as the weight transfer function only takes as input 80 classes) to save GPU memory. After training, another "surgery" is done to convert the 80 COCO detection weights back to the 3k VG detection weights.

To run the experiments, use tools/train_net.py with the following yaml config files, with either the ResNet-50-FPN or the ResNet-101-FPN backbone.

Using ResNet-50-FPN backbone:

  1. Stage 1 (bbox training on 3k VG classes): run tools/train_net.py with configs/bbox2mask_vg/eval_sw/stage1_e2e_fast_rcnn_R-50-FPN_1x_1im.yaml
  2. Weights "surgery" 1: convert 3k VG detection weights to 80 COCO detection weights:
    python2 tools/vg3k_training/convert_coco_seg_to_vg3k.py --input_model /path/to/model_1.pkl --output_model /path/to/model_1_vg3k2coco_det.pkl
    where /path/to/model_1.pkl is the path to the final model trained in Stage 1 above.
  3. Stage 2 (mask training on 80 COCO classes): run tools/train_net.py with configs/bbox2mask_vg/eval_sw/stage2_cocomask_clsbox_2_layer_mlp_nograd.yaml
    IMPORTANT: when training Stage 2, set TRAIN.WEIGHTS to /path/to/model_1_vg3k2coco_det.pkl (the output of convert_coco_seg_to_vg3k.py) in tools/train_net.py.
  4. Weights "surgery" 2: convert 80 COCO detection weights back to 3k VG detection weights:
    python2 tools/vg3k_training/convert_vg3k_det_to_coco.py --input_model /path/to/model_2.pkl --output_model /path/to/model_2_coco2vg3k_seg.pkl
    where /path/to/model_2.pkl is the path to the final model trained in Stage 2 above. The output /path/to/model_2_coco2vg3k_seg.pkl can be used for VG 3k instance segmentation.

Using ResNet-101-FPN backbone:

  1. Stage 1 (bbox training on 3k VG classes): run tools/train_net.py with configs/bbox2mask_vg/eval_sw_R101/stage1_e2e_fast_rcnn_R-101-FPN_1x_1im.yaml
  2. Weights "surgery" 1: convert 3k VG detection weights to 80 COCO detection weights:
    python2 tools/vg3k_training/convert_coco_seg_to_vg3k.py --input_model /path/to/model_1.pkl --output_model /path/to/model_1_vg3k2coco_det.pkl
    where /path/to/model_1.pkl is the path to the final model trained in Stage 1 above.
  3. Stage 2 (mask training on 80 COCO classes): run tools/train_net.py with configs/bbox2mask_vg/eval_sw_R101/stage2_cocomask_clsbox_2_layer_mlp_nograd_R101.yaml
    IMPORTANT: when training Stage 2, set TRAIN.WEIGHTS to /path/to/model_1_vg3k2coco_det.pkl (the output of convert_coco_seg_to_vg3k.py) in tools/train_net.py.
  4. Weights "surgery" 2: convert 80 COCO detection weights back to 3k VG detection weights:
    python2 tools/vg3k_training/convert_vg3k_det_to_coco.py --input_model /path/to/model_2.pkl --output_model /path/to/model_2_coco2vg3k_seg.pkl
    where /path/to/model_2.pkl is the path to the final model trained in Stage 2 above. The output /path/to/model_2_coco2vg3k_seg.pkl can be used for VG 3k instance segmentation.

(Alternatively, you may skip Stage 1 and weights "surgery" 1 by directly downloading the pre-trained VG 3k detection weights with bash lib/datasets/data/trained_models/fetch_vg3k_faster_rcnn_model.sh, and leaving TRAIN.WEIGHTS at the values specified in the Stage 2 yaml configs, as sketched below.)
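
Putting that shortcut together, a sketch for the ResNet-50-FPN case (/path/to/model_2.pkl is a placeholder for the Stage 2 output, as above):

# Fetch the pre-trained VG 3k detection weights (replaces Stage 1 and surgery 1)
bash lib/datasets/data/trained_models/fetch_vg3k_faster_rcnn_model.sh
# Stage 2: mask training on the 80 COCO classes
python2 tools/train_net.py --multi-gpu-testing \
    --cfg configs/bbox2mask_vg/eval_sw/stage2_cocomask_clsbox_2_layer_mlp_nograd.yaml
# Surgery 2: convert the 80 COCO detection weights back to 3k VG classes
python2 tools/vg3k_training/convert_vg3k_det_to_coco.py \
    --input_model /path/to/model_2.pkl \
    --output_model /path/to/model_2_coco2vg3k_seg.pkl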

Comments
  • KeyError: u'Non-existent config key: MRCNN.BBOX2MASK'

    I was trying to test the inference section. My machine runs Ubuntu 16.04 with a 1080 Ti (CUDA 9.0 + cuDNN 7.1). The installation of Caffe2 and Detectron both seemed to work (no errors in the tests, except some warnings). Then I ran

    python2 tools/infer_simple.py \
        --cfg configs/bbox2mask_vg/eval_sw/runtest_clsbox_2_layer_mlp_nograd.yaml \
        --output-dir /tmp/detectron-visualizations-vg3k \
        --image-ext jpg \
        --thresh 0.5 --use-vg3k \
        --wts lib/datasets/data/trained_models/33241332_model_final_coco2vg3k_seg.pkl \
        demo_vg3k
    

    It aborted with the error KeyError: u'Non-existent config key: MRCNN.BBOX2MASK'. The full log follows.

    /home/meng/.local/lib/python2.7/site-packages/scipy/sparse/lil.py:19: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
      from . import _csparsetools
    /home/meng/.local/lib/python2.7/site-packages/scipy/sparse/csgraph/__init__.py:165: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
      from ._shortest_path import shortest_path, floyd_warshall, dijkstra,\
    /home/meng/.local/lib/python2.7/site-packages/scipy/sparse/csgraph/_validation.py:5: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
      from ._tools import csgraph_to_dense, csgraph_from_dense,\
    /home/meng/.local/lib/python2.7/site-packages/scipy/sparse/csgraph/__init__.py:167: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
      from ._traversal import breadth_first_order, depth_first_order, \
    /home/meng/.local/lib/python2.7/site-packages/scipy/sparse/csgraph/__init__.py:169: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
      from ._min_spanning_tree import minimum_spanning_tree
    /home/meng/.local/lib/python2.7/site-packages/scipy/sparse/csgraph/__init__.py:170: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
      from ._reordering import reverse_cuthill_mckee, maximum_bipartite_matching, \
    Found Detectron ops lib: /usr/local/lib/libcaffe2_detectron_ops_gpu.so
    E0802 21:35:37.893579 20333 init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0802 21:35:37.893594 20333 init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0802 21:35:37.893596 20333 init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    Traceback (most recent call last):
      File "tools/infer_simple.py", line 162, in <module>
        main(args)
      File "tools/infer_simple.py", line 108, in main
        merge_cfg_from_file(args.cfg)
      File "/home/meng/lab-mill/Detectron/lib/core/config.py", line 1091, in merge_cfg_from_file
        _merge_a_into_b(yaml_cfg, __C)
      File "/home/meng/lab-mill/Detectron/lib/core/config.py", line 1149, in _merge_a_into_b
        _merge_a_into_b(v, b[k], stack=stack_push)
      File "/home/meng/lab-mill/Detectron/lib/core/config.py", line 1139, in _merge_a_into_b
        raise KeyError('Non-existent config key: {}'.format(full_key))
    KeyError: u'Non-existent config key: MRCNN.BBOX2MASK'
    

    I looked up some similar errors online, which point to an "indentation mismatch", but that does not seem to be the case here. Does anyone know how to solve this, and does the warning information matter? Thanks.

    opened by mengyuest 2
  • GPU out of memory during inference

    Hi, I have been trying to run the inference code for several days. However, when I run it with ResNet-101-FPN, it shows GPU out of memory after inferring several images. What I don't get is that the number of inferred images changes with the total number of images in the input dataset. For example, when I tried to infer my whole dataset of 4K+ images, it inferred around 44 images; then, when I tried with just those 44 images, it only inferred 18. I am still a beginner; help is much appreciated. I also tried to run the code with ResNet-50-FPN, which shows me the following error: self.append(rep.decode("string-escape")) ValueError: invalid \x escape. I read that this is because there is a '\x' character. Is there any way I could remove it?

    For 50 FPN:

    python2 tools/infer_simple.py \
        --cfg configs/bbox2mask_vg/eval_sw/runtest_clsbox_2_layer_mlp_nograd.yaml \
        --output-dir /tmp/detectron-visualizations-vg3k \
        --image-ext jpg \
        --thresh 0.5 --use-vg3k \
        --wts lib/datasets/data/trained_models/33241332_model_final_coco2vg3k_seg.pkl \
        demo_vg3k

    For 101 FPN:

    python2 tools/infer_simple.py \
        --cfg configs/bbox2mask_vg/eval_sw_R101/runtest_clsbox_2_layer_mlp_nograd_R101.yaml \
        --output-dir /tmp/detectron-visualizations-vg3k-R101 \
        --image-ext jpg \
        --thresh 0.5 --use-vg3k \
        --wts lib/datasets/data/trained_models/33219850_model_final_coco2vg3k_seg.pkl \
        demo_vg3k

    System information

    • Operating system: Ubuntu 18.04
    • CUDA version: 10
    • cuDNN version: 7.4.1
    • NVIDIA driver version: 410
    • GPU models (for all devices if they are not all the same): Nvidia GeForce GTX 1060
    • python --version output: 2.7
    opened by rupaai 1
  • COCO-aligned annotations for VisualGenome

    Dear authors,

    I am trying to download the COCO-aligned annotations for VisualGenome from the provided link by running lib/datasets/data/vg3k_bbox2mask/fetch_vg3k_json.sh, but I'm getting a 404 error.

    Could you please re-upload the annotation files or provide some other reference?

    opened by vlfom 0
  • Error while running inference for Part 2

    I am trying to run inference for Part 2 of your code, on Google Colab. I followed all the steps and installed Caffe2 and Detectron, and everything runs fine. I ran:

    python2 tools/infer_simple.py \
        --cfg configs/bbox2mask_vg/eval_sw/runtest_clsbox_2_layer_mlp_nograd.yaml \
        --output-dir /tmp/detectron-visualizations-vg3k \
        --image-ext jpg \
        --thresh 0.5 --use-vg3k \
        --wts lib/datasets/data/trained_models/33241332_model_final_coco2vg3k_seg.pkl \
        demo_vg3k

    The code aborted with the error: KeyError: u'Non-existent config key: MRCNN.BBOX2MASK'

    The logging information is:

    Found Detectron ops lib: /usr/local/lib/python2.7/dist-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
    [E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    Traceback (most recent call last):
      File "tools/infer_simple.py", line 162, in <module>
        main(args)
      File "tools/infer_simple.py", line 108, in main
        merge_cfg_from_file(args.cfg)
      File "/content/drive/My Drive/seg_every_thing/detectron/detectron/core/config.py", line 1152, in merge_cfg_from_file
        _merge_a_into_b(yaml_cfg, __C)
      File "/content/drive/My Drive/seg_every_thing/detectron/detectron/core/config.py", line 1212, in _merge_a_into_b
        _merge_a_into_b(v, b[k], stack=stack_push)
      File "/content/drive/My Drive/seg_every_thing/detectron/detectron/core/config.py", line 1202, in _merge_a_into_b
        raise KeyError('Non-existent config key: {}'.format(full_key))
    KeyError: u'Non-existent config key: MRCNN.BBOX2MASK'

    I was looking into this error and came across the same error in another issue on your page, https://github.com/ronghanghu/seg_every_thing/issues/2, which has been closed. I read the solution there. However, I am guessing my problem is due to the new version of Detectron: the currently available Detectron repository no longer contains the lib folder. I have not gone through the code thoroughly, so please help me solve this. Thanks in advance.

    • PYTHONPATH environment variable: /env/python:/content/drive/My Drive/seg_every_thing/detectron/detectron

    • python --version output: 2.7

    opened by ashishverma03 0
  • Dockerfile error

    I am getting the following issue when I try to run this command:

    nvidia-docker run --rm -it 5de59a4a77bd python2 tools/train_net.py \
        --multi-gpu-testing \
        --cfg configs/getting_started/tutorial_2gpu_e2e_faster_rcnn_R-50-FPN.yaml \
        OUTPUT_DIR /tmp/detectron-output

    ERROR:

    satish@vader:~/Revision/DETECTRON/seg_every_thing$ nvidia-docker run --rm -it 5de59a4a77bd python2 tools/train_net.py --multi-gpu-testing --cfg configs/getting_started/tutorial_2gpu_e2e_faster_rcnn_R-50-FPN.yaml OUTPUT_DIR /tmp/detectron-output
    Found Detectron ops lib: /usr/local/caffe2_build/lib/libcaffe2_detectron_ops_gpu.so
    E0929 00:50:01.726465 1 init_intrinsics_check.cc:54] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0929 00:50:01.726497 1 init_intrinsics_check.cc:54] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0929 00:50:01.726505 1 init_intrinsics_check.cc:54] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    INFO train_net.py: 95: Called with args:
    INFO train_net.py: 96: Namespace(cfg_file='configs/getting_started/tutorial_2gpu_e2e_faster_rcnn_R-50-FPN.yaml', multi_gpu_testing=True, opts=['OUTPUT_DIR', '/tmp/detectron-output'], skip_test=False)
    /detectron/lib/core/config.py:1090: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
      yaml_cfg = AttrDict(yaml.load(f))
    INFO io.py: 67: Downloading remote file https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl to /tmp/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl
    Traceback (most recent call last):
      File "tools/train_net.py", line 128, in <module>
        main()
      File "tools/train_net.py", line 101, in main
        assert_and_infer_cfg()
      File "/detectron/lib/core/config.py", line 1054, in assert_and_infer_cfg
        cache_cfg_urls()
      File "/detectron/lib/core/config.py", line 1063, in cache_cfg_urls
        __C.TRAIN.WEIGHTS = cache_url(__C.TRAIN.WEIGHTS, __C.DOWNLOAD_CACHE)
      File "/detectron/lib/utils/io.py", line 68, in cache_url
        download_url(url, cache_file_path)
      File "/detectron/lib/utils/io.py", line 114, in download_url
        response = urllib2.urlopen(url)
      File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 435, in open
        response = meth(req, response)
      File "/usr/lib/python2.7/urllib2.py", line 548, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.7/urllib2.py", line 473, in error
        return self._call_chain(*args)
      File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 301: Moved Permanently

    I even changed the URL in lib/utils/io.py from https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl to https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl,

    but Detectron is still somehow pulling the cached URL.

    Then, to fix this, I pulled the latest Detectron GitHub repo with the updated URL. But then, as mentioned in the other issue,

    "RUN make ops" fails in the Dockerfile with the error below. Can you please suggest a fix or workaround?

    Step 13/13 : RUN make ops
     ---> Running in cfe37fc1b57b
    mkdir -p build && cd build && cmake .. && make -j28
    -- The C compiler identification is GNU 5.4.0
    -- The CXX compiler identification is GNU 5.4.0
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Cannot find gflags with config files. Using legacy find.
    CMake Warning at /usr/local/caffe2_build/share/cmake/Caffe2/public/gflags.cmake:2 (find_package):
      By not providing "Findgflags.cmake" in CMAKE_MODULE_PATH this project has
      asked CMake to find a package configuration file provided by "gflags",
      but CMake did not find one.

      Could not find a package configuration file provided by "gflags" with
      any of the following names:

        gflagsConfig.cmake
        gflags-config.cmake

      Add the installation prefix of "gflags" to CMAKE_PREFIX_PATH or set
      "gflags_DIR" to a directory containing one of the above files. If
      "gflags" provides a separate development package or SDK, be sure it has
      been installed.
    Call Stack (most recent call first):
      /usr/local/caffe2_build/share/cmake/Caffe2/Caffe2Config.cmake:16 (include)
      CMakeLists.txt:8 (find_package)

    -- Found gflags: /usr/include
    -- Found gflags (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libgflags.so)
    -- Cannot find glog. Using legacy find.
    CMake Warning at /usr/local/caffe2_build/share/cmake/Caffe2/public/glog.cmake:2 (find_package):
      By not providing "Findglog.cmake" in CMAKE_MODULE_PATH this project has
      asked CMake to find a package configuration file provided by "glog",
      but CMake did not find one.

      Could not find a package configuration file provided by "glog" with
      any of the following names:

        glogConfig.cmake
        glog-config.cmake

      Add the installation prefix of "glog" to CMAKE_PREFIX_PATH or set
      "glog_DIR" to a directory containing one of the above files. If "glog"
      provides a separate development package or SDK, be sure it has been
      installed.
    Call Stack (most recent call first):
      /usr/local/caffe2_build/share/cmake/Caffe2/Caffe2Config.cmake:30 (include)
      CMakeLists.txt:8 (find_package)

    -- Found glog: /usr/include
    CMake Warning at CMakeLists.txt:13 (message):
      You are using an older version of Caffe2 (version 0.8.1). Please
      consider moving to a newer version.

    -- Looking for pthread.h
    -- Looking for pthread.h - found
    -- Looking for pthread_create
    -- Looking for pthread_create - not found
    -- Looking for pthread_create in pthreads
    -- Looking for pthread_create in pthreads - not found
    -- Looking for pthread_create in pthread
    -- Looking for pthread_create in pthread - found
    -- Found Threads: TRUE
    -- CUDA detected: 9.0
    -- Added CUDA NVCC flags for: sm_30 sm_35 sm_50 sm_52 sm_60 sm_61 sm_70
    -- Found libcuda: /usr/local/cuda/lib64/stubs/libcuda.so
    -- Found libnvrtc: /usr/local/cuda/lib64/libnvrtc.so
    -- Found CUDNN: /usr/include
    -- Found cuDNN: v7.0.5 (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
    -- Summary:
    --   CMake version         : 3.5.1
    --   CMake command         : /usr/bin/cmake
    --   System name           : Linux
    --   C++ compiler          : /usr/bin/c++
    --   C++ compiler version  : 5.4.0
    --   CXX flags             : -std=c++11 -O2 -fPIC -Wno-narrowing
    --   Caffe2 version        : 0.8.1
    --   Caffe2 include path   : /usr/local/caffe2_build/include
    --   Have CUDA             : TRUE
    --     CUDA version        : 9.0
    --     CuDNN version       : 7.0.5
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /detectron/build
    make[1]: Entering directory '/detectron/build'
    [ 20%] Building NVCC (Device) object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o
    Scanning dependencies of target caffe2_detectron_custom_ops
    [ 40%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o
    /detectron/detectron/ops/zero_even_op.cc: In member function 'bool caffe2::ZeroEvenOp<T, Context>::RunOnDevice() [with T = float; Context = caffe2::CPUContext]':
    /detectron/detectron/ops/zero_even_op.cc:25:23: error: no matching function for call to 'caffe2::Tensor<caffe2::CPUContext>::dim() const'
       CAFFE_ENFORCE(X.dim() == 1);
    /usr/local/caffe2_build/include/caffe2/core/tensor.h:687:17: note: candidate: caffe2::TIndex caffe2::Tensor<Context>::dim(int) const [with Context = caffe2::CPUContext; caffe2::TIndex = long int]
       inline TIndex dim(const int i) const {
    /usr/local/caffe2_build/include/caffe2/core/tensor.h:687:17: note: candidate expects 1 argument, 0 provided
    /detectron/detectron/ops/zero_even_op.cc:33:27: error: 'class caffe2::Tensor<caffe2::CPUContext>' has no member named 'numel'
       for (auto i = 0; i < Y->numel(); i += 2) {
    make[3]: *** [CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o] Error 1
    make[2]: *** [CMakeFiles/caffe2_detectron_custom_ops.dir/all] Error 2
    make[2]: *** Waiting for unfinished jobs....
    (the same two compile errors are then repeated for the caffe2_detectron_custom_ops_gpu target)
    make[1]: *** [all] Error 2
    make: *** [ops] Error 2
    The command '/bin/sh -c make ops' returned a non-zero code: 2

    opened by satish1901 0
  • Does not run infer_simple.py

    Expected results

    What did you expect to see? I expected some output related to the first run of the demo.

    Actual results

    What did you observe instead?

    Found Detectron ops lib: /usr/local/caffe2_build/lib/libcaffe2_detectron_ops_gpu.so
    E0607 14:02:35.360527   260 init_intrinsics_check.cc:54] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0607 14:02:35.360548   260 init_intrinsics_check.cc:54] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0607 14:02:35.360565   260 init_intrinsics_check.cc:54] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    /detectron/lib/core/config.py:1090: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
      yaml_cfg = AttrDict(yaml.load(f))
    INFO io.py:  67: Downloading remote file https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl to /tmp/detectron-download-cache/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl
    Traceback (most recent call last):
      File "tools/infer_simple.py", line 148, in <module>
        main(args)
      File "tools/infer_simple.py", line 98, in main
        args.weights = cache_url(args.weights, cfg.DOWNLOAD_CACHE)
      File "/detectron/lib/utils/io.py", line 68, in cache_url
        download_url(url, cache_file_path)
      File "/detectron/lib/utils/io.py", line 114, in download_url
        response = urllib2.urlopen(url)
      File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 435, in open
        response = meth(req, response)
      File "/usr/lib/python2.7/urllib2.py", line 548, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.7/urllib2.py", line 473, in error
        return self._call_chain(*args)
      File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 301: Moved Permanently
    
    

    Detailed steps to reproduce

    python2 tools/infer_simple.py \
    >     --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    >     --output-dir /tmp/detectron-visualizations \
    >     --image-ext jpg \
    >     --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    >     demo
    

    My version is from the Docker images that you provide, taking into consideration that I had to modify it to remove an error. Everything is the same except:

    # Clone the Detectron repository

    RUN git clone https://github.com/facebookresearch/detectron /detectron \
        && cd /detectron \
        && git reset --hard 3c4c7f67d37eeb4ab15a87034003980a1d259c94

    System information

    • Operating system: Linux
    • Compiler version: ?
    • CUDA version: 9.0
    • cuDNN version: ?
    • NVIDIA driver version: ?
    • GPU models (for all devices if they are not all the same): ?
    • PYTHONPATH environment variable: ?
    • python --version output: 2.7.12
    • Anything else that seems relevant: ?
    opened by lexdyel-mendez 0
  • KeyError: u'Non-existent config key: BODY_UV_RCNN'

    Hi guys, I have some trouble debugging the following error. I tried some ways to fix it, such as checking the Detectron PYTHONPATH, but it still doesn't work for me. Can someone help me find out what the problem is?

    (caffe2_env) (/data1/wdyli/miniconda3) [wdyli@TENCENT64 ~/densepose]$ python tools/infer_simple.py     --cfg configs/DensePose_ResNet101_FPN_s1x-e2e.yaml     --output-dir /data1/wdyli/densepose/DensePoseData/sample_output/     --image-ext jpg     --wts https://dl.fbaipublicfiles.com/densepose/DensePose_ResNet101_FPN_s1x-e2e.pkl     /data1/wdyli/densepose/DensePoseData/demo_data/demo_im.jpg
    Found Detectron ops lib: /data1/wdyli/miniconda2/lib/python2.7/site-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
    [E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    Traceback (most recent call last):
      File "tools/infer_simple.py", line 140, in <module>
        main(args)
      File "tools/infer_simple.py", line 87, in main
        merge_cfg_from_file(args.cfg)
      File "/data1/wdyli/densepose/detectron/detectron/core/config.py", line 1135, in merge_cfg_from_file
        _merge_a_into_b(yaml_cfg, __C)
      File "/data1/wdyli/densepose/detectron/detectron/core/config.py", line 1185, in _merge_a_into_b
        raise KeyError('Non-existent config key: {}'.format(full_key))
    KeyError: u'Non-existent config key: BODY_UV_RCNN'
    
    
    opened by tiandiao123 1