AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level recognition tasks on top of Detectron2. All instance-level recognition works from our group are open-sourced here.

To date, AdelaiDet implements the following algorithms (see the citations below):

  • FCOS
  • BlendMask
  • MEInst
  • ABCNet and ABCNet v2
  • CondInst
  • BoxInst
  • SOLO and SOLOv2
  • DirectPose
  • DenseCL
  • FCPose

Models

COCO Object Detection Baselines with FCOS

| Name | inf. time | box AP | download |
|:---|:---:|:---:|:---:|
| FCOS_R_50_1x | 16 FPS | 38.7 | model |
| FCOS_MS_R_101_2x | 12 FPS | 43.1 | model |
| FCOS_MS_X_101_32x8d_2x | 6.6 FPS | 43.9 | model |
| FCOS_MS_X_101_32x8d_dcnv2_2x | 4.6 FPS | 46.6 | model |
| FCOS_RT_MS_DLA_34_4x_shtw | 52 FPS | 39.1 | model |

More models can be found in FCOS README.md.

COCO Instance Segmentation Baselines with BlendMask

| Model | Name | inf. time | box AP | mask AP | download |
|:---|:---|:---:|:---:|:---:|:---:|
| Mask R-CNN | R_101_3x | 10 FPS | 42.9 | 38.6 | |
| BlendMask | R_101_3x | 11 FPS | 44.8 | 39.5 | model |
| BlendMask | R_101_dcni3_5x | 10 FPS | 46.8 | 41.1 | model |

For more models and information, please refer to BlendMask README.md.

COCO Instance Segmentation Baselines with MEInst

| Name | inf. time | box AP | mask AP | download |
|:---|:---:|:---:|:---:|:---:|
| MEInst_R_50_3x | 12 FPS | 43.6 | 34.5 | model |

For more models and information, please refer to MEInst README.md.

Total_Text results with ABCNet

| Name | inf. time | e2e-hmean | det-hmean | download |
|:---|:---:|:---:|:---:|:---:|
| v1-totaltext | 11 FPS | 67.1 | 86.0 | model |
| v2-totaltext | 7.7 FPS | 71.8 | 87.2 | model |

For more models and information, please refer to ABCNet README.md.

COCO Instance Segmentation Baselines with CondInst

| Name | inf. time | box AP | mask AP | download |
|:---|:---:|:---:|:---:|:---:|
| CondInst_MS_R_50_1x | 14 FPS | 39.7 | 35.7 | model |
| CondInst_MS_R_50_BiFPN_3x_sem | 13 FPS | 44.7 | 39.4 | model |
| CondInst_MS_R_101_3x | 11 FPS | 43.3 | 38.6 | model |
| CondInst_MS_R_101_BiFPN_3x_sem | 10 FPS | 45.7 | 40.2 | model |

For more models and information, please refer to CondInst README.md.

Note that:

  • Inference time for all projects is measured on an NVIDIA 1080Ti with batch size 1.
  • APs are evaluated on the COCO 2017 validation split unless otherwise specified.

Installation

First install Detectron2 following the official guide: INSTALL.md.

Please use Detectron2 with commit id 9eb4831 if you have any issues related to Detectron2.

Then build AdelaiDet with:

git clone https://github.com/aim-uofa/AdelaiDet.git
cd AdelaiDet
python setup.py build develop
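
To verify that the build succeeded, a quick import check can be run (this one-liner is our suggestion, not part of the official guide; adet is the Python package the build installs):

python -c "from adet.config import get_cfg; get_cfg()"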

If you are using docker, a pre-built image can be pulled with:

docker pull tianzhi0549/adet:latest

Some projects may require a special setup; please follow each project's own README.md in configs.

Quick Start

Inference with Pre-trained Models

  1. Pick a model and its config file, for example, fcos_R_50_1x.yaml.
  2. Download the model with
wget https://cloudstor.aarnet.edu.au/plus/s/glqFc13cCoEyHYy/download -O fcos_R_50_1x.pth
  3. Run the demo with
python demo/demo.py \
    --config-file configs/FCOS-Detection/R_50_1x.yaml \
    --input input1.jpg input2.jpg \
    --opts MODEL.WEIGHTS fcos_R_50_1x.pth
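
The same checkpoint can also be used programmatically. A minimal sketch, assuming the config and weights from the steps above (DefaultPredictor is Detectron2's standard single-image predictor; adet.config.get_cfg is the extended config that demo.py itself uses):

import cv2
from detectron2.engine import DefaultPredictor
from adet.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("configs/FCOS-Detection/R_50_1x.yaml")
cfg.MODEL.WEIGHTS = "fcos_R_50_1x.pth"
predictor = DefaultPredictor(cfg)

image = cv2.imread("input1.jpg")              # BGR image, as DefaultPredictor expects
instances = predictor(image)["instances"]
print(instances.pred_boxes, instances.scores, instances.pred_classes)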

Train Your Own Models

To train a model with "train_net.py", first set up the corresponding datasets following datasets/README.md, then run:

OMP_NUM_THREADS=1 python tools/train_net.py \
    --config-file configs/FCOS-Detection/R_50_1x.yaml \
    --num-gpus 8 \
    OUTPUT_DIR training_dir/fcos_R_50_1x

To evaluate the model after training, run:

OMP_NUM_THREADS=1 python tools/train_net.py \
    --config-file configs/FCOS-Detection/R_50_1x.yaml \
    --eval-only \
    --num-gpus 8 \
    OUTPUT_DIR training_dir/fcos_R_50_1x \
    MODEL.WEIGHTS training_dir/fcos_R_50_1x/model_final.pth

Note that:

  • The configs are made for 8-GPU training. To train on a different number of GPUs, change --num-gpus.
  • If you want to measure the inference time, please change --num-gpus to 1.
  • We set OMP_NUM_THREADS=1 by default because it achieves the best speed on our machines; change it as needed.
  • This quick start is made for FCOS. If you are using other projects, please check the projects' own README.md in configs.

Acknowledgements

The authors are grateful to NVIDIA, Huawei Noah's Ark Lab, ByteDance, and Adobe, which generously donated GPU computing resources over the past few years.

Citing AdelaiDet

If you use this toolbox in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@misc{tian2019adelaidet,
  author =       {Tian, Zhi and Chen, Hao and Wang, Xinlong and Liu, Yuliang and Shen, Chunhua},
  title =        {{AdelaiDet}: A Toolbox for Instance-level Recognition Tasks},
  howpublished = {\url{https://git.io/adelaidet}},
  year =         {2019}
}

and relevant publications:

@inproceedings{tian2019fcos,
  title     =  {{FCOS}: Fully Convolutional One-Stage Object Detection},
  author    =  {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
  booktitle =  {Proc. Int. Conf. Computer Vision (ICCV)},
  year      =  {2019}
}

@article{tian2021fcos,
  title   =  {{FCOS}: A Simple and Strong Anchor-free Object Detector},
  author  =  {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
  journal =  {IEEE T. Pattern Analysis and Machine Intelligence (TPAMI)},
  year    =  {2021}
}

@inproceedings{chen2020blendmask,
  title     =  {{BlendMask}: Top-Down Meets Bottom-Up for Instance Segmentation},
  author    =  {Chen, Hao and Sun, Kunyang and Tian, Zhi and Shen, Chunhua and Huang, Yongming and Yan, Youliang},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =  {2020}
}

@inproceedings{zhang2020MEInst,
  title     =  {Mask Encoding for Single Shot Instance Segmentation},
  author    =  {Zhang, Rufeng and Tian, Zhi and Shen, Chunhua and You, Mingyu and Yan, Youliang},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =  {2020}
}

@inproceedings{liu2020abcnet,
  title     =  {{ABCNet}: Real-time Scene Text Spotting with Adaptive {B}ezier-Curve Network},
  author    =  {Liu, Yuliang and Chen, Hao and Shen, Chunhua and He, Tong and Jin, Lianwen and Wang, Liangwei},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =  {2020}
}

@article{liu2021abcnetv2,
  title   =  {{ABCNet} v2: Adaptive {B}ezier-Curve Network for Real-time End-to-end Text Spotting},
  author  =  {Liu, Yuliang and Shen, Chunhua and Jin, Lianwen and He, Tong and Chen, Peng and Liu, Chongyu and Chen, Hao},
  journal =  {IEEE T. Pattern Analysis and Machine Intelligence (TPAMI)},
  year    =  {2021},
  doi     =  {10.1109/TPAMI.2021.3107437}
}

@inproceedings{wang2020solo,
  title     =  {{SOLO}: Segmenting Objects by Locations},
  author    =  {Wang, Xinlong and Kong, Tao and Shen, Chunhua and Jiang, Yuning and Li, Lei},
  booktitle =  {Proc. Eur. Conf. Computer Vision (ECCV)},
  year      =  {2020}
}

@inproceedings{wang2020solov2,
  title     =  {{SOLOv2}: Dynamic and Fast Instance Segmentation},
  author    =  {Wang, Xinlong and Zhang, Rufeng and Kong, Tao and Li, Lei and Shen, Chunhua},
  booktitle =  {Proc. Advances in Neural Information Processing Systems (NeurIPS)},
  year      =  {2020}
}

@article{wang2021solo,
  title   =  {{SOLO}: A Simple Framework for Instance Segmentation},
  author  =  {Wang, Xinlong and Zhang, Rufeng and Shen, Chunhua and Kong, Tao and Li, Lei},
  journal =  {IEEE T. Pattern Analysis and Machine Intelligence (TPAMI)},
  year    =  {2021}
}

@article{tian2019directpose,
  title   =  {{DirectPose}: Direct End-to-End Multi-Person Pose Estimation},
  author  =  {Tian, Zhi and Chen, Hao and Shen, Chunhua},
  journal =  {arXiv preprint arXiv:1911.07451},
  year    =  {2019}
}

@inproceedings{tian2020conditional,
  title     =  {Conditional Convolutions for Instance Segmentation},
  author    =  {Tian, Zhi and Shen, Chunhua and Chen, Hao},
  booktitle =  {Proc. Eur. Conf. Computer Vision (ECCV)},
  year      =  {2020}
}

@inproceedings{tian2021boxinst,
  title     =  {{BoxInst}: High-Performance Instance Segmentation with Box Annotations},
  author    =  {Tian, Zhi and Shen, Chunhua and Wang, Xinlong and Chen, Hao},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =  {2021}
}

@inproceedings{wang2021densecl,
  title     =   {Dense Contrastive Learning for Self-Supervised Visual Pre-Training},
  author    =   {Wang, Xinlong and Zhang, Rufeng and Shen, Chunhua and Kong, Tao and Li, Lei},
  booktitle =   {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =   {2021}
}

@inproceedings{Mao2021pose,
  title     =   {{FCPose}: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions},
  author    =   {Mao, Weian and  Tian, Zhi  and Wang, Xinlong  and Shen, Chunhua},
  booktitle =   {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =   {2021}
}

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact Chunhua Shen.

Comments
  • Experimental Results of ABCNet on English and Chinese text datasets

    @Yuliang-Liu Hi, about the ABCNet experimental results on CTW1500 in your paper: "Because the occupation of Chinese text in this dataset is very small, we directly regard all the Chinese text as “unseen” class during training, i.e., the 96-th class." However, if the Chinese text in a dataset is not supposed to be ignored, we should enlarge CTLABELS instead of:

    CTLABELS = [' ','!','"','#','$','%','&','\'','(',')','*','+',',','-','.','/','0','1','2','3','4','5','6','7','8','9',':',';','<','=','>','?','@','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','[','\\',']','^','_','`','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','{','|','}','~']
    

    In that case, after enlarging CTLABELS, why can I still not recognize Chinese text in the dataset? Have I missed anything else?
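
    For reference, a minimal sketch of what enlarging the alphabet could involve (the config key below is our assumption, not confirmed by the repo): the recognition head's output size must grow with CTLABELS, and the pretrained recognizer weights only cover the original 96 symbols, so the head needs retraining.

    # hypothetical sketch: extend the alphabet, then resize the recognizer
    CTLABELS = [' ', '!', '"', '#']    # ... truncated; the full 96-symbol list is above
    CTLABELS += ['中', '文', '国']      # plus every Chinese character appearing in the data

    # the recognition branch must emit one logit per symbol plus the
    # unknown/EOS class, so a voc-size setting such as
    # cfg.MODEL.BATEXT.VOC_SIZE = len(CTLABELS)   # key name is an assumption
    # has to match, and the attention head must be retrained, since the
    # pretrained weights only cover the original 96 classes.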

    good first issue 
    opened by Eurus-Holmes 37
  • Attempt to Reproduce the Results of CondInst.

    Hi~ @tianzhi0549 I want to confirm the shared-head architecture of CondInst. Design A

                     --- conv --- conv --- conv --- conv --- cls_pred 
                    |       
                    |                                        --- ctr_pred 
                    |                                       |
    FPN features --- --- conv --- conv --- conv --- conv --- --- reg_pred 
                    |
                    |
                    |
                     --- conv --- conv --- conv --- conv --- controller_pred
    

    Design B

                     --- conv --- conv --- conv --- conv --- cls_pred 
                    |       
                    |                                        --- ctr_pred 
                    |                                       |
    FPN features --- --- conv --- conv --- conv --- conv --- --- reg_pred 
                                                            |
                                                             --- controller_pred
                    
    
    

    Which one is right? I found that Design B degrades box AP, and the mask AP is also very low. Here are my results for MS-R-50_1x.

    Box AP:

    | AP | AP50 | AP75 |
    |:------:|:------:|:------:|
    | 38.269 | 57.210 | 55.405 |

    Mask AP:

    | AP | AP50 | AP75 |
    |:------:|:------:|:------:|
    | 27.531 | 51.157 | 47.783 |

    The box AP should be higher than 39.5 for MS training (~39.5) plus multi-task training (+~1.0), so I think Design B is wrong. It is hard for one branch to handle three predictions, and the gradients from controller_pred degrade reg_pred.
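
    For concreteness, a minimal PyTorch sketch of Design A as drawn above (our reading of the diagram, not the official CondInst code; channel and parameter counts are illustrative):

    import torch
    from torch import nn

    def tower(channels: int, num_convs: int = 4) -> nn.Sequential:
        """A stack of 3x3 conv + GN + ReLU blocks, as in FCOS-style heads."""
        layers = []
        for _ in range(num_convs):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.GroupNorm(32, channels),
                       nn.ReLU(inplace=True)]
        return nn.Sequential(*layers)

    class DesignAHead(nn.Module):
        """cls on its own tower; ctr/reg share a tower; controller has a third."""
        def __init__(self, channels=256, num_classes=80, num_gen_params=169):
            super().__init__()
            self.cls_tower = tower(channels)
            self.bbox_tower = tower(channels)
            self.ctrl_tower = tower(channels)      # dedicated controller branch
            self.cls_pred = nn.Conv2d(channels, num_classes, 3, padding=1)
            self.ctr_pred = nn.Conv2d(channels, 1, 3, padding=1)
            self.reg_pred = nn.Conv2d(channels, 4, 3, padding=1)
            # 169 = number of dynamic mask-head parameters in the CondInst paper
            self.controller_pred = nn.Conv2d(channels, num_gen_params, 3, padding=1)

        def forward(self, feature: torch.Tensor):
            cls_feat = self.cls_tower(feature)
            box_feat = self.bbox_tower(feature)
            return (self.cls_pred(cls_feat),
                    self.ctr_pred(box_feat),
                    self.reg_pred(box_feat),
                    self.controller_pred(self.ctrl_tower(feature)))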

    opened by Yuxin-CV 16
  • ABCNet: Transposing Images and Log Messages!

    Hi,

    I was training on a custom dataset using ABCNet. The images come in varying orientations (horizontal, vertical, etc.), and perhaps for that reason I got the following output when training started:

    transposing image datasets/images/IMG_1.JPG
    transposing image datasets/images/IMG_2.JPG
    transposing image datasets/images/IMG_3.JPG
    transposing image datasets/images/IMG_4.JPG
    transposing image datasets/images/IMG_5.JPG
    [07/06 08:59:35 d2.utils.events]:  eta: 1:18:16  iter: 59  total_loss: 5.080  rec_loss: 1.508  loss_fcos_cls: 0.591  loss_fcos_loc: 0.556  loss_fcos_ctr: 0.649  loss_fcos_bezier: 1.466  time: 1.0022  data_time: 0.0081  lr: 0.000599  max_mem: 6881M
    transposing image datasets/images/IMG_6.JPG
    transposing image datasets/images/IMG_7.JPG
    transposing image datasets/images/IMG_8.JPG
    transposing image datasets/images/IMG_9.JPG
    transposing image datasets/images/IMG_10.JPG
    transposing image datasets/images/IMG_11.JPG
    [07/06 08:59:35 d2.utils.events]:  eta: 1:18:16  iter: 59  total_loss: 5.080  rec_loss: 1.508  loss_fcos_cls: 0.591  loss_fcos_loc: 0.556  loss_fcos_ctr: 0.649  loss_fcos_bezier: 1.466  time: 1.0022  data_time: 0.0081  lr: 0.000599  max_mem: 6881M
    

    Is this normal? I mean, why are the log messages printed this way? Does the program auto-rotate (transpose) the samples into a suitable input orientation?

    Issue 2

    That being said, regarding the log messages:

    total_loss
    rec_loss
    loss_fcos_cls
    loss_fcos_loc
    loss_fcos_ctr
    loss_fcos_bezier
    

    I understand the first one, rec_loss, is the recognition loss, and that total_loss is simply the sum of the recognition and detection losses. What about the others, and the naming convention (fcos)?

    Issue 3

    And sometimes, suddenly during training, the following message appears:

    AssertionError: The annotation bounding box is outside of the image!
    

    But I've checked the bezier_viz output and it looks good to me. What have I missed?

    Issue 4

    And while evaluating the model, as demonstrated here, there are the following files:

    |_ evaluation
    |  |_ gt_totaltext.zip
    |  |_ gt_ctw1500.zip
    

    in this case, do those contain polygonal annotations or Bezier annotations? I mean, if we unzip the above files, do we get txt-format annotation files? When I evaluate, I get the following:

    "E2E_RESULTS: 
    precision: 0.5654166666666667, 
    recall: 0.5048363095238095, 
    hmean: 0.5334119496855346"
    
    "DETECTION_ONLY_RESULTS: 
    precision: 0.88125, 
    recall: 0.7868303571428571, 
    hmean: 0.8313679245283019"
    

    I may have missed it, but is there any built-in function to plot the predictions on the samples? If so, could you please point me to the relevant code?

    To be honest, evaluating scene text recognition is more complex than usual for me. Could you point me to a document that explains scene-text-recognition evaluation protocols, especially exact word matching for evaluating text recognition, considering punctuation, etc.?

    Issue 5

    And how does ABCNet split the dataset into training and validation/testing parts? In the builtin.py file, we set the following:

    _PREDEFINED_SPLITS_TEXT = {
        "ctw1500_word_train": ("CTW1500/ctwtrain_train_image", "CTW1500/annotations/train.json"),
        "ctw1500_word_test": ("CTW1500/ctwtest_text_image","CTW1500/annotations/test.json"),
    }
    

    So, do we need to split our train and test sets manually? And was this test set used for validation or only for the test phase? Or did you train on synthetic samples, fine-tune on the Total-Text/CTW1500 train set, and finally evaluate on the Total-Text/CTW1500 test set?
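
    For what it's worth, a sketch of how a manual split could be wired up, following the exact pattern above (the mydataset paths are hypothetical):

    _PREDEFINED_SPLITS_TEXT = {
        "ctw1500_word_train": ("CTW1500/ctwtrain_train_image", "CTW1500/annotations/train.json"),
        "ctw1500_word_test": ("CTW1500/ctwtest_text_image", "CTW1500/annotations/test.json"),
        # a manually split custom dataset would follow the same shape:
        "mydataset_train": ("mydataset/train_img", "mydataset/annotations/train.json"),
        "mydataset_test": ("mydataset/test_img", "mydataset/annotations/test.json"),
    }

    The config's DATASETS.TRAIN and DATASETS.TEST entries would then reference these split names.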

    opened by innat 15
  • Why validate_clockwise_points?

    There is a function def validate_clockwise_points(points) in adet/evaluation/rrc_evaluation_funcs.py (quoted below with its dead, commented-out shoelace check trimmed, and the shapely imports it relies on made explicit):

    from shapely.geometry import LinearRing, Polygon

    def validate_clockwise_points(points):
        """
        Validates that the 4 points that delimit a polygon are in clockwise order.
        """
        pts = [(points[j], points[j + 1]) for j in range(0, len(points), 2)]
        try:
            pdet = Polygon(pts)
        except Exception:
            assert 0, ('not a valid polygon', pts)
        # The polygon should be valid.
        if not pdet.is_valid:
            assert 0, ('polygon has intersection sides', pts)
        pRing = LinearRing(pts)
        if pRing.is_ccw:
            assert 0, ("Points are not clockwise. The coordinates of bounding quadrilaterals have to be given in clockwise order. Regarding the correct interpretation of 'clockwise' remember that the image coordinate system used is the standard one, with the image origin at the upper left, the X axis extending to the right and Y axis extending downwards.")
    

    Why do we need to validate that the points that delimit a polygon are in clockwise order? Does the order have any effect on the final result?
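
    As a side note on what the two shapely checks actually catch, here is a tiny illustration (our example, not from the repo):

    from shapely.geometry import LinearRing, Polygon

    # a self-intersecting ("bow-tie") quadrilateral fails the validity check
    bowtie = [(0, 0), (10, 10), (10, 0), (0, 10)]
    print(Polygon(bowtie).is_valid)   # False -> "polygon has intersection sides"

    # orientation check: is_ccw is computed in the usual Cartesian sense
    ring = [(0, 0), (0, 10), (10, 10), (10, 0)]
    print(LinearRing(ring).is_ccw)    # False -> accepted as clockwise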

    opened by Eurus-Holmes 14
  • RuntimeError: Error compiling objects for extension

    I am trying to build AdelaiDet from source but I keep getting these errors. When I run python setup.py build develop, the (abridged) output is:

    running build
    running build_py
    running build_ext
    building 'adet._C' extension
    Emitting ninja build file /home/rubyyao/PycharmProjects/AdelaiDet/build/temp.linux-x86_64-3.6/build.ninja...
    Compiling objects...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    [1/4] /usr/local/cuda-8.0-cudnn6/bin/nvcc -DWITH_CUDA ... -c /home/rubyyao/PycharmProjects/AdelaiDet/adet/layers/csrc/DefROIAlign/DefROIAlign_cuda.cu ... -std=c++14
    FAILED: /home/rubyyao/PycharmProjects/AdelaiDet/build/temp.linux-x86_64-3.6/home/rubyyao/PycharmProjects/AdelaiDet/adet/layers/csrc/DefROIAlign/DefROIAlign_cuda.o
    nvcc fatal : Value 'c++14' is not defined for option 'std'
    (the same nvcc failure repeats for BezierAlign_cuda.cu, ml_nms.cu and cuda_version.cu)
    ninja: build stopped: subcommand failed.
    Traceback (most recent call last):
      File "/home/rubyyao/anaconda3/envs/det2_pytorch/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1515, in _run_ninja_build
        env=env)
      File "/home/rubyyao/anaconda3/envs/det2_pytorch/lib/python3.6/subprocess.py", line 418, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "setup.py", line 89, in <module>
        cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
      ...
      File "/home/rubyyao/anaconda3/envs/det2_pytorch/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1233, in _write_ninja_file_and_compile_objects
        error_prefix='Error compiling objects for extension')
      File "/home/rubyyao/anaconda3/envs/det2_pytorch/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1529, in _run_ninja_build
        raise RuntimeError(message)
    RuntimeError: Error compiling objects for extension

    My environment (abridged to the relevant packages):

    python 3.6.2
    pytorch 1.6.0 (py3.6_cuda10.1.243_cudnn7.6.3_0)
    torchvision 0.7.0 (py36_cu101)
    cudatoolkit 10.1.243
    detectron2 0.4+cu101
    ninja 1.7.2
    Cython 0.29.23
    numpy 1.19.2
    opencv-python 4.5.2.52

    opened by jieruyao49 13
  • _train_loader_from_config() takes 1 positional argument but 2 were given

    Traceback (most recent call last):
      File "tools/train_net.py", line 237, in <module>
        launch(
      File "/root/anaconda3/lib/python3.8/site-packages/detectron2/engine/launch.py", line 62, in launch
        main_func(*args)
      File "tools/train_net.py", line 225, in main
        trainer = Trainer(cfg)
      File "tools/train_net.py", line 62, in __init__
        data_loader = self.build_train_loader(cfg)
      File "tools/train_net.py", line 128, in build_train_loader
        return build_detection_train_loader(cfg, mapper)
      File "/root/anaconda3/lib/python3.8/site-packages/detectron2/config/config.py", line 201, in wrapped
        explicit_args = _get_args_from_config(from_config, *args, **kwargs)
      File "/root/anaconda3/lib/python3.8/site-packages/detectron2/config/config.py", line 236, in _get_args_from_config
        ret = from_config_func(*args, **kwargs)
    TypeError: _train_loader_from_config() takes 1 positional argument but 2 were given

    This comes up the first time I run the code. How can I fix it? Thanks.
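
    One plausible fix (our reading of the traceback, not a confirmed answer from the thread): recent Detectron2 versions wrap build_detection_train_loader in a from-config decorator that accepts only cfg positionally, so the mapper must be passed as a keyword argument in tools/train_net.py:

    # in Trainer.build_train_loader (sketch)
    return build_detection_train_loader(cfg, mapper=mapper)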

    opened by wincle 13
  • Problem of training ABCNet

    While training with custom datasets, the program pauses for more than 10 minutes with the following log and never prints a training message. I prepared the dataset via the example script and checked the output dataset carefully. I have tried to figure out what is wrong, but I cannot, as there is no error message. Does anyone have an idea about this problem?

    Part of training log:

    [06/15 14:38:35 adet.data.datasets.text]: Loaded 476 images in COCO format from datasets/hw/annotations/train.json
    [06/15 14:38:35 d2.data.build]: Removed 0 images with no usable annotations. 476 images left.
    [06/15 14:38:35 d2.data.build]: Distribution of instances among all 1 categories:
    |  category  | #instances |
    |:----------:|:-----------|
    |    text    | 6436       |
    [06/15 14:38:35 d2.data.common]: Serializing 476 elements to byte tensors and concatenating them all ...
    [06/15 14:38:35 d2.data.common]: Serialized dataset takes 2.66 MiB
    [06/15 14:38:35 d2.data.build]: Using training sampler TrainingSampler
    [06/15 14:38:35 fvcore.common.checkpoint]: Loading checkpoint from pretrained/ctw1500_attn_R_50.pth
    [06/15 14:38:35 adet.trainer]: Starting training from iteration 0

    good first issue 
    opened by chenyangMl 13
  • A question about ABCNet

    Hello, I have a question about ABCNet.

    Here are the sentences just before Section 3 (Experiments): "Note that during training, we directly use the generated Bezier curve GT to extract the RoI features. Therefore the detection branch does not affect the recognition branch. In the inference phase, the RoI region is replaced by the detecting Bezier curve described in Section 2.1."

    Do you mean that the detection and recognition branches are separated during training? Is ABCNet end-to-end? I don't know what the total loss of ABCNet is.

    I would really appreciate it if you could answer me. Thank you!

    opened by dy1998 13
  • ABCNet training has a data loss problem

    @Yuliang-Liu Hi, I am training ABCNet on custom datasets. After inference is done, the results are saved to output/batext/ctw1500/attn_R_50/inference/text_results.json. However, a data-loss problem occurred: I have 20000 images, but det.zip only contains 11951 images, the same number as in text_results.json. So it raised Exception("The sample %s not present in GT" % k); the number of those missing samples, 8049, is exactly the complement of det.zip. Why was some of the data removed?

    opened by Eurus-Holmes 12
  • ABCNet training ERROR with custom datasets

    Hi, I am training with custom datasets, following this issue.

    @shuangyichen @Yuliang-Liu I run train_net.py with the command "OMP_NUM_THREADS=1 python tools/train_net.py --config-file configs/BAText/TotalText/attn_R_50.yaml --num-gpus 1".

    Dataset layout under datasets:

    • mydataset
      • annotations
        • train.json
      • train_img
        • img_1.jpg
        • img_2.jpg

    The train images and annotations are specified in "builtin.py": "mydataset_train": ("mydataset/train_img", "mydataset/annotations/train.json")

    The train config is specified in "configs/BAText/TotalText/Base-TotalText.yaml": DATASETS: TRAIN: ("mydataset_train",) TEST: ("mydataset_train",)

    Originally posted by @chenyangMl in https://github.com/aim-uofa/AdelaiDet/issues/100#issuecomment-644056170

    But an error occurred:

    Traceback (most recent call last):
      File "tools/train_net.py", line 243, in <module>
        args=(args,),
      File "./AdelaiDet/env/lib/python3.7/site-packages/detectron2/engine/launch.py", line 57, in launch
        main_func(*args)
      File "tools/train_net.py", line 231, in main
        return trainer.train()
      File "tools/train_net.py", line 113, in train
        self.train_loop(self.start_iter, self.max_iter)
      File "tools/train_net.py", line 102, in train_loop
        self.run_step()
      File "./AdelaiDet/env/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 228, in run_step
        losses.backward()
      File "./AdelaiDet/env/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "./AdelaiDet/env/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED (_cudnn_rnn_backward_input at /pytorch/aten/src/ATen/native/cudnn/RNN.cpp:931)
    frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f0388a65536 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libc10.so)
    frame #1: <unknown function> + 0xf55aa7 (0x7f0389e16aa7 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
    frame #2: at::native::_cudnn_rnn_backward(at::Tensor const&, c10::ArrayRef<at::Tensor>, long, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long, long, long, bool, double, bool, bool, c10::ArrayRef<long>, at::Tensor const&, at::Tensor const&, std::array<bool, 4ul>) + 0x1a9 (0x7f0389e18db9 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
    frame #3: <unknown function> + 0xfdab4d (0x7f0389e9bb4d in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
    frame #4: <unknown function> + 0xfdc2e3 (0x7f0389e9d2e3 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
    frame #5: <unknown function> + 0x2b08450 (0x7f03c327b450 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #6: <unknown function> + 0x2b7b8a3 (0x7f03c32ee8a3 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #7: torch::autograd::generated::CudnnRnnBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x708 (0x7f03c302fd28 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #8: <unknown function> + 0x2d89c05 (0x7f03c34fcc05 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #9: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x16f3 (0x7f03c34f9f03 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #10: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&, bool) + 0x3d2 (0x7f03c34face2 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #11: torch::autograd::Engine::thread_init(int) + 0x39 (0x7f03c34f3359 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
    frame #12: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7f03cfc32828 in ./AdelaiDet/env/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
    frame #13: <unknown function> + 0xee0f (0x7f03d081ee0f in ./AdelaiDet/env/lib/python3.7/site-packages/torch/_C.cpython-37m-x86_64-linux-gnu.so)
    frame #14: <unknown function> + 0x76ba (0x7f03d2ec46ba in /lib/x86_64-linux-gnu/libpthread.so.0)
    frame #15: clone + 0x6d (0x7f03d2bfa41d in /lib/x86_64-linux-gnu/libc.so.6)
    

    However, I can run the ABCNet demo successfully (without changing anything). So what is happening here?

    opened by Eurus-Holmes 12
  • No predictions from the model! (CondInst)

    Hi,

    I am using CondInst for multi-class instance segmentation on a custom dataset.

    Here is what the training log looks like, from the bottom:

    .
    .
    .
    [01/04 08:30:47 fvcore.common.checkpoint]: Saving checkpoint to training_dir/CondInst_MS_R_50_1x/model_final.pth
    [01/04 08:30:48 d2.utils.events]:  eta: 0:00:00  iter: 8  total_loss: 3.187  loss_fcos_cls: 1.137  loss_fcos_loc: 0.3397  
    loss_fcos_ctr: 0.7088  loss_mask: 0.9942  time: 1.6907  data_time: 0.1392  lr: 8.992e-08  max_mem: 6693M
    [01/04 08:30:48 d2.engine.hooks]: Overall training speed: 6 iterations in 0:00:11 (1.9725 s / it)
    [01/04 08:30:48 d2.engine.hooks]: Total training time: 0:00:13 (0:00:01 on hooks)
    .
    .
    .
    [01/04 08:30:59 d2.evaluation.evaluator]: Start inference on 1024 images
    /content/AdelaiDet/adet/modeling/fcos/fcos_outputs.py:460: UserWarning: This overload of nonzero is deprecated:
    	nonzero()
    Consider using one of the following signatures instead:
    	nonzero(*, bool as_tuple) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.)
      per_candidate_nonzeros = per_candidate_inds.nonzero()
    [01/04 08:31:01 d2.evaluation.evaluator]: Inference done 11/1024. 0.0762 s / img. ETA=0:01:18
    [01/04 08:31:06 d2.evaluation.evaluator]: Inference done 75/1024. 0.0768 s / img. ETA=0:01:14
    [01/04 08:31:11 d2.evaluation.evaluator]: Inference done 139/1024. 0.0770 s / img. ETA=0:01:09
    [01/04 08:31:16 d2.evaluation.evaluator]: Inference done 203/1024. 0.0771 s / img. ETA=0:01:04
    [01/04 08:31:21 d2.evaluation.evaluator]: Inference done 267/1024. 0.0772 s / img. ETA=0:00:59
    [01/04 08:31:26 d2.evaluation.evaluator]: Inference done 331/1024. 0.0773 s / img. ETA=0:00:54
    [01/04 08:31:31 d2.evaluation.evaluator]: Inference done 394/1024. 0.0775 s / img. ETA=0:00:49
    [01/04 08:31:36 d2.evaluation.evaluator]: Inference done 457/1024. 0.0777 s / img. ETA=0:00:44
    [01/04 08:31:41 d2.evaluation.evaluator]: Inference done 520/1024. 0.0778 s / img. ETA=0:00:39
    [01/04 08:31:46 d2.evaluation.evaluator]: Inference done 583/1024. 0.0780 s / img. ETA=0:00:34
    [01/04 08:31:51 d2.evaluation.evaluator]: Inference done 646/1024. 0.0781 s / img. ETA=0:00:30
    [01/04 08:31:56 d2.evaluation.evaluator]: Inference done 709/1024. 0.0782 s / img. ETA=0:00:25
    [01/04 08:32:01 d2.evaluation.evaluator]: Inference done 771/1024. 0.0783 s / img. ETA=0:00:20
    [01/04 08:32:06 d2.evaluation.evaluator]: Inference done 833/1024. 0.0784 s / img. ETA=0:00:15
    [01/04 08:32:11 d2.evaluation.evaluator]: Inference done 895/1024. 0.0785 s / img. ETA=0:00:10
    [01/04 08:32:16 d2.evaluation.evaluator]: Inference done 957/1024. 0.0786 s / img. ETA=0:00:05
    [01/04 08:32:21 d2.evaluation.evaluator]: Inference done 1019/1024. 0.0787 s / img. ETA=0:00:00
    [01/04 08:32:22 d2.evaluation.evaluator]: Total inference time: 0:01:21.761356 (0.080237 s / img per device, on 1 devices)
    [01/04 08:32:22 d2.evaluation.evaluator]: Total inference pure compute time: 0:01:20 (0.078751 s / img per device, on 1 devices)
    [01/04 08:32:22 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
    [01/04 08:32:22 d2.evaluation.coco_evaluation]: Saving results to 
    training_dir/CondInst_MS_R_50_1x/inference/coco_instances_results.json
    [01/04 08:32:22 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
    WARNING [01/04 08:32:22 d2.evaluation.coco_evaluation]: No predictions from the model!
    WARNING [01/04 08:32:22 d2.evaluation.coco_evaluation]: No predictions from the model!
    [01/04 08:32:22 d2.engine.defaults]: Evaluation results for data_in_mscoco_format_test in csv format:
    [01/04 08:32:22 d2.evaluation.testing]: copypaste: Task: bbox
    [01/04 08:32:22 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
    [01/04 08:32:22 d2.evaluation.testing]: copypaste: nan,nan,nan,nan,nan,nan
    [01/04 08:32:22 d2.evaluation.testing]: copypaste: Task: segm
    [01/04 08:32:22 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
    [01/04 08:32:22 d2.evaluation.testing]: copypaste: nan,nan,nan,nan,nan,nan
    [01/04 08:32:22 d2.utils.events]:  eta: 0:00:00  iter: 8  total_loss: 3.187  loss_fcos_cls: 1.137  loss_fcos_loc: 0.3397  
    loss_fcos_ctr: 0.7088  loss_mask: 0.9942  time: 1.6907  data_time: 0.1392  lr: 8.992e-08  max_mem: 6693M
    
    opened by zeeshanalipanhwar 11
  • When using BoxInst on my own dataset, how should different instances of the same category be named in labelme annotations?

    I labeled the instances as cat1 and cat2. When registering my own dataset, should I use {"color": [220, 20, 60], "isthing": 1, "id": 1, "name": "cat1"}, {"color": [119, 11, 32], "isthing": 1, "id": 2, "name": "cat2"},

    or {"color": [119, 11, 32], "isthing": 1, "id": 1, "name": "cat"}?

    opened by Midlesun 0
  • I got the error "CUDA error: device-side assert triggered" when training my own dataset with BoxInst

    def mul(a : float, b : Tensor) -> Tensor:
        return b * a
               ~~~~~ <--- HERE
    def add(a : float, b : Tensor) -> Tensor:
        return b + a
    RuntimeError: CUDA error: device-side assert triggered
    C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [33,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
    (the same assertion repeats for threads [34,0,0] through [43,0,0])

    My dataset only has 2 classes. I modified _C.MODEL.FCOS.NUM_CLASSES = 2 in defaults.py and registered my own dataset in builtin and builtin_meta. During training I get RuntimeError: CUDA error: device-side assert triggered, with IndexKernel.cu:142 asserting index >= -sizes[i] && index < sizes[i] && "index out of bounds".
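
    A quick sanity check that is often relevant to this assert (our suggestion, not from the thread): an out-of-bounds index in IndexKernel.cu frequently means a ground-truth category id falls outside the NUM_CLASSES range. The ids present in the annotations can be inspected directly:

    import json

    with open("datasets/mydataset/annotations/train.json") as f:   # hypothetical path
        coco = json.load(f)
    print(sorted({a["category_id"] for a in coco["annotations"]}))
    # after Detectron2 maps ids to a contiguous range, NUM_CLASSES = 2 expects
    # exactly two categories; anything outside that range can trigger the assert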

    opened by Midlesun 1
  • fix recording bug in after_train function

    When I evaluate the model in the after_train function and record the results, they are not recorded in TensorBoard or metrics.json.

    This is because detectron2.utils.events.TensorboardXWriter and JSONWriter check trainer.iter, and each iteration is recorded only once. If trainer.iter is not advanced by one before after_train executes, the evaluation results are dropped, because the current trainer.iter was already consumed by the last step.

    So I fixed this issue in the train_loop function.
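
    A sketch of what that fix might look like (modeled on detectron2's TrainerBase.train loop; this is our reconstruction of the PR's idea, not its actual diff):

    from detectron2.engine.train_loop import TrainerBase
    from detectron2.utils.events import EventStorage

    class PatchedTrainer(TrainerBase):
        def train(self, start_iter: int, max_iter: int):
            self.iter = self.start_iter = start_iter
            self.max_iter = max_iter
            with EventStorage(start_iter) as self.storage:
                try:
                    self.before_train()
                    for self.iter in range(start_iter, max_iter):
                        self.before_step()
                        self.run_step()
                        self.after_step()
                    # advance the counter so anything recorded in after_train
                    # (e.g. a final evaluation) is not dropped by the writers,
                    # which record each trainer.iter only once
                    self.iter += 1
                finally:
                    self.after_train()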

    opened by Coolshanlan 0
  • Instance Segmentation BlendMask on Android

    Is it possible to optimize the BlendMask model for use on Android?

    I found a PyTorch example for semantic segmentation on Android (https://pytorch.org/tutorials/beginner/deeplabv3_on_android.html).

    What factors need to be taken into account to understand whether this can be implemented?

    opened by Maria-Volkova98 0
  • Bump pillow from 8.1.1 to 9.3.0 in /docs

    Bumps pillow from 8.1.1 to 9.3.0.

    Release notes

    Sourced from pillow's releases.

    9.3.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.3.0 (2022-10-29)

    • Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]

    • Initialize libtiff buffer when saving #6699 [radarhere]

    • Inline fname2char to fix memory leak #6329 [nulano]

    • Fix memory leaks related to text features #6330 [nulano]

    • Use double quotes for version check on old CPython on Windows #6695 [hugovk]

    • Remove backup implementation of Round for Windows platforms #6693 [cgohlke]

    • Fixed set_variation_by_name offset #6445 [radarhere]

    • Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]

    • Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]

    • Added ExifTags enums #6630 [radarhere]

    • Do not modify previous frame when calculating delta in PNG #6683 [radarhere]

    • Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]

    • Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]

    • Added GPS TIFF tag info #6661 [radarhere]

    • Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]

    • Do not attempt normalization if mode is already normal #6644 [radarhere]

    ... (truncated)


    dependencies 
    opened by dependabot[bot] 0
Owner: Adelaide Intelligent Machines (AIM) Group