Deep learning for image processing, including classification, object detection, and more.

Overview

A Tutorial on Applying Deep Learning to Image Processing

Preface

  • This tutorial organizes and summarizes the research I did during my graduate studies; in writing it up I also hope it helps others. As I learn new material, I will share it here as well.
  • The tutorial is shared as videos. Each topic follows this workflow:
    1) Introduce the network architecture and its key innovations
    2) Build and train the network with PyTorch
    3) Build and train the network with TensorFlow (its built-in Keras module)
  • All course slides are in the course_ppt folder; download them as needed.

Tutorial table of contents; click to jump to the corresponding video (more entries will be added as the material grows)

For more related videos, see my bilibili channel.


Required environment

  • Anaconda3 (recommended)
  • Python 3.6/3.7/3.8
  • PyCharm (IDE)
  • PyTorch 1.7.1 (pip package)
  • torchvision 0.8.1 (pip package)
  • TensorFlow 2.4.1 (pip package)
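
As a quick sanity check (a minimal sketch, not part of the repo), you can confirm the installed versions roughly match the list above:

```python
# Minimal environment check; the version numbers are the ones recommended above.
import torch
import torchvision
import tensorflow as tf

print("torch:", torch.__version__)              # expect ~1.7.1
print("torchvision:", torchvision.__version__)  # expect ~0.8.1
print("tensorflow:", tf.__version__)            # expect ~2.4.1
```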

Feel free to follow my WeChat official account (阿喆学习小记), where I regularly post related study notes.

If you have any questions, you can also discuss them on my CSDN blog: https://blog.csdn.net/qq_37541097/article/details/103482003

My bilibili channel: https://space.bilibili.com/18161609/channel/index

Comments
  • Asking for your permission

    I'm very sorry to bother you; since I don't know your contact information, this is the only way I can ask for your consent. I used your SSD and Faster R-CNN code for the experiments in my paper, and I will publish my code and experimental data, with a link back to your code. Thank you very much for your code and video explanations; they have helped me a lot. I hope you will agree. Thank you (I have also messaged you on bilibili). If you agree, please remember to reply. Thanks again.

    opened by Saya520r 14
  • FasterRCNN training error

    System information

    • Have I written custom code: no
    • OS Platform(e.g., window10 or Linux Ubuntu 16.04): linux
    • Python version: 3.8
    • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): torch1.6
    • Use GPU or not: yes
    • CUDA/cuDNN version(if you use GPU):
    • The network you trained(e.g., Resnet34 network): resnet50fpn

    Describe the current behavior: Hello, I'm training faster_rcnn on my own dataset with six object categories and set num_classes=7 in create_model, but I still get the error below. Nothing else was changed. How can I fix it? (A hedged label-range check is sketched after this issue.)

    Error info / logs

    Namespace(batch_size=8, data_path='/research/dept8/qdou/zwang/data/robo/final', device='cuda:0', epochs=50, output_dir='./save_weights', resume='', start_epoch=0)
    Using cuda device training.
    Using 8 dataloader workers
    /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
    /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
    /pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [84,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
    Traceback (most recent call last):
      File "train_res50_fpn.py", line 167, in <module>
        main(args)
      File "train_res50_fpn.py", line 99, in main
        utils.train_one_epoch(model, optimizer, train_data_loader,
      File "/research/dept8/qdou/zwang/faster_rcnn/train_utils/train_eval_utils.py", line 34, in train_one_epoch
        loss_dict = model(images, targets)
      File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/research/dept8/qdou/zwang/faster_rcnn/network_files/faster_rcnn_framework.py", line 93, in forward
        detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
      File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 367, in forward
        proposals, matched_idxs, labels, regression_targets = self.select_training_samples(proposals, targets)
      File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 222, in select_training_samples
        matched_idxs, labels = self.assign_targets_to_proposals(proposals, gt_boxes, gt_labels)
      File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 144, in assign_targets_to_proposals
        labels_in_image[bg_inds] = 0
    RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered
    
    opened by Kyfafyd 14
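
A hedged diagnostic for the issue above, not a confirmed fix: device-side asserts during assign_targets_to_proposals often mean some annotation label falls outside the expected range (in the torchvision-style convention this repo follows, 0 is background and object labels run from 1 to num_classes - 1). The check below is a sketch and assumes a dataset whose __getitem__ returns (image, target) with target["labels"] as an integer tensor.

```python
import torch

def check_label_range(dataset, num_classes):
    """Return indices (and labels) of samples whose label ids are out of range."""
    bad = []
    for idx in range(len(dataset)):
        _, target = dataset[idx]
        labels = target["labels"]
        if labels.numel() and (labels.min() < 1 or labels.max() >= num_classes):
            bad.append((idx, labels.tolist()))
    return bad  # an empty list means every label lies within [1, num_classes - 1]
```

Running the script with CUDA_LAUNCH_BLOCKING=1 (or briefly on CPU) also makes the exact failing index easier to locate.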
  • mIoU of 0 for the second class with the FCN network

    Hi, I'd like to ask you a question. When I use the FCN network for medical tumor segmentation, the IoU of the second class in the output log is always 0. It looks like this: [epoch: 7] train_loss: 0.00193 lr: 0.00780 global correct: 99.8 average row correct: ['100.0', '0.0'] IoU: ['99.8', '0.0'] mean IoU: 49.9

    I have already made the following changes: not loading the ResNet-50 pretrained weights, and setting the initial learning rate to 0.001 or 0.01.

    I also notice that train_loss keeps decreasing, though slowly. Could you explain why this happens and how it might be fixed? (One common mitigation is sketched after this issue.)

    opened by xurui-111 10
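
A hedged note on the issue above, not a confirmed diagnosis: a global accuracy of 99.8 together with zero IoU for the lesion class usually points to extreme class imbalance, so the network learns to predict background everywhere. One common mitigation is to weight the loss toward the rare class; the snippet below is a minimal sketch with hypothetical weights.

```python
import torch
import torch.nn as nn

# Hypothetical class weights (background, tumor); in practice derive them
# from the pixel frequency of each class in the training masks.
class_weights = torch.tensor([1.0, 10.0])
criterion = nn.CrossEntropyLoss(weight=class_weights, ignore_index=255)
```

Dice- or focal-style losses are other common choices for heavily imbalanced medical segmentation; whichever is used should be validated on a held-out set.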
  • MobileNetV2 training error

    System information

    • Have I written custom code: NO
    • OS Platform(e.g., window10 or Linux Ubuntu 16.04): MacOS Big Sur
    • Python version: 3.9.5
    • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): Pytorch 1.9
    • Use GPU or not: Not
    • CUDA/cuDNN version(if you use GPU):
    • The network you trained(e.g., Resnet34 network): MobileNetV2

    Describe the current behavior

    Error info / logs: (screenshot attached, taken 2021-07-04 11:37 PM)

    opened by weiqingtangx 10
  • Error when training retinanet on multiple GPUs

    Hello, teacher! (kidding) I modified the retinanet backbone by adding a CBAM module. Training on a single GPU works fine, but multi-GPU training fails. As far as I can tell the problem is that some parameters are not receiving gradients; I searched online but couldn't figure it out. Could you take a look? I actually hit the same error when running SSD and ignored it then, but now it's back. (The usual workaround is sketched after this issue.) The error output is as follows:
    Start training
    /home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811803361/work/aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    /home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811803361/work/aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

    Epoch: [0] [ 0/132] eta: 0:03:01.012762 lr: 0.000173 loss: 1.8379 (1.8379) bbox_regression: 0.6653 (0.6653) classification: 1.1726 (1.1726) time: 1.3713 data: 0.3985 max mem: 8638
    Traceback (most recent call last):
      File "/home/lhaozz/hand_retinanet/train_multi_GPU.py", line 260, in <module>: main(args)
      File "/home/lhaozz/hand_retinanet/train_multi_GPU.py", line 141, in main: mean_loss, lr = utils.train_one_epoch(model, optimizer, data_loader,
      File "/home/lhaozz/hand_retinanet/train_utils/train_eval_utils.py", line 33, in train_one_epoch: loss_dict = model(images, targets)
      File "/home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl: return forward_call(*input, **kwargs)
      File "/home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 873, in forward: if torch.is_grad_enabled() and self.reducer._rebuild_buckets():

    RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss. If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). Parameter indices which did not receive grad for rank 0: 12 13 14 15 25 26 27 28 38 39 40 41 51 52 53 54 67 68 69 70 80 81 82 83 93 94 95 96 106 107 108 109 119 120 121 122 132 133 134 135 148 149 150 151 161 162 163 164 174 175 176 177 In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error

    One more small question: for mixed-precision training I'm using two RTX 3080s and a Ryzen 9 5950X (16 cores); is it generally fine to just leave mixed precision turned on?

    opened by LKssssZz 9
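
A hedged sketch for the issue above, following the suggestion in the error message itself: when an added branch such as CBAM leaves some parameters without gradients on a given iteration, wrapping the model with find_unused_parameters=True lets DistributedDataParallel skip them during gradient reduction (at a small performance cost). The helper below is illustrative, not the repo's exact train_multi_GPU.py, and assumes the process group is already initialized.

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_model_for_ddp(model: torch.nn.Module, local_rank: int) -> DDP:
    # Assumes torch.distributed.init_process_group(...) has already been called.
    model = model.to(local_rank)
    return DDP(model, device_ids=[local_rank], find_unused_parameters=True)
```

As the message also notes, setting TORCH_DISTRIBUTED_DEBUG=INFO or DETAIL prints exactly which parameters received no gradient on each rank.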
  • CUDA version

    System information

    • Have I written custom code: No
    • OS Platform(e.g., window10 or Linux Ubuntu 16.04): Ubuntu 16.04.6 LTS
    • Python version: 3.7.10
    • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): Pytorch 1.6.0
    • Use GPU or not: No
    • CUDA/cuDNN version(if you use GPU): CUDA Version 10.1.243
    • The network you trained(e.g., Resnet34 network): pytorch_object_detection/faster_rcnn/train_res50_fpn.py

    Describe the current behavior: May I ask what version of CUDA this project needs? Will CUDA 10.1 not work? (A quick compatibility check is sketched after this issue.)

    Error info / logs AssertionError: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

    opened by zmz125 9
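
A hedged note on the issue above: the assertion means the installed PyTorch wheel was built against a CUDA release newer than what the local driver supports (the reported "version 10010" corresponds to CUDA 10.1). Either updating the GPU driver or installing a PyTorch 1.6.0 wheel built for CUDA 10.1 resolves it. The snippet below only inspects the current setup:

```python
import torch

print("torch:", torch.__version__)                    # e.g. 1.6.0
print("built for CUDA:", torch.version.cuda)          # CUDA toolkit the wheel targets
print("cuda available:", torch.cuda.is_available())   # False if the driver is too old
```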
  • mismatch for inception3a.branch3.1.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 16, 5, 5]).

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/blob/87a31d3db4baa5a693c59cab66cb232357c326c9/pytorch_classification/Test4_googlenet/train.py#L65 Hello, when I load the GoogLeNet pretrained weights I get the error in the title. How can I fix it? If I instead change the kernel size of branch3 to 3, I get: RuntimeError: Sizes of tensors must match except in dimension 2. Got 28 and 30 (The offending index is 2). (A hedged loading workaround is sketched after this issue.)

    opened by wang66624 8
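
A hedged workaround for the issue above, not the author's fix: the checkpoint stores 3x3 kernels for branch3 (torchvision's GoogLeNet is known to use 3x3 there, deviating from the paper's 5x5), so those entries cannot be copied into a model that defines branch3 with 5x5 kernels. One option is to load only the parameters whose shapes match and let the mismatched layers keep their initialization:

```python
import torch

def load_matching_weights(model, ckpt_path):
    state = torch.load(ckpt_path, map_location="cpu")
    model_state = model.state_dict()
    # Keep only checkpoint entries whose name and shape match the current model.
    filtered = {k: v for k, v in state.items()
                if k in model_state and v.shape == model_state[k].shape}
    model.load_state_dict(filtered, strict=False)
    return sorted(set(model_state) - set(filtered))  # layers left at their random init
```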
  • predict.py runtime error

    System information

    • Have I written custom code:
    • OS Platform(e.g., window10 or Linux Ubuntu 16.04):
    • Python version:
    • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3):
    • Use GPU or not:
    • CUDA/cuDNN version(if you use GPU):
    • The network you trained(e.g., Resnet34 network):

    Describe the current behavior

    Error info / logs

    opened by punk1 7
  • RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

    Thanks for sharing your code. When I run 'python train_mobilenet.py', I hit the problem below. How can I solve this error? (The usual fix is sketched after this issue.)

    Traceback (most recent call last):
      File "train_mobilenet.py", line 157, in <module>: main()
      File "train_mobilenet.py", line 91, in main: train_loss=train_loss, train_lr=learning_rate)
      File "/home/dl/桌面/faster_rcnn/train_utils/train_eval_utils.py", line 33, in train_one_epoch: loss_dict = model(images, targets)
      File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl: result = self.forward(*input, **kwargs)
      File "/home/dl/桌面/faster_rcnn/network_files/faster_rcnn_framework.py", line 87, in forward: proposals, proposal_losses = self.rpn(images, features, targets)
      File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl: result = self.forward(*input, **kwargs)
      File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 615, in forward: labels, matched_gt_boxes = self.assign_targets_to_anchors(anchors, targets)
      File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 410, in assign_targets_to_anchors: matched_idxs = self.proposal_matcher(match_quality_matrix)
      File "/home/dl/桌面/faster_rcnn/network_files/det_utils.py", line 347, in __call__: matches[below_low_threshold] = torch.tensor(self.BELOW_LOW_THRESHOLD)  # -1
    RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

    opened by why228430 7
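
A hedged note on the issue above: on some PyTorch versions, assigning a CPU scalar tensor into a CUDA tensor through a boolean mask raises exactly this error. The usual fix in Matcher-style code is to assign the plain Python constant instead of wrapping it in torch.tensor(...). The snippet below is a self-contained illustration (it needs a CUDA device to reproduce the behaviour):

```python
import torch

matches = torch.arange(6, device="cuda")   # stand-in for the matcher output
below_low_threshold = matches < 3          # illustrative boolean mask
BELOW_LOW_THRESHOLD = -1

# Fails on some versions:
#   matches[below_low_threshold] = torch.tensor(BELOW_LOW_THRESHOLD)
# Works everywhere:
matches[below_low_threshold] = BELOW_LOW_THRESHOLD
```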
  • Running YOLOv3-SPP on PyTorch 1.4: out of memory after removing the mixed-precision (amp) code

    Because PyTorch 1.4 does not support automatic mixed precision (there is no from torch.cuda import amp), I modified the amp-related code as follows:
    1. In train_eval_utils.py, comment out lines 29 and 30: enable_amp = True if "cuda" in device.type else False and scaler = amp.GradScaler(enabled=enable_amp)
    2. Comment out line 61: with amp.autocast(enabled=enable_amp):
    3. Rewrite the backward/optimize block around line 85 without the scaler: keep losses.backward(), and when ni % accumulate == 0 call optimizer.step() followed by optimizer.zero_grad() (dropping scaler.scale(losses).backward(), scaler.step(optimizer) and scaler.update())
    Error: RuntimeError: CUDA error: out of memory
    Workaround: set pin_memory in torch.utils.data.DataLoader to False

    Question: why does pin_memory=True cause this error when training in full (single) precision? (A sketch of the modified training step is shown after this issue.)

    opened by Taylor-X76 6
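
For reference, a minimal sketch of the single-precision training step described above (for PyTorch versions without torch.cuda.amp), with the gradient-accumulation logic kept; the function and argument names are illustrative, not the repo's exact code:

```python
def train_step(model, optimizer, images, targets, ni, accumulate):
    loss_dict = model(images, targets)   # forward pass in full precision
    losses = sum(loss_dict.values())
    losses.backward()                    # no GradScaler involved
    if ni % accumulate == 0:             # step only every `accumulate` iterations
        optimizer.step()
        optimizer.zero_grad()
    return losses.detach()
```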
  • Multi-GPU training error: subprocess.CalledProcessError: Command '['/opt/anaconda3/envs/py37/bin/python', '-u', 'train_multi_GPU.py']' returned non-zero exit status 1.

    (py37) xiamingyang@AI-02:~/PyTorch/PyTorch_Object_detection/faster_rcnn$ CUDA_VISIBLE_DEVICES=4,6 python -m torch.distributed.launch --nproc_per_node=2 --use_env train_multi_GPU.py


    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


    | distributed init (rank 1): env://
    | distributed init (rank 0): env://
    Traceback (most recent call last):
      File "train_multi_GPU.py", line 249, in <module>: main(args)
      File "train_multi_GPU.py", line 40, in main: init_distributed_mode(args)
      File "/Ai-Data/home/users/xiamingyang/PyTorch/PyTorch_Object_detection/faster_rcnn/train_utils/distributed_utils.py", line 320, in init_distributed_mode: world_size=args.world_size, rank=args.rank)
      File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 397, in init_process_group: store, rank, world_size = next(rendezvous_iterator)
      File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/rendezvous.py", line 168, in _env_rendezvous_handler: store = TCPStore(master_addr, master_port, world_size, start_daemon)
    RuntimeError: Address already in use
    Traceback (most recent call last):
      File "/opt/anaconda3/envs/py37/lib/python3.7/runpy.py", line 193, in _run_module_as_main: "__main__", mod_spec)
      File "/opt/anaconda3/envs/py37/lib/python3.7/runpy.py", line 85, in _run_code: exec(code, run_globals)
      File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>: main()
      File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main: cmd=cmd)
    subprocess.CalledProcessError: Command '['/opt/anaconda3/envs/py37/bin/python', '-u', 'train_multi_GPU.py']' returned non-zero exit status 1.

    opened by Taylor-X76 6
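
A hedged note on the issue above: "Address already in use" from TCPStore means the default rendezvous port (29500) is already taken, usually by another (possibly stale) training job on the same machine. Passing a free port to the launcher, e.g. adding --master_port=29501 to the python -m torch.distributed.launch command, is the usual fix; overriding MASTER_PORT before the process group is created achieves the same thing, as sketched below.

```python
import os

# Pick an unused port before the process group is created; every process in
# the job must agree on the same value.
os.environ["MASTER_PORT"] = "29501"
# torch.distributed.init_process_group(backend="nccl", init_method="env://") follows as usual.
```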
  • EfficientNet (PyTorch) fails to run

    The error is the following (a hedged sketch of the likely missing argument follows this issue):
    Traceback (most recent call last):
      File "C:\Users\dell\Desktop\deep-learning-for-image-processing-master\pytorch_classification\Test9_efficientNet\train.py", line 145, in <module>: main(opt)
      File "C:\Users\dell\Desktop\deep-learning-for-image-processing-master\pytorch_classification\Test9_efficientNet\train.py", line 76, in main: if args.weights != "":
    AttributeError: 'Namespace' object has no attribute 'weights'

    opened by liushuohxgjy 0
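
A hedged guess at the cause of the issue above: main() reads args.weights, so the argument parser that builds opt needs a --weights option. A minimal sketch of the missing definition:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--weights", type=str, default="",
                    help="path to pretrained weights; an empty string trains from scratch")
opt = parser.parse_args()
```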
  • FileNotFoundError even though the files exist

    System information

    • Have I written custom code: No
    • OS Platform: window10
    • Python version: 3.8
    • Deep learning framework and version: PyTorch 1.7.1
    • Use GPU or not: use GPU
    • The network you trained: Faster R-CNN

    Describe the current behavior

    I am using a custom Pascal VOC dataset, but my files are named with strings rather than integers. When I keep the string filenames I get FileNotFoundError, but if I rename the files in JPEGImages (and the 'filename' field in the annotation files) to integers, the code runs fine. What should I change in my program? (A hedged path-building sketch follows this issue.)

    opened by Shuvo001 0
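
A hedged sketch related to the issue above (not the repo's exact my_dataset.py): the symptom suggests the image path is being derived in a way that assumes numeric stems. Building the path directly from the annotation's <filename> text, without converting it to an integer, handles string names as well:

```python
import os
import xml.etree.ElementTree as ET

def image_path_from_annotation(xml_path: str, img_root: str) -> str:
    """Resolve the JPEG path from a VOC XML annotation, keeping the name as text."""
    filename = ET.parse(xml_path).getroot().findtext("filename")
    if not filename.lower().endswith(".jpg"):
        filename += ".jpg"   # VOC-style JPEGImages naming
    return os.path.join(img_root, filename)
```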
  • HRNet error at the end of training

    I trained HRNet from scratch; after 209 epochs it suddenly reported the following:
    Epoch: [209] Total time: 1:06:31 (0.8526 s / it)
    Test: [ 0/199] eta: 0:18:38 model_time: 0.5503 (0.5503) time: 5.6187 data: 3.6315 max mem: 5210
    Test: [100/199] eta: 0:00:53 model_time: 0.2205 (0.2248) time: 0.3994 data: 0.0001 max mem: 5210
    Test: [198/199] eta: 0:00:00 model_time: 0.1523 (0.2229) time: 0.3832 data: 0.0001 max mem: 5210
    Test: Total time: 0:01:33 (0.4706 s / it)
    Averaged stats: model_time: 0.1523 (0.2229)
    Loading and preparing results... DONE (t=0.28s)
    creating index... index created!
    Running per image evaluation... Evaluate annotation type keypoints DONE (t=2.34s).
    Accumulating evaluation results... DONE (t=0.07s).
    IoU metric: keypoints
     Average Precision (AP) @[ IoU=0.50:0.95 | area= all    | maxDets= 20 ] = 0.758
     Average Precision (AP) @[ IoU=0.50      | area= all    | maxDets= 20 ] = 0.935
     Average Precision (AP) @[ IoU=0.75      | area= all    | maxDets= 20 ] = 0.835
     Average Precision (AP) @[ IoU=0.50:0.95 | area=medium  | maxDets= 20 ] = 0.729
     Average Precision (AP) @[ IoU=0.50:0.95 | area= large  | maxDets= 20 ] = 0.804
     Average Recall    (AR) @[ IoU=0.50:0.95 | area= all    | maxDets= 20 ] = 0.786
     Average Recall    (AR) @[ IoU=0.50      | area= all    | maxDets= 20 ] = 0.942
     Average Recall    (AR) @[ IoU=0.75      | area= all    | maxDets= 20 ] = 0.851
     Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium  | maxDets= 20 ] = 0.753
     Average Recall    (AR) @[ IoU=0.50:0.95 | area= large  | maxDets= 20 ] = 0.836
    QObject::moveToThread: Current thread (0x561c025907d0) is not the object's thread (0x561c144fefa0). Cannot move to target thread (0x561c025907d0)

    qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/ycj/.local/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found. This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

    Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.

    Aborted (core dumped)

    The screenshot shows the same output. Is this normal? Has training actually finished, or is this a bug? (A hedged headless-plotting workaround is sketched after this issue.)

    opened by ycj1124 0
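
A hedged note on the issue above: the training and evaluation themselves appear to have completed (the COCO keypoint metrics were printed normally); the crash happens afterwards, when OpenCV's bundled Qt tries to open a GUI window on a machine without a usable display (the "xcb" plugin failure). If the script only needs to save plots or images, forcing a non-interactive matplotlib backend and writing to files instead of showing windows is a common workaround; the snippet below is a sketch.

```python
import matplotlib
matplotlib.use("Agg")        # non-GUI backend; set before importing pyplot
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
fig.savefig("curve.png")     # write to disk instead of plt.show() / cv2.imshow()
```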
  • Strange results when combining the provided Faster R-CNN code with CAM visualization, please help

    I only added some code to predict.py and did not change any of the author's other code, following the tutorial linked below. I loaded the author's VOC2012 weights into predict.py together with parts of the tutorial's code. The target bounding boxes are drawn correctly, but the CAM output is strange: the target regions are not highlighted at all, while the background is. I don't know the cause yet; any help would be appreciated! https://github.com/jacobgil/pytorch-grad-cam/blob/master/tutorials/Class%20Activation%20Maps%20for%20Object%20Detection%20With%20Faster%20RCNN.ipynb

    The experimental results look very strange (see the attached images).

    opened by LITturtlee 0
  • Error when running on multiple GPUs

    System information

    • Have I written custom code: Yes
    • OS Platform(e.g., window10 or Linux Ubuntu 16.04): Linux
    • Python version: 3.8
    • Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): pytorch1.7.1
    • Use GPU or not: Use
    • CUDA/cuDNN version(if you use GPU): CUDA11.7
    • The network you trained(e.g., Resnet34 network): faster_res50_rpn

    Describe the current behavior

    Hello, I'm using train_multi_GPU.py to train on the VG dataset. The dataset is set up to match the outputs in my_dataset.py and everything is converted to tensors, but at the line global_features, loss_dict = model(images, targets) I keep getting "RuntimeError: chunk expects at least a 1-dimensional tensor". I'm not sure which input fails the requirement; is there a way to fix this? (One likely cause is sketched after this issue.)

    My custom changes: I switched the training dataset to VG, commented out the COCO-related code, removed the validation set, and added some code after roi_head that does not affect training of the base model; nothing else was changed.

    Error info / logs
    Traceback (most recent call last):
      File "train_multi_GPU.py", line 273, in <module>: main(args)
      File "train_multi_GPU.py", line 151, in main: mean_loss, lr = utils.train_one_epoch(model, optimizer, data_loader,
      File "/home/zzyyxx/Image_Catpion/faster_rcnn/train_utils/train_eval_utils.py", line 46, in train_one_epoch: global_features,loss_dict = model(images, targets)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl: result = self.forward(*input, **kwargs)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 617, in forward: inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 643, in scatter: return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 36, in scatter_kwargs: inputs = scatter(inputs, target_gpus, dim) if inputs else []
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 28, in scatter: res = scatter_map(inputs)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map: return list(zip(*map(scatter_map, obj)))
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 17, in scatter_map: return list(map(list, zip(*map(scatter_map, obj))))
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 19, in scatter_map: return list(map(type(obj), zip(*map(scatter_map, obj.items()))))
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map: return list(zip(*map(scatter_map, obj)))
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 13, in scatter_map: return Scatter.apply(target_gpus, None, dim, obj)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/_functions.py", line 92, in forward: outputs = comm.scatter(input, target_gpus, chunk_sizes, ctx.dim, streams)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/comm.py", line 186, in scatter: return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
    RuntimeError: chunk expects at least a 1-dimensional tensor
    (an identical traceback is printed by the second worker process)
    Traceback (most recent call last):
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/runpy.py", line 194, in _run_module_as_main: return _run_code(code, main_globals, None,
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/runpy.py", line 87, in _run_code: exec(code, run_globals)
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/distributed/launch.py", line 260, in <module>: main()
      File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/distributed/launch.py", line 255, in main: raise subprocess.CalledProcessError(returncode=process.returncode,
    subprocess.CalledProcessError: Command '['/home/zzyyxx/enter/envs/ZTorch/bin/python', '-u', 'train_multi_GPU.py']' returned non-zero exit status 1.

    opened by BarryYHX 2
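
A hedged note on the issue above, an educated guess rather than a confirmed diagnosis: the failure happens while the inputs are being scattered across GPUs, and scatter chunks every tensor along dimension 0, which requires at least one dimension. A 0-dimensional (scalar) tensor somewhere in the custom targets, for example an image id stored as torch.tensor(5) instead of torch.tensor([5]), would trigger exactly this error. A small helper to normalize such entries before they reach the model:

```python
import torch

def ensure_at_least_1d(target: dict) -> dict:
    """Reshape any 0-dim tensors in a target dict to shape (1,)."""
    return {k: (v.reshape(1) if torch.is_tensor(v) and v.dim() == 0 else v)
            for k, v in target.items()}
```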
Owner: WuZhe