Deep learning for image processing, including classification, object detection, and more.

WuZhe

Last update: Jan 4, 2023

Related tags

Deep Learning deep-learning pytorch classification segmentation bilibili object-detection tensorflow2

Overview

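Tutorial: applying deep learning to image processing

Preface

This tutorial organizes and summarizes what I worked on during my graduate studies, and I hope it also helps other learners. When I learn something new later on, I will share it here as well.
The tutorial is delivered as videos. Each topic follows the same flow:
1) Introduce the network architecture and what is new about it
2) Build and train the network with PyTorch
3) Build and train the network with TensorFlow (using its built-in keras module)
All slides used in the course are in the course_ppt folder; download them as needed.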

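Tutorial contents; click an entry to jump to its video (more will be added as I keep learning)

Image classification
- LeNet (done)
  - Official PyTorch demo (LeNet)
  - Official TensorFlow 2 demo
- AlexNet (done)
- VggNet (done)
- GoogLeNet (done)
- ResNet (done)
- ResNeXt (done)
  - ResNeXt network explained
  - Building ResNeXt with PyTorch
- MobileNet_v1_v2 (done)
- MobileNet_v3 (done)
- ShuffleNet_v1_v2 (done)
- EfficientNet_v1 (done)
- EfficientNet_v2 (done)
- Vision Transformer (done)
- Swin Transformer (done)

Object detection
- Faster-RCNN/FPN (done)
- SSD/RetinaNet (done)
- YOLOv3 SPP (done)
  - YOLO series networks explained
  - YOLOv3 SPP source code walkthrough (PyTorch version)

Semantic segmentation
- FCN (done)
  - FCN network explained
  - FCN source code walkthrough (PyTorch version)
- DeepLabV3 (done)
- LR-ASPP (done)
  - LR-ASPP network explained
  - LR-ASPP source code walkthrough (PyTorch version)
- UNet (in preparation)
  - UNet network explained

For more related videos, see my bilibili channel.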

Requirements

Anaconda3 (recommended)
python3.6/3.7/3.8
pycharm (IDE)
pytorch 1.7.1 (pip package)
torchvision 0.8.1 (pip package)
tensorflow 2.4.1 (pip package)

You are welcome to follow my WeChat official account (阿喆学习小记), where I post write-ups of what I learn.

If you have questions, you can also discuss them on my CSDN blog: https://blog.csdn.net/qq_37541097/article/details/103482003

My bilibili channel: https://space.bilibili.com/18161609/channel/index

Comments

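Asking for your permission

I am very sorry to bother you. I do not have your contact details, so this is the only way I can ask for your consent. My paper uses your SSD and Faster R-CNN code for its experiments, and I plan to release my code and experimental data publicly, with a link to your code. Thank you very much for your code and video explanations; they helped me a lot. I hope you will agree (I also sent you a private message on bilibili). If you agree, please reply to let me know. Thanks again.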

opened by Saya520r 14

Faster R-CNN training error

System information

Have I written custom code: no
OS Platform(e.g., window10 or Linux Ubuntu 16.04): linux
Python version: 3.8
Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): torch1.6
Use GPU or not: yes
CUDA/cuDNN version(if you use GPU):
The network you trained(e.g., Resnet34 network): resnet50fpn

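Describe the current behavior Hello, I am training faster_rcnn on my own dataset, which has six object classes. I set num_classes=7 in create model, but this error still occurs. I have not changed anything else. How can I fix it?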

Error info / logs

Namespace(batch_size=8, data_path='/research/dept8/qdou/zwang/data/robo/final', device='cuda:0', epochs=50, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using 8 dataloader workers
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [3,0,0], thread: [84,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
  File "train_res50_fpn.py", line 167, in <module>
    main(args)
  File "train_res50_fpn.py", line 99, in main
    utils.train_one_epoch(model, optimizer, train_data_loader,
  File "/research/dept8/qdou/zwang/faster_rcnn/train_utils/train_eval_utils.py", line 34, in train_one_epoch
    loss_dict = model(images, targets)
  File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/faster_rcnn_framework.py", line 93, in forward
    detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
  File "/research/dept8/qdou/zwang/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 367, in forward
    proposals, matched_idxs, labels, regression_targets = self.select_training_samples(proposals, targets)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 222, in select_training_samples
    matched_idxs, labels = self.assign_targets_to_proposals(proposals, gt_boxes, gt_labels)
  File "/research/dept8/qdou/zwang/faster_rcnn/network_files/roi_head.py", line 144, in assign_targets_to_proposals
    labels_in_image[bg_inds] = 0
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered
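
A device-side "index out of bounds" assertion during target assignment is often worth checking against the label values in the dataset. CUDA assertions surface asynchronously, so the line shown in the traceback is not necessarily the one at fault. The sketch below is one way to rule out bad labels. It assumes label 0 is reserved for background, so object labels should run from 1 to num_classes - 1, and it assumes labels are available as plain Python integers (tensors would need `.tolist()` first). The helper name `find_bad_labels` is hypothetical and not part of the repository.

```python
from typing import Dict, Iterable, List, Tuple


def find_bad_labels(
    targets: Iterable[Dict[str, List[int]]], num_classes: int
) -> List[Tuple[int, int]]:
    """Return (target_index, label) pairs whose label is outside 1..num_classes-1.

    Assumption: label 0 is reserved for background, so annotated objects
    must use labels from 1 to num_classes - 1.
    """
    bad = []
    for i, target in enumerate(targets):
        for label in target["labels"]:
            if not 1 <= label < num_classes:
                bad.append((i, label))
    return bad


# Six object classes plus background -> num_classes = 7, valid labels are 1..6.
example_targets = [{"labels": [1, 6]}, {"labels": [0, 7]}]
print(find_bad_labels(example_targets, num_classes=7))  # [(1, 0), (1, 7)]
```

If the check reports anything, the class-index mapping used when building the dataset is the first place to look.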

opened by Kyfafyd 14

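mIoU is 0 in the FCN network

Hello, I would like to ask a question. I am using the FCN network for medical tumor segmentation, and in the results file the IoU of the second class is always 0. It looks like this:
[epoch: 7] train_loss: 0.00193 lr: 0.00780 global correct: 99.8 average row correct: ['100.0', '0.0'] IoU: ['99.8', '0.0'] mean IoU: 49.9

I have already made these changes: * did not load the pretrained resnet50 weights * changed the initial learning rate to 0.001 or 0.01

I also noticed that train_loss keeps decreasing, though slowly. Why does this happen, and what could be done about it?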
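
The reported numbers are what a model that predicts background for every pixel would produce when tumor pixels are very rare. The plain-Python sketch below makes this concrete. The pixel counts are hypothetical, chosen only so that the output matches the log above, and `segmentation_metrics` is an illustrative helper rather than the repository's code.

```python
from typing import List, Tuple


def segmentation_metrics(conf: List[List[int]]) -> Tuple[float, List[float], List[float]]:
    """Compute global accuracy, per-class accuracy and per-class IoU.

    conf[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = [conf[i][i] for i in range(n)]
    row_sum = [sum(conf[i]) for i in range(n)]                        # true pixels per class
    col_sum = [sum(conf[i][j] for i in range(n)) for j in range(n)]   # predicted pixels per class

    global_acc = sum(diag) / total
    class_acc = [diag[i] / row_sum[i] if row_sum[i] else 0.0 for i in range(n)]
    union = [row_sum[i] + col_sum[i] - diag[i] for i in range(n)]
    iou = [diag[i] / union[i] if union[i] else 0.0 for i in range(n)]
    return global_acc, class_acc, iou


# Hypothetical counts: 998 background pixels, 2 tumor pixels,
# and every pixel predicted as background.
conf = [[998, 0],
        [2, 0]]
global_acc, class_acc, iou = segmentation_metrics(conf)
mean_iou = sum(iou) / len(iou)
print(f"global correct: {global_acc * 100:.1f}")
print(f"average row correct: {[f'{a * 100:.1f}' for a in class_acc]}")
print(f"IoU: {[f'{v * 100:.1f}' for v in iou]}")
print(f"mean IoU: {mean_iou * 100:.1f}")
```

If this is what is happening, two things worth checking are class imbalance (a class-weighted loss may help) and whether the tumor pixels in the masks really carry class index 1. Both are guesses that need to be verified against the data.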

opened by xurui-111 10
MobileNetV2 training error
System information

Have I written custom code: NO

OS Platform(e.g., window10 or Linux Ubuntu 16.04): MacOS Big Sur

Python version: 3.9.5

Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): Pytorch 1.9

Use GPU or not: Not

CUDA/cuDNN version(if you use GPU):

The network you trained(e.g., Resnet34 network): MobileNetV2

Describe the current behavior

Error info / logs
opened by weiqingtangx 10
Error when training retinanet on multiple GPUs

Hello, teacher! (doge) I modified the retinanet backbone by adding a CBAM module. Training on a single GPU works without errors, but multi-GPU training fails. From my translation of the message it seems to be a problem with gradients not flowing back to some parameters, and searching online did not clear it up. Could you take a look? I actually hit the same error when running ssd and ignored it; I did not expect it to come up again here. The error output is as follows:
Start training
/home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811803361/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811803361/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

Epoch: [0] [ 0/132] eta: 0:03:01.012762 lr: 0.000173 loss: 1.8379 (1.8379) bbox_regression: 0.6653 (0.6653) classification: 1.1726 (1.1726) time: 1.3713 data: 0.3985 max mem: 8638
Traceback (most recent call last):
  File "/home/lhaozz/hand_retinanet/train_multi_GPU.py", line 260, in <module>
    main(args)
  File "/home/lhaozz/hand_retinanet/train_multi_GPU.py", line 141, in main
    mean_loss, lr = utils.train_one_epoch(model, optimizer, data_loader,
  File "/home/lhaozz/hand_retinanet/train_utils/train_eval_utils.py", line 33, in train_one_epoch
    loss_dict = model(images, targets)
  File "/home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lb511/anaconda3/envs/lhaozz/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 873, in forward
    if torch.is_grad_enabled() and self.reducer._rebuild_buckets():

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss. If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable).
Parameter indices which did not receive grad for rank 0: 12 13 14 15 25 26 27 28 38 39 40 41 51 52 53 54 67 68 69 70 80 81 82 83 93 94 95 96 106 107 108 109 119 120 121 122 132 133 134 135 148 149 150 151 161 162 163 164 174 175 176 177
In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error

A side question about mixed-precision training: I use two 3080 cards and an R9 5950X 16-core CPU. Is it normally fine to just turn it on?
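
The error message lists the positions of the parameters that received no gradient. One way to see which layers they belong to is to parse those positions and look them up by name. The sketch below does this in plain Python. The helper names are hypothetical, and it assumes the indices follow the order of `model.named_parameters()` restricted to parameters with `requires_grad=True`, which should be verified for the PyTorch version in use.

```python
import re
from typing import Iterable, List, Tuple


def unused_param_indices(message: str) -> List[int]:
    """Extract the integers that follow 'did not receive grad for rank N:'."""
    match = re.search(r"did not receive grad for rank \d+:((?:\s+\d+)+)", message)
    return [int(tok) for tok in match.group(1).split()] if match else []


def names_for_indices(
    named_params: Iterable[Tuple[str, object]], indices: List[int]
) -> List[str]:
    """Look up parameter names by position.

    Assumption: positions follow the iteration order of the trainable
    parameters that were handed to DistributedDataParallel.
    """
    names = [name for name, _ in named_params]
    return [names[i] for i in indices if i < len(names)]


# Shortened copy of the message above, for illustration only.
message = (
    "Parameter indices which did not receive grad for rank 0: "
    "12 13 14 15 25 In addition, you can set the environment variable"
)
print(unused_param_indices(message))  # [12, 13, 14, 15, 25]
```

With a real model, `names_for_indices` would be called with `[(n, p) for n, p in model.named_parameters() if p.requires_grad]`. The regular spacing of the indices in the log (four consecutive parameters per block) would be consistent with a submodule that is constructed in every block but never called in forward; that is only a guess. If the unused parameters are intentional, the message itself names `find_unused_parameters=True` as the switch to pass to DistributedDataParallel.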

opened by LKssssZz 9
CUDA version
System information

Have I written custom code: No

OS Platform(e.g., window10 or Linux Ubuntu 16.04): Ubuntu 16.04.6 LTS

Python version: 3.7.10

Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): Pytorch 1.6.0

Use GPU or not: No

CUDA/cuDNN version(if you use GPU): CUDA Version 10.1.243

The network you trained(e.g., Resnet34 network): pytorch_object_detection/faster_rcnn/train_res50_fpn.py

Describe the current behavior May I ask what version of CUDA is needed for this project? Will CUDA 10.1 not work?

Error info / logs AssertionError: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
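
The number in the assertion is the CUDA version supported by the installed driver, encoded as 1000 × major + 10 × minor. The error therefore means the installed PyTorch binary was built against a newer CUDA release than the driver supports. A minimal sketch of the decoding (the function name is illustrative):

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


print(decode_cuda_version(10010))  # (10, 1)
```

So the driver here supports CUDA 10.1. Assuming the default PyTorch 1.6.0 pip wheel is what is installed (to my knowledge it is built against CUDA 10.2), either updating the GPU driver or installing a PyTorch 1.6.0 build compiled for CUDA 10.1 should satisfy the check.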
opened by zmz125 9
mismatch for inception3a.branch3.1.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 16, 5, 5]).

https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/blob/87a31d3db4baa5a693c59cab66cb232357c326c9/pytorch_classification/Test4_googlenet/train.py#L65 Hello, loading the pretrained googlenet weights produces the problem in the title. How can I fix it? If I change the kernel size of branch3 to 3, I instead get RuntimeError: Sizes of tensors must match except in dimension 2. Got 28 and 30 (The offending index is 2)
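
The second error is consistent with changing only the kernel size while leaving the padding at the value chosen for the 5×5 kernel. The sketch below applies the standard convolution output-size formula. The 28×28 input comes from the error message; stride 1 and the original padding of 2 are assumptions about the branch3 definition.

```python
def conv2d_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed original branch: 5x5 kernel with padding 2 keeps the 28x28 size.
print(conv2d_output_size(28, kernel=5, padding=2))  # 28
# Kernel changed to 3 but padding left at 2: the map grows to 30x30,
# which cannot be concatenated with the other 28x28 branches.
print(conv2d_output_size(28, kernel=3, padding=2))  # 30
# Kernel 3 with padding 1 restores 28x28.
print(conv2d_output_size(28, kernel=3, padding=1))  # 28
```

Under these assumptions, setting the padding to 1 together with the 3×3 kernel would keep the branch outputs aligned and match the checkpoint's weight shape.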

opened by wang66624 8
predict.py runtime error
System information

Have I written custom code:

OS Platform(e.g., window10 or Linux Ubuntu 16.04):

Python version:

Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3):

Use GPU or not:

CUDA/cuDNN version(if you use GPU):

The network you trained(e.g., Resnet34 network):

Describe the current behavior

Error info / logs
opened by punk1 7
RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

Thanks for sharing your code. When I run 'python train_mobilenet.py', I hit the following problem. How can I solve this error?

Traceback (most recent call last):
  File "train_mobilenet.py", line 157, in <module>
    main()
  File "train_mobilenet.py", line 91, in main
    train_loss=train_loss, train_lr=learning_rate)
  File "/home/dl/桌面/faster_rcnn/train_utils/train_eval_utils.py", line 33, in train_one_epoch
    loss_dict = model(images, targets)
  File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/dl/桌面/faster_rcnn/network_files/faster_rcnn_framework.py", line 87, in forward
    proposals, proposal_losses = self.rpn(images, features, targets)
  File "/home/dl/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 615, in forward
    labels, matched_gt_boxes = self.assign_targets_to_anchors(anchors, targets)
  File "/home/dl/桌面/faster_rcnn/network_files/rpn_function.py", line 410, in assign_targets_to_anchors
    matched_idxs = self.proposal_matcher(match_quality_matrix)
  File "/home/dl/桌面/faster_rcnn/network_files/det_utils.py", line 347, in __call__
    matches[below_low_threshold] = torch.tensor(self.BELOW_LOW_THRESHOLD) # -1
RuntimeError: Trying to pass too many CPU scalars to CUDA kernel!

opened by why228430 7
Running the YOLOv3-spp version on pytorch1.4 with the mixed-precision code removed: the program runs out of memory

pytorch1.4 does not support mixed precision (there is no from torch.cuda import amp), so I modified the related code:
1. In train_eval_utils.py, comment out lines 29 and 30:
# enable_amp = True if "cuda" in device.type else False
# scaler = amp.GradScaler(enabled=enable_amp)
2. Comment out line 61:
# with amp.autocast(enabled=enable_amp):
3. Change the code at line 85 as follows (removing the scaler part):
# backward
# scaler.scale(losses).backward()
losses.backward()
# optimize
if ni % accumulate == 0:
    # scaler.step(optimizer)
    # scaler.update()
    # optimizer.zero_grad()
    optimizer.step()
    optimizer.zero_grad()
Error: RuntimeError: CUDA error: out of memory
Workaround: change pin_memory to False in torch.utils.data.DataLoader

Question: why does pin_memory=True cause the error when training in plain single precision?

opened by Taylor-X76 6
Multi-GPU training error: subprocess.CalledProcessError: Command '['/opt/anaconda3/envs/py37/bin/python', '-u', 'train_multi_GPU.py']' returned non-zero exit status 1.

(py37) xiamingyang@AI-02:~/PyTorch/PyTorch_Object_detection/faster_rcnn$ CUDA_VISIBLE_DEVICES=4,6 python -m torch.distributed.launch --nproc_per_node=2 --use_env train_multi_GPU.py

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

| distributed init (rank 1): env://
| distributed init (rank 0): env://
Traceback (most recent call last):
  File "train_multi_GPU.py", line 249, in <module>
    main(args)
  File "train_multi_GPU.py", line 40, in main
    init_distributed_mode(args)
  File "/Ai-Data/home/users/xiamingyang/PyTorch/PyTorch_Object_detection/faster_rcnn/train_utils/distributed_utils.py", line 320, in init_distributed_mode
    world_size=args.world_size, rank=args.rank)
  File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 397, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/rendezvous.py", line 168, in _env_rendezvous_handler
    store = TCPStore(master_addr, master_port, world_size, start_daemon)
RuntimeError: Address already in use
Traceback (most recent call last):
  File "/opt/anaconda3/envs/py37/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/anaconda3/envs/py37/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/opt/anaconda3/envs/py37/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/anaconda3/envs/py37/bin/python', '-u', 'train_multi_GPU.py']' returned non-zero exit status 1.
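
"Address already in use" is raised when the TCP port used for the rendezvous is already taken, for example by another training job or a leftover process on the same machine. A minimal sketch, using only the standard library, that asks the operating system for a free port and prints a launch command using it. The helper name `find_free_port` is hypothetical; `--master_port` is the launcher option that sets the rendezvous port.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0 on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


port = find_free_port()
print(
    "python -m torch.distributed.launch --nproc_per_node=2 "
    f"--master_port={port} --use_env train_multi_GPU.py"
)
```

The port is released before the launcher starts, so another process could in principle grab it in between; on a shared machine, picking a fixed port that is known to be unused works just as well.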

opened by Taylor-X76 6
efficientnet pytorch does not run

The error is:
Traceback (most recent call last):
  File "C:\Users\dell\Desktop\deep-learning-for-image-processing-master\pytorch_classification\Test9_efficientNet\train.py", line 145, in <module>
    main(opt)
  File "C:\Users\dell\Desktop\deep-learning-for-image-processing-master\pytorch_classification\Test9_efficientNet\train.py", line 76, in main
    if args.weights != "":
AttributeError: 'Namespace' object has no attribute 'weights'
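
This AttributeError means the argument parser that produced `opt` never defined a `--weights` option, for instance because that line was removed or renamed. A minimal argparse sketch of what the script expects; the default value, help text, and example path are assumptions, not the repository's exact code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumed convention: an empty string means "do not load any weights",
    # which is what the check `args.weights != ""` in the traceback implies.
    parser.add_argument("--weights", type=str, default="",
                        help="path to initial weights; leave empty to skip loading")
    return parser


opt = build_parser().parse_args([])
print(repr(opt.weights))  # ''

# Hypothetical path, for illustration only.
opt = build_parser().parse_args(["--weights", "./efficientnetb0.pth"])
print(opt.weights)  # ./efficientnetb0.pth
```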

opened by liushuohxgjy 0
FileNotFoundError even though the files exist
System information

Have I written custom code: No

OS Platform: window10

Python version: 3.8

Deep learning framework and version: PyTorch 1.7.1

Use GPU or not: use GPU

The network you trained: Faster R-CNN

Describe the current behavior

I am using a custom Pascal VOC dataset, but my files have string names rather than integer names. With string file names I get FileNotFoundError, but when I rename the files in JPEGImages to integers and change 'filename' in the annotation files to match, the code runs smoothly. What should I change in my program?
opened by Shuvo001 0
HRNet error at the end of training

HRNet was trained from scratch. After running 209 epochs it suddenly reported this error:
Epoch: [209] Total time: 1:06:31 (0.8526 s / it)
Test: [ 0/199] eta: 0:18:38 model_time: 0.5503 (0.5503) time: 5.6187 data: 3.6315 max mem: 5210
Test: [100/199] eta: 0:00:53 model_time: 0.2205 (0.2248) time: 0.3994 data: 0.0001 max mem: 5210
Test: [198/199] eta: 0:00:00 model_time: 0.1523 (0.2229) time: 0.3832 data: 0.0001 max mem: 5210
Test: Total time: 0:01:33 (0.4706 s / it)
Averaged stats: model_time: 0.1523 (0.2229)
Loading and preparing results...
DONE (t=0.28s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=2.34s).
Accumulating evaluation results...
DONE (t=0.07s).
IoU metric: keypoints
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.758
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.935
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.835
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.729
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.804
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.786
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.942
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.851
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.753
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.836
QObject::moveToThread: Current thread (0x561c025907d0) is not the object's thread (0x561c144fefa0). Cannot move to target thread (0x561c025907d0)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/ycj/.local/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found. This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.

Aborted (core dumped)

The screenshot shows the same output. Is this normal? Has training finished, or is this a bug?
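
The evaluation metrics are printed before the crash, so the abort appears to come from something that initializes a Qt GUI afterwards (plotting curves at the end of training would be one candidate) on a machine where the xcb plugin bundled with cv2 cannot start. One hedged workaround is to select the offscreen platform plugin, which the message lists as available, before any GUI-capable library is imported. The helper name below is hypothetical.

```python
import os
from typing import MutableMapping


def select_offscreen_qt(env: MutableMapping[str, str] = os.environ) -> None:
    """Ask Qt to use its offscreen platform plugin instead of xcb."""
    env["QT_QPA_PLATFORM"] = "offscreen"


# Call this at the very top of the training script,
# before importing cv2 or matplotlib.
select_offscreen_qt()
```

Switching matplotlib to a non-interactive backend with `matplotlib.use("Agg")` is another common option when the figures only need to be saved to disk. Whether either change is sufficient here depends on what actually opens the window.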

opened by ycj1124 0
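Visualizing with CAM on top of the uploader's Faster R-CNN code gives abnormal results; help needed

I only added some code to predict.py and did not change any of the author's other code, following the tutorial linked below. I loaded the author's voc2012 weights in predict and added part of the tutorial's code. The target bboxes are drawn correctly, but the CAM result is strange: the target regions are not highlighted, while the non-target background is highlighted instead. I do not know the cause yet. Please help! https://github.com/jacobgil/pytorch-grad-cam/blob/master/tutorials/Class%20Activation%20Maps%20for%20Object%20Detection%20With%20Faster%20RCNN.ipynb

The experimental result looks like this, which is very strange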

opened by LITturtlee 0
Error when running on multiple GPUs
System information

Have I written custom code: Yes

OS Platform(e.g., window10 or Linux Ubuntu 16.04): Linux

Python version: 3.8

Deep learning framework and version(e.g., Tensorflow2.1 or Pytorch1.3): pytorch1.7.1

Use GPU or not: Use

CUDA/cuDNN version(if you use GPU): CUDA11.7

The network you trained(e.g., Resnet34 network): faster_res50_rpn

Describe the current behavior

Hello, I am running train_multi_GPU.py on the VG dataset. The dataset is set up to match the output of my_dataset.py and is converted to tensors, but the step "global_features,loss_dict = model(images, targets)" always raises "RuntimeError: chunk expects at least a 1-dimensional tensor". I cannot tell which input fails the requirement. Is there a way to solve this?

My custom parts: I changed the training dataset to VG, commented out the coco-related code, and removed the validation set. Some additional code comes after roi_head and does not affect training of the base model before it. Nothing else was changed.

Error info / logs
Traceback (most recent call last):
  File "train_multi_GPU.py", line 273, in <module>
    main(args)
  File "train_multi_GPU.py", line 151, in main
    mean_loss, lr = utils.train_one_epoch(model, optimizer, data_loader,
  File "/home/zzyyxx/Image_Catpion/faster_rcnn/train_utils/train_eval_utils.py", line 46, in train_one_epoch
    global_features,loss_dict = model(images, targets)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 617, in forward
    inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 643, in scatter
    return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 36, in scatter_kwargs
    inputs = scatter(inputs, target_gpus, dim) if inputs else []
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 28, in scatter
    res = scatter_map(inputs)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 17, in scatter_map
    return list(map(list, zip(*map(scatter_map, obj))))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 19, in scatter_map
    return list(map(type(obj), zip(*map(scatter_map, obj.items()))))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 13, in scatter_map
    return Scatter.apply(target_gpus, None, dim, obj)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/_functions.py", line 92, in forward
    outputs = comm.scatter(input, target_gpus, chunk_sizes, ctx.dim, streams)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/comm.py", line 186, in scatter
    return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
RuntimeError: chunk expects at least a 1-dimensional tensor
Traceback (most recent call last):
  File "train_multi_GPU.py", line 273, in <module>
    main(args)
  File "train_multi_GPU.py", line 151, in main
    mean_loss, lr = utils.train_one_epoch(model, optimizer, data_loader,
  File "/home/zzyyxx/Image_Catpion/faster_rcnn/train_utils/train_eval_utils.py", line 46, in train_one_epoch
    global_features,loss_dict = model(images, targets)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 617, in forward
    inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 643, in scatter
    return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 36, in scatter_kwargs
    inputs = scatter(inputs, target_gpus, dim) if inputs else []
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 28, in scatter
    res = scatter_map(inputs)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 17, in scatter_map
    return list(map(list, zip(*map(scatter_map, obj))))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 19, in scatter_map
    return list(map(type(obj), zip(*map(scatter_map, obj.items()))))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 13, in scatter_map
    return Scatter.apply(target_gpus, None, dim, obj)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/_functions.py", line 92, in forward
    outputs = comm.scatter(input, target_gpus, chunk_sizes, ctx.dim, streams)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/nn/parallel/comm.py", line 186, in scatter
    return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
RuntimeError: chunk expects at least a 1-dimensional tensor
Traceback (most recent call last):
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/distributed/launch.py", line 260, in <module>
    main()
  File "/home/zzyyxx/enter/envs/ZTorch/lib/python3.8/site-packages/torch/distributed/launch.py", line 255, in main
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/home/zzyyxx/enter/envs/ZTorch/bin/python', '-u', 'train_multi_GPU.py']' returned non-zero exit status 1.
opened by BarryYHX 2


Third-party PyTorch implementation of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

Image Classification - Research on image classification and auto insurance claim prediction, with systematic experiments on modeling techniques and approaches

Simple-Image-Classification - Simple Image Classification Code (PyTorch)

Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Yolo object detection - Yolo object detection with python

Implement face detection, and age and gender classification, and emotion classification.

Auto-Lama combines object detection and image inpainting to automate object removals

MOT-Tracking-by-Detection-Pipeline - A framework for Tracking-by-Detection format MOT (Multi Object Tracking) that separates the detection and tracking processes

Image Processing, Image Smoothing, Edge Detection and Transforms

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

For holding anime-related object classification and detection models

Real Time Object Detection and Classification using Yolo Algorithm.

A system for detecting bone fractures using deep learning and image processing techniques.

Deep Image Search is an AI-based image search engine that includes deep transfer learning feature extraction and tree-based vectorized search.

This project uses the template matching technique for object detection by locating a template image within a base image.