Self-Supervised Monocular Depth Estimation with Internal Feature Fusion (arXiv), BMVC 2021

Overview

DIFFNet

This repo is for Self-Supervised Monocular Depth Estimation with Internal Feature Fusion (arXiv), BMVC 2021.

A new backbone for self-supervised depth estimation.

If you find this work useful, please consider citing it.

@inproceedings{diffnet_bmvc,
    title     = {Self-Supervised Monocular Depth Estimation with Internal Feature Fusion},
    author    = {Hang Zhou and David Greenwood and Sarah Taylor},
    booktitle = {The British Machine Vision Conference (BMVC)},
    month     = {November},
    year      = {2021}}

** Paper, implementation details and trained models are coming soon **

Comparison with other methods

Evaluation on selected hard cases:

Trained weights

Acknowledgement

Thanks to the authors for their works:

Comments
  • Loss

    Hello, when I run the code I wonder whether you used the uncertain_mask and flipping_loss options, because I can't reproduce the accuracy from your paper at the 1024x320 resolution. Thanks for your reply.
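    For reference, one generic reading of a flip-based consistency term is sketched below: the disparity predicted on a horizontally flipped input, flipped back, is encouraged to agree with the prediction on the original image. This is only an illustration of the idea, not necessarily how flipping_loss is implemented in this repo, and disp_net is a placeholder for any network that returns a single disparity map.

    import torch
    import torch.nn.functional as F

    def flip_consistency_loss(disp_net, image):
        # Predictions on the original and on a horizontally flipped copy,
        # mapped back to the original orientation, should agree.
        disp = disp_net(image)                                # (B, 1, H, W)
        disp_flipped = disp_net(torch.flip(image, dims=[3]))  # flip along width
        disp_flipped_back = torch.flip(disp_flipped, dims=[3])
        return F.l1_loss(disp, disp_flipped_back)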

    opened by wangcong607 7
  • Missing Keys in Pretrained Weights

    Hi @brandleyzhou, thank you for your great work!

    I met the following problem when testing your pretrained models:

    Exception has occurred: RuntimeError
    Error(s) in loading state_dict for HRDepthDecoder:
    	Missing key(s) in state_dict: "convs.up_x9_0.conv.conv.weight", "convs.up_x9_0.conv.conv.bias", "convs.up_x9_1.conv.conv.weight", "convs.up_x9_1.conv.conv.bias", "convs.72.ca.fc.0.weight", "convs.72.ca.fc.2.weight", "convs.72.conv_se.weight", "convs.72.conv_se.bias", "convs.36.ca.fc.0.weight", "convs.36.ca.fc.2.weight", "convs.36.conv_se.weight", "convs.36.conv_se.bias", "convs.18.ca.fc.0.weight", "convs.18.ca.fc.2.weight", "convs.18.conv_se.weight", "convs.18.conv_se.bias", "convs.9.ca.fc.0.weight", "convs.9.ca.fc.2.weight", "convs.9.conv_se.weight", "convs.9.conv_se.bias", "convs.dispConvScale0.conv.weight", "convs.dispConvScale0.conv.bias", "convs.dispConvScale1.conv.weight", "convs.dispConvScale1.conv.bias", "convs.dispConvScale2.conv.weight", "convs.dispConvScale2.conv.bias", "convs.dispConvScale3.conv.weight", "convs.dispConvScale3.conv.bias", "decoder.0.conv.conv.weight", "decoder.0.conv.conv.bias", "decoder.1.conv.conv.weight", "decoder.1.conv.conv.bias", "decoder.2.ca.fc.0.weight", "decoder.2.ca.fc.2.weight", "decoder.2.conv_se.weight", "decoder.2.conv_se.bias", "decoder.3.ca.fc.0.weight", "decoder.3.ca.fc.2.weight", "decoder.3.conv_se.weight", "decoder.3.conv_se.bias", "decoder.4.ca.fc.0.weight", "decoder.4.ca.fc.2.weight", "decoder.4.conv_se.weight", "decoder.4.conv_se.bias", "decoder.5.ca.fc.0.weight", "decoder.5.ca.fc.2.weight", "decoder.5.conv_se.weight", "decoder.5.conv_se.bias", "decoder.6.conv.weight", "decoder.6.conv.bias", "decoder.7.conv.weight", "decoder.7.conv.bias", "decoder.8.conv.weight", "decoder.8.conv.bias", "decoder.9.conv.weight", "decoder.9.conv.bias". 
    

    The pretrained weights are downloaded from this repository page. Specifically, I was testing two pretrained models:

    Could you please have a look at this and upload the complete models? Thanks in advance!
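    Until complete checkpoints are available, a quick way to see exactly which parameter names disagree is to diff the checkpoint keys against a freshly built decoder; the checkpoint path below is hypothetical, and the constructor calls mirror the ones used later in this thread.

    import torch
    import networks  # repo module, as used elsewhere in this thread

    encoder = networks.test_hr_encoder.hrnet18(False)
    encoder.num_ch_enc = [64, 18, 36, 72, 144]
    decoder = networks.HRDepthDecoder(encoder.num_ch_enc, [0])

    # Hypothetical path: point this at the downloaded decoder checkpoint
    state_dict = torch.load("diffnet_640x192/depth.pth", map_location="cpu")

    model_keys = set(decoder.state_dict().keys())
    ckpt_keys = set(state_dict.keys())
    print("in model but not in checkpoint:", sorted(model_keys - ckpt_keys)[:10])
    print("in checkpoint but not in model:", sorted(ckpt_keys - model_keys)[:10])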

    opened by ldkong1205 5
  • Saved trained models

    Hi, thank you for sharing your amazing code. I ran the training code and it trained for 20 epochs, but I don't know where the models are saved. Also, does your code save each epoch's results separately, or only the last epoch? And one last question: where can I change the number of epochs for training?
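    For anyone with the same question: assuming the trainer follows the monodepth2-style layout this codebase builds on, checkpoints are typically written to <log_dir>/<model_name>/models/weights_<epoch>/, one folder per saved epoch, and the epoch count is a command-line option (a --num_epochs flag appears in the multi-GPU issue further down). A small sketch for locating them, with both names treated as placeholders:

    import glob
    import os

    log_dir = "weights_logs"    # whatever was passed as --log_dir
    model_name = "mono_model"   # whatever was passed as --model_name

    # Assumed monodepth2-style layout: one weights_<epoch> folder per saved epoch
    for weights_dir in sorted(glob.glob(os.path.join(log_dir, model_name, "models", "weights_*"))):
        print(weights_dir, os.listdir(weights_dir))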

    opened by MohsenMoradiArt 5
  • Cannot reproduce results mentioned in the paper

    Hi, I trained your model with 640x192 and 1024x320 input sizes, but the results are different from what you reported in the paper.

    Here are the results I got:

    | Resolution | abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
    |:----------:|:-------:|:------:|:----:|:--------:|:---:|:---:|:---:|
    | 640x192 | 0.108 | 0.792 | 4.589 | 0.186 | 0.889 | 0.963 | 0.982 |
    | 1024x320 | 0.103 | 0.909 | 4.642 | 0.183 | 0.899 | 0.965 | 0.982 |

    And here are the results mentioned in the paper:

    (Screenshots of the paper's result tables omitted.)

    I don't know what causes this difference, because when I used your pre-trained weights for evaluation I got the same results as yours. Do you have any idea why? Maybe the code has changed slightly, or could a different version of PyTorch cause this?
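    Self-supervised training from scratch also has some run-to-run variance, so it is worth pinning the obvious sources of randomness before comparing against the paper. The snippet below is a generic reproducibility sketch, not something the authors are known to have used, and identical seeds still do not guarantee identical numbers across PyTorch/CUDA versions.

    import random
    import numpy as np
    import torch

    def seed_everything(seed=0):
        # Fix the common sources of nondeterminism in a PyTorch training run
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False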

    opened by ArminMasoumian 4
  • Changing Input Size

    Hi, the provided code gives results for the 640x192 image size. Where can I change it to the original input size (1024x320) and train with that? Also, it seems that you add internal feature fusion to the original HRNet; I would like to remove that and test with the original HRNet. In "test_hr_encoder.py" I tried to remove "mixed_features" and only return "features", but then I get an error in the decoder. Is there any way to train your model with the original HRNet?

    opened by ArminMasoumian 4
  • Environment

    Hi, thank you for sharing your nice work.

    Could you share the environment setting such as versions of packages for this work?

    I cannot reproduce the results of this paper even when using the pretrained model provided in this repo.

    | | abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
    |:--|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
    | My evaluation | 0.1024 | 0.7632 | 4.482 | 1.799 | 0.8954 | 0.9645 | 0.9831 |
    | Paper | 0.102 | 0.764 | 4.483 | 0.180 | 0.896 | 0.965 | 0.983 |

    opened by seb-le 4
  • Issue downloading the HRNet weights pretrained on ImageNet

    First of all, thank you for sharing this cool work.

    I ran into an error when running start2train.sh.

    The error (screenshot omitted) occurs while downloading the HRNet weights pretrained on ImageNet.

    I ran start2train.sh on another computer with a different IP address, but the same error occurred.

    Thank you.
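    If the automatic download keeps failing, a common workaround is to fetch the ImageNet checkpoint by hand (for example from the official HRNet-Image-Classification release) and load it from a local path. In the sketch below, the file name, the meaning of the False flag, and the use of strict=False are assumptions, not the repo's documented procedure.

    import torch
    import networks  # repo module, as used elsewhere in this thread

    # Assumed local copy of the HRNet-W18 ImageNet checkpoint, downloaded manually
    ckpt_path = "hrnetv2_w18_imagenet_pretrained.pth"

    encoder = networks.test_hr_encoder.hrnet18(False)  # assuming False skips the auto-download
    state_dict = torch.load(ckpt_path, map_location="cpu")
    missing, unexpected = encoder.load_state_dict(state_dict, strict=False)
    print("missing keys:", len(missing), "unexpected keys:", len(unexpected))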

    opened by seb-le 4
  • Training and testing issue

    Hi,

    When I try to test a single image by running "sh test_sample.sh", I get this error: "ModuleNotFoundError: No module named 'hr_networks'".

    Would you please let me know how I can get "hr_networks"?

    Also, when I tried to train the model, this error popped up:

    from .hrnet_config import MODEL_CONFIGS
      File "/media/armin/DATA/DIFFNet/networks/hrnet_config.py", line 5, in <module>
        from yacs.config import CfgNode as CN
    ModuleNotFoundError: No module named 'yacs'

    Am I missing something here?

    opened by ArminMasoumian 2
  • Where is the supplementary material mentioned in the paper?

    At the bottom of page 9 it says "The corresponding images are shown in the supplementary material.", but I can't find a supplementary material section in this paper. Is there a misunderstanding on my part? Thanks for your time.

    opened by Shiwen615 2
  • About model's FPS

    Hello, thank you for your good work!

    I'm measuring DIFFNet's FPS on an RTX 2080 Ti to compare our works fairly, but the FPS I get for DIFFNet and Monodepth2 are very different from those reported in your paper. Could I get the code you used to calculate FPS, please?

    I measured the fps with the following code.

    import torch
    import networks

    # Build the DIFFNet encoder/decoder and move them to the GPU in eval mode
    en = networks.test_hr_encoder.hrnet18(False)
    en.num_ch_enc = [64, 18, 36, 72, 144]
    de = networks.HRDepthDecoder(en.num_ch_enc, [0])
    device = torch.device('cuda')
    en.to(device)
    en.eval()
    de.to(device)
    de.eval()

    optimal_batch_size = 1
    dummy_input = torch.randn(optimal_batch_size, 3, 192, 640, dtype=torch.float).to(device)
    repetitions = 10000
    total_time = 0
    print("start calculate")
    with torch.no_grad():
        for rep in range(repetitions):
            starter = torch.cuda.Event(enable_timing=True)
            ender = torch.cuda.Event(enable_timing=True)
            starter.record()
            _ = de(en(dummy_input))
            ender.record()
            torch.cuda.synchronize()
            curr_time = starter.elapsed_time(ender) / 1000  # milliseconds -> seconds
            if rep != 0:  # discard the first iteration as warm-up
                total_time += curr_time
    repetitions = repetitions - 1  # one iteration was discarded
    print(total_time)
    Throughput = (repetitions * optimal_batch_size) / total_time
    print('Final FPS:', Throughput, ' total_time:', total_time)
    print("weight num: ", sum(p.numel() for p in en.parameters()) + sum(p.numel() for p in de.parameters()))
    

    And the following results were obtained for each model.

    | Model | FPS |
    |:------:|:------:|
    | DIFFNet | 34.92 |
    | Monodepth2 | 282.25 |
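    For comparison, a simpler wall-clock measurement with an explicit warm-up phase tends to give steadier numbers than per-iteration CUDA events; the sketch below reuses the same encoder/decoder construction as above, and whether it matches the paper's timing protocol is an open question.

    import time
    import torch
    import networks

    device = torch.device("cuda")
    en = networks.test_hr_encoder.hrnet18(False)
    en.num_ch_enc = [64, 18, 36, 72, 144]
    de = networks.HRDepthDecoder(en.num_ch_enc, [0])
    en.to(device).eval()
    de.to(device).eval()

    dummy = torch.randn(1, 3, 192, 640, device=device)
    with torch.no_grad():
        for _ in range(50):          # warm-up: let cuDNN pick kernels and clocks settle
            _ = de(en(dummy))
        torch.cuda.synchronize()
        start = time.time()
        n = 1000
        for _ in range(n):
            _ = de(en(dummy))
        torch.cuda.synchronize()
    print("FPS:", n / (time.time() - start))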

    opened by big-chan 2
  • The layer names in the decoder model and in depth.pth from diffnet_1024x320_ttr do not match

    Hello, it seems that the layer names in the decoder model and those in depth.pth from diffnet_1024x320_ttr do not match, which causes an error when running evaluate_depth.py.

    opened by czh0001 2
  • Training

    Thanks for your work. Here are some details I want to ask you about. My environment: torch 1.7.1+cu110, torchaudio 0.7.2, torchsummary 1.5.1, torchvision 0.8.2+cu110. I found that when I set the initial learning rate to 1e-4 for the first 14 epochs and then 1e-5 for the last 5 epochs, my experimental results are very different from yours. Is this due to a different PyTorch version, or is something wrong in my training process?
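    For reference, the schedule described (1e-4 for the first 14 epochs, then 1e-5) corresponds to a standard Adam + StepLR setup; the sketch below only illustrates that schedule with a dummy module standing in for the encoder and decoder, and is not claimed to match the repo's trainer.

    import torch
    import torch.nn as nn

    model = nn.Linear(8, 8)  # placeholder for the encoder + decoder parameters
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    # 1e-4 for epochs 0-13, then dropped by 10x to 1e-5 for the remaining epochs
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=14, gamma=0.1)

    for epoch in range(19):
        # ... one training epoch would go here ...
        scheduler.step()
        print(epoch, scheduler.get_last_lr())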

    opened by ljy199712 3
  • Test file missing

    Thanks for your work on DIFFNet! I want to evaluate my training results on my PC, but the file "splits/eigen/gt_depths.npz" is required and I can't find it in the repository. Could you please provide this file? Thanks!
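    In monodepth2-style codebases this file is usually exported from the raw KITTI velodyne scans rather than shipped with the repo; the sketch below shows that export, assuming a kitti_utils.generate_depth_map helper like monodepth2's is available, with all paths as placeholders.

    import os
    import numpy as np
    from kitti_utils import generate_depth_map  # assumed helper, as in monodepth2

    data_path = "kitti_data"                     # root of the raw KITTI recordings (placeholder)
    split_dir = os.path.join("splits", "eigen")

    with open(os.path.join(split_dir, "test_files.txt")) as f:
        lines = f.read().splitlines()

    gt_depths = []
    for line in lines:
        folder, frame_id, _ = line.split()
        calib_dir = os.path.join(data_path, folder.split("/")[0])
        velo_path = os.path.join(data_path, folder, "velodyne_points", "data",
                                 "{:010d}.bin".format(int(frame_id)))
        # Project the LiDAR scan into camera 2 to get a sparse ground-truth depth map
        gt_depths.append(generate_depth_map(calib_dir, velo_path, 2, True).astype(np.float32))

    np.savez_compressed(os.path.join(split_dir, "gt_depths.npz"),
                        data=np.array(gt_depths, dtype=object))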

    opened by Renatusphere 1
  • Cityscapes model

    Hi. First, thank you for opening your nice paper and source code.

    Could you share checkpoints that were pretrained on Cityscapes and fine-tuned on KITTI (i.e., CS → K)?

    I would like to check whether the DIFFNet model I pretrained on Cityscapes is correct.

    Thanks!

    opened by seb-le 1
  • About torch::jit::trace

    Hello, thank you for sharing your work. I want to use LibTorch to deploy this network in C++, but when using torch::jit::trace() I get an error (running test_sample.py on its own works fine; error screenshots omitted). Because torch::jit::trace() cannot handle a dictionary output, I changed the output of depth_decoder to a list. There is also a line "import hr_networks" in test_sample.py, but I could not find hr_networks, and I don't know if this affects torch::jit::trace().

    Thank you very much!
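    A common workaround when torch::jit::trace cannot handle a dict output is to wrap the encoder and decoder in a small module whose forward returns a plain tensor; the sketch below does that on the Python side before exporting, and the output key ("disp", 0) is an assumption based on monodepth2-style decoders, not a documented interface of this repo.

    import torch
    import torch.nn as nn
    import networks

    class TraceableDIFFNet(nn.Module):
        """Wraps encoder + decoder so the traced graph returns a tensor, not a dict."""
        def __init__(self, encoder, decoder):
            super().__init__()
            self.encoder = encoder
            self.decoder = decoder

        def forward(self, x):
            outputs = self.decoder(self.encoder(x))
            # Key name is an assumption; use whatever key test_sample.py reads
            # for the full-resolution disparity.
            return outputs[("disp", 0)]

    encoder = networks.test_hr_encoder.hrnet18(False)
    encoder.num_ch_enc = [64, 18, 36, 72, 144]
    decoder = networks.HRDepthDecoder(encoder.num_ch_enc, [0])
    model = TraceableDIFFNet(encoder, decoder).eval()

    traced = torch.jit.trace(model, torch.randn(1, 3, 192, 640))
    traced.save("diffnet_traced.pt")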

    opened by Hugo699 3
  • Multi-GPU training hangs

    Hello, when I start multi-GPU training I run the following command: python -m torch.distributed.launch --nproc_per_node=2 train.py --split eigen_zhou --learning_rate 1e-4 --height 320 --width 1024 --scheduler_step_size 14 --batch_size 2 --model_name mono_model --png --data_path ../4_monodepth2/data/KITTI/ --num_epochs 40 --log_dir weights_logs

    If I set --nproc_per_node=1 it runs fine on a single GPU, but with --nproc_per_node=2 it only prints the messages up to the point where distributed training is initialized and then gets stuck. From nvidia-smi I can see both GPUs at 100% utilization, but training does not start (weight_logs also does not get created).

    I have attached a screenshot of where it gets stuck (omitted). Can you please help me figure out what this might be?

    Thank you for your time.
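    As a first diagnostic, it can help to check whether plain NCCL communication between the two GPUs works at all, independent of the training code; the script below is a generic sanity check (not part of this repo) to be launched with python -m torch.distributed.launch --nproc_per_node=2 check_dist.py. If this also hangs at the all_reduce, disabling GPU peer-to-peer transfers with the NCCL_P2P_DISABLE=1 environment variable is a common workaround to try.

    import argparse
    import torch
    import torch.distributed as dist

    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=0)  # injected by torch.distributed.launch
    args = parser.parse_args()

    torch.cuda.set_device(args.local_rank)
    dist.init_process_group(backend="nccl", init_method="env://")

    x = torch.ones(1, device="cuda") * dist.get_rank()
    dist.all_reduce(x)  # hangs here too if inter-GPU NCCL traffic is the problem
    print("rank", dist.get_rank(), "all_reduce result:", x.item())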

    opened by tushardmaske 2
Owner
Hang
Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments Paper: arXiv (ICRA 2021) Video : https://youtu.be/CC

Sachini Herath 68 Jan 3, 2023
Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)

Self-Supervised Multi-Frame Monocular Scene Flow 3D visualization of estimated depth and scene flow (overlayed with input image) from temporally conse

Visual Inference Lab @TU Darmstadt 85 Dec 22, 2022
[ICCV 2021] Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation

EPCDepth EPCDepth is a self-supervised monocular depth estimation model, whose supervision is coming from the other image in a stereo pair. Details ar

Rui Peng 110 Dec 23, 2022
Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Official implementation for paper "Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR"

Ziyue Feng 72 Dec 9, 2022
the official code for ICRA 2021 Paper: "Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation"

G2S This is the official code for ICRA 2021 Paper: Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation by Hemang

NeurAI 4 Jul 27, 2022
Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency[ECCV 2020]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency(ECCV 2020) This is an official python implementati

null 304 Jan 3, 2023
Self-supervised Multi-modal Hybrid Fusion Network for Brain Tumor Segmentation

JBHI-Pytorch This repository contains a reference implementation of the algorithms described in our paper "Self-supervised Multi-modal Hybrid Fusion N

FeiyiFANG 5 Dec 13, 2021
arxiv-sanity, but very lite, simply providing the core value proposition of the ability to tag arxiv papers of interest and have the program recommend similar papers.

arxiv-sanity, but very lite, simply providing the core value proposition of the ability to tag arxiv papers of interest and have the program recommend similar papers.

Andrej 671 Dec 31, 2022
Listing arxiv - Personalized list of today's articles from ArXiv

Personalized list of today's articles from ArXiv Print and/or send to your gmail

Lilianne Nakazono 5 Jun 17, 2022
Arxiv harvester - Poor man's simple harvester for arXiv resources

Poor man's simple harvester for arXiv resources This modest Python script takes

Patrice Lopez 5 Oct 18, 2022
The Self-Supervised Learner can be used to train a classifier with fewer labeled examples needed using self-supervised learning.

Published by SpaceML • About SpaceML • Quick Colab Example Self-Supervised Learner The Self-Supervised Learner can be used to train a classifier with

SpaceML 92 Nov 30, 2022
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

null 42 Dec 23, 2022
The pytorch implementation of SOKD (BMVC2021).

Semi-Online Knowledge Distillation Implementations of SOKD. Requirements This repo was tested with Python 3.8, PyTorch 1.5.1, torchvision 0.6.1, CUDA

null 4 Dec 19, 2021
Official PyTorch Implementation of Mask-aware IoU and maYOLACT Detector [BMVC2021]

The official implementation of Mask-aware IoU and maYOLACT detector. Our implementation is based on mmdetection. Mask-aware IoU for Anchor Assignment

Kemal Oksuz 11 Oct 21, 2021
[BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations"

DomainMix [BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations" [paper] [de

Wenhao Wang 17 Dec 20, 2022
Code for BMVC2021 "MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation"

MOS-Multi-Task-Face-Detect Introduction This repo is the official implementation of "MOS: A Low Latency and Lightweight Framework for Face Detection,

null 104 Dec 8, 2022
This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

3D-CVF This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object

YecheolKim 97 Dec 20, 2022
a practicable framework used in Deep Learning. So far UDL only provide DCFNet implementation for the ICCV paper (Dynamic Cross Feature Fusion for Remote Sensing Pansharpening)

UDL UDL is a practicable framework used in Deep Learning (computer vision). Benchmark codes, results and models are available in UDL, please contact @

Xiao Wu 11 Sep 30, 2022