PyTorch code for "Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation"

Overview

Self-Supervised-MVS

This repository is the official PyTorch implementation of our AAAI 2021 paper:

"Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation" [paper] [Arxiv]

The training code is released in jdacs/ and jdacs-ms/.

JDACS uses MVSNet as its backbone, while JDACS-MS uses a multi-stage MVSNet, such as CVP-MVSNet, as its backbone.

You can replace the backbone network with other models in the MVSNet series. We will also release another implementation with CascadeMVSNet as the backbone in jdacs-ms-v2/ in a few days.

Introduction

This project is inspired by many previous MVS works, such as MVSNet and CVP-MVSNet. However, the requirement for large-scale ground-truth data limits the development of these learning-based MVS methods. Hence, our model focuses on an unsupervised setting based on a self-supervised photometric consistency loss.
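
For intuition, here is a minimal sketch of such a photometric consistency loss (generic PyTorch; `warp_to_ref` is a hypothetical placeholder for a differentiable warping function, not the repository's exact API):

```python
import torch

def photometric_loss(ref_img, src_img, depth, warp_to_ref):
    # Warp the source image into the reference view using the
    # predicted depth; warp_to_ref stands in for a differentiable
    # (inverse) warping function.
    warped_src, valid_mask = warp_to_ref(src_img, depth)
    # Self-supervision assumes corresponding pixels share the same
    # color across views (color constancy); violations of this
    # assumption corrupt the loss below.
    diff = torch.abs(ref_img - warped_src) * valid_mask
    return diff.sum() / valid_mask.sum().clamp(min=1.0)
```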

However, existing unsupervised methods rely on the assumption that corresponding points across different views share the same color, which may not always hold in practice. This can lead to an unreliable self-supervision signal and harm the final reconstruction performance. We call this the color constancy ambiguity problem, as shown in the following figure:

To address this issue, we propose a novel self-supervised MVS framework integrated with more reliable supervision guided by semantic co-segmentation and data augmentation. Specifically, we mine mutual semantic information from multi-view images to guide semantic consistency, and we devise an effective data-augmentation mechanism that ensures transformation robustness by treating the predictions on regular samples as pseudo ground truth to regularize the predictions on augmented samples. A brief illustration of our proposed framework is shown in the following figure:
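
In code terms, a minimal sketch of this pseudo-label consistency idea looks as follows (hypothetical names and signatures; the actual loss terms live in the training code under jdacs/ and jdacs-ms/):

```python
import torch
import torch.nn.functional as F

def augmentation_consistency_loss(model, imgs, proj_mats, augment):
    # Depth predicted on the regular (non-augmented) sample acts as
    # pseudo ground truth, so no gradients flow through it.
    with torch.no_grad():
        pseudo_depth = model(imgs, proj_mats)
    # The same scene under augmentation (e.g. color jitter, masking)
    # should yield the same depth: transformation robustness used as
    # a self-supervised regularizer.
    aug_depth = model(augment(imgs), proj_mats)
    return F.smooth_l1_loss(aug_depth, pseudo_depth)
```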

Log

2021 February 13

  • Our paper was awarded Distinguished Paper at AAAI-21!

2021 April 11

  • The training code of JDACS is released.

2021 April 20

  • The training code of JDACS-MS is released.

Example

We provide several examples of 3D scenes reconstructed with our proposed method:

scan001

scan114

scan118

Citation

If you find this work helpful, please cite:

@inproceedings{xu2021self,
  title={Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation},
  author={Xu, Hongbin and Zhou, Zhipeng and Qiao, Yu and Kang, Wenxiong and Wu, Qiuxia},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2021}
}

Acknowledgement

We acknowledge the following repositories: MVSNet and MVSNet_pytorch. Furthermore, the baseline of our self-supervised MVS method is partly based on Unsup_MVS. We also thank the authors of M3VSNet for their constructive advice on the experiments.

Comments
  • homo_warping vs inverse_warping


    Hi, thanks a lot for the amazing work. I am learning MVS and would like to ask about the difference between homo_warping and inverse_warping in your code.

    I understand that homo_warping warps features from one camera view to another using the homography H_i(d) = d · K_i · T_i · T_1^{-1} · K_1^{-1}.
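
    (For reference, a simplified sketch of plane-sweep warping in this style, assuming a single depth plane and a precomposed 3×4 relative projection; the repository's actual homo_warping differs in details:)

    ```python
    import torch
    import torch.nn.functional as F

    def warp_src_to_ref(src_feat, rel_proj, depth):
        # src_feat: [B, C, H, W] source-view features.
        # rel_proj: [B, 3, 4], the precomposed K_i T_i T_1^-1 K_1^-1 term.
        # depth: scalar depth of the sweeping plane (a real cost volume
        # repeats this over many depth hypotheses).
        B, C, H, W = src_feat.shape
        dev = src_feat.device
        y, x = torch.meshgrid(
            torch.arange(H, dtype=torch.float32, device=dev),
            torch.arange(W, dtype=torch.float32, device=dev),
            indexing="ij",
        )
        # Reference pixels lifted to homogeneous points at the plane depth.
        pts = torch.stack([x, y, torch.ones_like(x)], 0).reshape(1, 3, -1) * depth
        pts = torch.cat([pts.expand(B, -1, -1),
                         torch.ones(B, 1, H * W, device=dev)], dim=1)  # [B, 4, H*W]
        uvw = rel_proj @ pts                                           # [B, 3, H*W]
        uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)                  # perspective divide
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        grid = torch.stack([uv[:, 0] / ((W - 1) / 2) - 1,
                            uv[:, 1] / ((H - 1) / 2) - 1], dim=-1).reshape(B, H, W, 2)
        return F.grid_sample(src_feat, grid, align_corners=True)
    ```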

    Could you please explain inverse_warping a bit, and its function in your code?

    If inverse_warping is just the reverse of the homography, as I understand it, in what situations do you use homo_warping versus inverse_warping?

    Thanks in advance.

    opened by TWang1017 6
  • Cross-view Masking


    Thanks for your amazing work, but I have a question about cross-view masking. According to the code provided, it seems that you only mask out some regions in the reference view but do not mask out the corresponding areas in the source views, which is inconsistent with the statement in the paper. What's more, I think that if you did mask out the corresponding areas in the source views, you would need the ground-truth depth, which is not allowed in the self-supervised MVS setting. Is my understanding correct? Looking forward to your reply.
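
    (For context, a minimal sketch of what masking only the reference view could look like; this is generic cutout-style masking with hypothetical names, not the repository's exact implementation:)

    ```python
    import torch

    def mask_reference_view(ref_img, num_boxes=3, box_size=32):
        # Zero out random rectangles in the reference image only.
        # Source views are left untouched: locating the corresponding
        # regions there would require depth, which self-supervised
        # training does not have.
        masked = ref_img.clone()
        _, _, H, W = masked.shape
        for _ in range(num_boxes):
            y = torch.randint(0, H - box_size, (1,)).item()
            x = torch.randint(0, W - box_size, (1,)).item()
            masked[:, :, y:y + box_size, x:x + box_size] = 0.0
        return masked
    ```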

    opened by Zhu980805 2
  • Out of memory when trying to train the model on an RTX 3070


    Thanks for your excellent work!

    After reading your paper, I am very interested in trying this out. I have an RTX 3070 card with a total of 8 GB of video memory. I set the batch_size to 1, but training still runs out of memory:

    ```
    RuntimeError: CUDA out of memory. Tried to allocate 480.00 MiB (GPU 0; 7.79 GiB total capacity; 5.06 GiB already allocated; 155.00 MiB free; 5.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
    ```

    I don't have the funds for a more advanced card right now. Is there any other way to reduce the amount of video memory needed when training the network? Or could you please share your trained xxx.ckpt file with me directly? My email address is [email protected].
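
    (Not specific to this repository, but one generic way to cut activation memory in a PyTorch training loop is automatic mixed precision; a minimal sketch, assuming model, loader, criterion, and optimizer are already set up as in the training script:)

    ```python
    import torch

    scaler = torch.cuda.amp.GradScaler()

    for sample in loader:
        optimizer.zero_grad()
        # Run the forward pass in float16 where numerically safe;
        # this roughly halves activation memory on an 8 GB card.
        with torch.cuda.amp.autocast():
            outputs = model(sample)
            loss = criterion(outputs, sample)
        # Scale the loss to avoid float16 gradient underflow.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
    ```

    Reducing the input resolution or the number of depth hypotheses also lowers memory use, at some cost in accuracy.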

    Anyway, thank you for your excellent work and I wish you good health and success again!

    opened by CZH-cpu 1
  • On PyTorch Version & Training Time


    Thanks for this great work! I have two quick questions:

    1. Will PyTorch 1.4.0 be OK for running this code? I notice that the recommended version is 1.1.0.
    2. How long will it take to train JDACS (w/o MS)? I notice that the README says training JDACS-MS can take several days with 4 GPUs. Is training JDACS less time-consuming?

    BTW, is there any Python implementation of the evaluation code, which is currently implemented with Matlab?

    Many thanks.

    opened by II-Matto 1
  • loss multiply 15


    I tried to train this network with batch_size 8 for 20 epochs, but I can't understand why the training loss doesn't decline and hovers around 10%. I can see that the loss is in losses/unsup_loss and is combined with a pretrained VGG block; the loss is 12 * self.reconstr_loss + 6 * self.ssim_loss + 0.05 * self.smooth_loss, i.e. (0.8*x + 0.2*y + 0.05*z) * 15.

    I can't understand why the loss must be multiplied by 15.

    I hope somebody can explain this. Thank you.
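
    (As an aside, a constant factor on the loss does not change which predictions are optimal; it only rescales gradient magnitudes, much like a larger learning rate. A minimal sketch of the quoted combination, with hypothetical tensor arguments:)

    ```python
    import torch

    def unsup_loss(reconstr_loss: torch.Tensor,
                   ssim_loss: torch.Tensor,
                   smooth_loss: torch.Tensor) -> torch.Tensor:
        # Weighted combination as quoted above; a global scale on this
        # sum multiplies every gradient by the same constant, acting
        # like a learning-rate multiplier, not a different objective.
        return 12 * reconstr_loss + 6 * ssim_loss + 0.05 * smooth_loss
    ```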

    opened by lilipopololo 1
  • size incompatibility


    Hi, thanks for the amazing work. I encountered the error below when running the train.py script. Any idea what happened?

    File "c:\Users\xxx\Desktop\jdacs-ms\models\network.py", line 72, in forward conv6 = conv0 + self.conv6(conv5) RuntimeError: The size of tensor a (11) must match the size of tensor b (12) at non-singleton dimension 2

    opened by TWang1017 0
  • memory error


    When I run test.sh with nsrc set to 3 and nscale set to 1, the terminal reports this error:

    ```
    attribute lookup MemoryError on numpy.core._exceptions failed
    DefaultCPUAllocator: not enough memory: you tried to allocate 23040000 bytes. Buy new RAM!
    ```

    I have 12 GB of VRAM and 16 GB of RAM. What can I do?

    opened by lilipopololo 0
  • Loss weights and resultant curve issue


    Hi, thanks a lot for your swift response; your reminder helps a lot.

    One more thing: I trained on the DTU dataset with augmentation and co-segmentation deactivated. The training loss looks like the figure below; the SSIM loss dominates the standard unsupervised loss under the default weights (12 * self.reconstr_loss (photo_loss) + 6 * self.ssim_loss + 0.05 * self.smooth_loss). In this case, is it sensible to change the weights, e.g. reduce 6 * self.ssim_loss to 1 * self.ssim_loss so that it is in a similar range to reconstr_loss?

    Also, training does not seem steady; it fluctuates a lot. Any clues as to why this happens? Thanks in advance for your help.

    [image: training loss curves]

    Originally posted by @TWang1017 in https://github.com/ToughStoneX/Self-Supervised-MVS/issues/22#issuecomment-1339018531

    opened by TWang1017 2
  • failed


    ```
    CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
      Compatibility with CMake < 2.8.12 will be removed from a future version of CMake.
      Update the VERSION argument value or use a ... suffix to tell CMake that
      the project does not need compatibility with older versions.

    CMake Warning at /home/camellia/anaconda3/envs/JDACS/lib/python3.8/site-packages/cmake/data/share/cmake-3.22/Modules/FindCUDA.cmake:1054 (message):
      Expecting to find librt for libcudart_static, but didn't find it.
    Call Stack (most recent call first):
      CMakeLists.txt:5 (find_package)

    -- Could NOT find OpenMP_C (missing: OpenMP_pthread_LIBRARY) (found version "3.1")
    -- Could NOT find OpenMP_CXX (missing: OpenMP_pthread_LIBRARY) (found version "3.1")
    -- Could NOT find OpenMP (missing: OpenMP_C_FOUND OpenMP_CXX_FOUND)
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/camellia/zyf/Self-Supervised-MVS-main/jdacs/fusion/fusibile/fusibile/build
    Consolidate compiler generated dependencies of target fusibile
    [ 33%] Linking CXX executable fusibile
    /usr/bin/ld: warning: //home/camellia/anaconda3/envs/JDACS-MVS/lib/libgomp.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
    /usr/bin/ld: warning: //home/camellia/anaconda3/envs/JDACS-MVS/lib/libgomp.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
    /usr/bin/ld: /usr/local/cuda-10.0/lib64/libcudart_static.a(libcudart_static.a.o): undefined reference to symbol 'shm_unlink@@GLIBC_2.2.5'
    //lib/x86_64-linux-gnu/librt.so.1: error adding symbols: DSO missing from command line
    collect2: error: ld returned 1 exit status
    make[2]: *** [CMakeFiles/fusibile.dir/build.make:332: fusibile] Error 1
    make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/fusibile.dir/all] Error 2
    make: *** [Makefile:91: all] Error 2
    ```

    opened by zhao-you-fei 0
  • About the dataset used for evaluation


    Hi, sorry to disturb you. When I reproduce the quantitative performance of my own model with only the standard loss, I find a large difference between using the evaluation dataset you provided and the original evaluation dataset from dtu_yao.py. Specifically, when I use the depth maps from dtu_yao.py (using the test list) to get point clouds via fusion.py, the number of points is dramatically low (about 1 or 2 million), which makes the acc. and comp. too high (about 3.3). But using your evaluation dataset produces much better results. I'm wondering where the reason lies.

    opened by knightwzh 0
  • Testing on TT Dataset


    Hi, thanks for your excellent work. The details of my tests on the Tanks and Temples dataset are not as good as yours; could you share the test code for the Tanks and Temples dataset?

    opened by Lingxingxing 0