SMIS - Semantically Multi-modal Image Synthesis (CVPR 2020)

Overview

Semantically Multi-modal Image Synthesis

Project page / Paper / Demo

[GIF demo]
Semantically Multi-modal Image Synthesis (CVPR 2020).
Zhen Zhu, Zhiliang Xu, Ansheng You, Xiang Bai

Requirements


  • torch>=1.0.0
  • torchvision
  • dominate
  • dill
  • scikit-image
  • tqdm
  • opencv-python
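
To set up the environment, an install along these lines should work (package names taken from the list above; only torch carries a stated lower bound):

```bash
# Minimal environment sketch assembled from the requirements list above.
pip install "torch>=1.0.0" torchvision dominate dill scikit-image tqdm opencv-python
```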

Getting Started


Data Preparation

DeepFashion
Note: We provide an example of the DeepFashion dataset. It is slightly different from the DeepFashion data used in our paper due to the impact of COVID-19.
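
The README does not spell out the expected directory layout, but the dataloader assertions quoted in the issue tracebacks below reference train_label and cihp_test_mask under --dataroot. Here is a small sanity-check sketch; the directory names are inferred from those tracebacks, not an authoritative spec:

```bash
# Hypothetical check for the directories the dataloader asserts on; the names
# come from the issue tracebacks below. Adjust DATAROOT to your own path.
DATAROOT=/path/to/deepfashion
for d in train_label cihp_test_mask; do
    [ -d "$DATAROOT/$d" ] || echo "missing: $DATAROOT/$d"
done
```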

Cityscapes
The Cityscapes dataset can be downloaded here

ADE20K
The ADE20K dataset can be downloaded here

Test/Train the models

Download the tarball of pretrained models from the Google Drive folder, save it in checkpoints/, and unzip it. The scripts folder contains deepfashion.sh, cityscapes.sh, and ade20k.sh. Change parameters such as --dataroot, then comment or uncomment the relevant lines to test or train a model. You can also specify --test_mask for the SMIS test. A concrete test invocation is sketched below.
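
For reference, here is a DeepFashion test invocation assembled from the flags reported in the issues below; the dataset path and GPU ids are placeholders for your own setup:

```bash
# Sketch of a DeepFashion test run; flags mirror those in the issue reports below.
# Adjust --dataroot and --gpu_ids to match your environment.
python test.py --name deepfashion_smis --dataset_mode deepfashion \
    --dataroot ./datasets/deepfashion --no_instance --gpu_ids 0 \
    --ngf 160 --batchSize 4 --model smis --netG deepfashion
```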

Acknowledgments


Our code is based on the popular SPADE

Comments
  • Error while testing

    Hello, thanks for publishing this project. When I try to test:

    python3 test.py --name deepfashion_smis --dataset_mode deepfashion --dataroot /root/filters/SMIS/data/deepfashion --no_instance --gpu_ids 1 --ngf 160 --batchSize 4 --model smis --netG deepfashion

    After placing the trained model in /checkpoints I get the following error:

    Traceback (most recent call last):
      File "test.py", line 19, in <module>
        dataloader = data.create_dataloader(opt)
      File "/root/filters/SMIS/data/__init__.py", line 44, in create_dataloader
        instance.initialize(opt)
      File "/root/filters/SMIS/data/pix2pix_dataset.py", line 22, in initialize
        label_paths, image_paths, instance_paths = self.get_paths(opt)
      File "/root/filters/SMIS/data/deepfashion_dataset.py", line 30, in get_paths
        label_paths_all = make_dataset(label_dir, recursive=False)
      File "/root/filters/SMIS/data/image_folder.py", line 49, in make_dataset
        assert os.path.isdir(dir) or os.path.islink(dir), '%s is not a valid directory' % dir
    AssertionError: /root/filters/SMIS/data/deepfashion/cihp_test_mask is not a valid directory

    What am I doing wrong?

    opened by ofirkris 5
  • Error while running test

    I use Windows 10 with PyTorch 1.0.0 and torchvision 0.2.2.post3; I also tried PyTorch 1.4.0 with torchvision 0.5.0 and got the same error. I added the DeepFashion dataset and the pretrained model, then ran the test:

    python test.py --name deepfashion_smis --dataset_mode deepfashion --dataroot D:/dataset/deepfashion --no_instance --gpu_ids 0 --ngf 160 --batchSize 4 --model smis --netE conv --netG deepfashion

    Error: [error screenshot] Please help me!

    opened by thanhtin1997 3
  • About dataset

    Hello, I downloaded the DeepFashion dataset from your home page, but your documentation does not describe how the dataset should be laid out. It is somewhat difficult to understand, and the following error occurred; I hope you can help:

    E:/codeproject/SMIS-master/train.py
    Traceback (most recent call last):
      File "E:/codeproject/SMIS-master/train.py", line 21, in <module>
        dataloader = data.create_dataloader(opt)
      File "E:\codeproject\SMIS-master\data\__init__.py", line 44, in create_dataloader
        instance.initialize(opt)
      File "E:\codeproject\SMIS-master\data\pix2pix_dataset.py", line 22, in initialize
        label_paths, image_paths, instance_paths = self.get_paths(opt)
      File "E:\codeproject\SMIS-master\data\coco_dataset.py", line 35, in get_paths
        label_paths = make_dataset(label_dir, recursive=False, read_cache=True)
      File "E:\codeproject\SMIS-master\data\image_folder.py", line 49, in make_dataset
        assert os.path.isdir(dir) or os.path.islink(dir), '%s is not a valid directory' % dir
    AssertionError: ./datasets/deepfashion/train_label is not a valid directory

    opened by 2805413893 0
  • How to train a new model? There are only .sh scripts for testing

    Can I make some new data for training? After reading the paper, I believe this network can fit different applications, and I want to give it a try.

    opened by SteveVanWang 0
  • (mCSD and mOCD)

    Hello! This is good work.

    I think the new metrics you designed (mCSD and mOCD) are meaningful, but they do not seem to be included in the open-source code. Would you be willing to release the relevant code?

    Looking forward to your reply!

    opened by fuyou123 0
  • deepfashion test

    Hi! I appreciate the authors' efforts. I tried the test; the cloth try-on is nice, but the person's face is not good. [result images: 00000012_02_1, 00000020_06_7, 00000026_05_7]

    It seems that the face restoration is not that good. Is there any solution for this problem? Perhaps use a mask that does not change the face? Any suggestions? Thanks!

    opened by amandazw 0
  • mean Class-Specific Diversity (mCSD) and mean Other-Classes Diversity (mOCD)

    Hello! This is good work.

    I think the new metrics you designed (mCSD and mOCD) are meaningful, but they do not seem to be included in the open-source code. Would you be willing to release the relevant code?

    Looking forward to your reply!

    opened by thinkerthinker 1
  • Some problems about ADE20K training

    Hello! Thank you for your wonderful project! When I trained on the ADE20K dataset, some problems appeared. What is the value of --use_vae in ade20k.sh? I set it to 10 to start training, but the speed is very slow. What hardware did you train on, and about how long did the training take? Thank you! Looking forward to your reply!

    opened by VicZlq 1
  • How to transfer the cloth of one person to another?

    Hi @Seanseattle and the team. Congratulations and thank you for the wonderful work!

    Using the file test.py, I have been able to generate multiple images in which the pose of the model comes from the input image, while the clothes are randomly generated. I would like to go a bit further by reproducing the test from the paper where you take the clothes of one person and put them on another model. I couldn't find out how to do it with your source code. Could you please give me some instructions on this?

    Thank you very much!

    opened by doantientai 2
  • Sample Notebook

    This paper is awesome, so I would like to request a sample notebook covering custom-data interpolation/style mixing, as the documents are not clear. Greetings!

    opened by shahik 0