PyTorch implementation of MixNMatch

Last update: Dec 30, 2022

Overview

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation
[Paper]

Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee
UC Davis
In CVPR, 2020

1/31/2020 update: Code and models released.

Demo Video

This is our CVPR2020 presentation video link

Web Demo

For an interactive web demo, click here. The web demo was created by Yang Xue.

Requirements

Linux
Python 3.7
PyTorch 1.3.1
NVIDIA GPU + CUDA + cuDNN

Getting started

Clone the repository

git clone https://github.com/Yuheng-Li/MixNMatch.git
cd MixNMatch

Setting up the data

Download the formatted CUB data from this link and extract it inside the data directory.

Downloading pretrained models

Pretrained models for CUB, Dogs and Cars are available at this link. Download and extract them in the models directory.

Evaluating the model

In code

Run:
python eval.py --z path_to_pose_source_images --b path_to_bg_source_images --p path_to_shape_source_images --c path_to_color_source_images --out path_to_output --mode code_or_feature --models path_to_pretrained_models

For example:
python eval.py --z pose/pose-1.png --b background/background-1.png --p shape/shape-1.png --c color/color.png --mode code --models ../models --out ./code-1.png

- NOTE: (1) In feature mode, the pose source images are ignored. (2) The generator, encoder and feature extractor in the models folder should be named G.pth, E.pth and EX.pth.
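
The naming requirement in the note above can be checked before eval.py is called. The following is a minimal sketch and not part of the repository: the helper names missing_model_files and build_eval_command are hypothetical, and only the flags and checkpoint file names quoted above come from this README.

```python
from pathlib import Path

# Checkpoint names taken from the note above; everything else is illustrative.
REQUIRED_MODEL_FILES = ("G.pth", "E.pth", "EX.pth")


def missing_model_files(models_dir):
    """Return the required checkpoint names that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED_MODEL_FILES if not (root / name).is_file()]


def build_eval_command(z, b, p, c, out, mode, models):
    """Assemble the eval.py argument list from the flags documented above."""
    if mode not in ("code", "feature"):
        raise ValueError("mode must be 'code' or 'feature'")
    return [
        "python", "eval.py",
        "--z", str(z), "--b", str(b), "--p", str(p), "--c", str(c),
        "--mode", mode, "--models", str(models), "--out", str(out),
    ]
```

The returned list can be passed to subprocess.run(..., check=True) from the code directory once missing_model_files reports an empty list.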

Training your own model

In code/config.py:

Specify the dataset location in DATA_DIR.
- NOTE: If you wish to train this on your own (different) dataset, please make sure it is formatted in a way similar to the CUB dataset that we've provided.
Specify the number of super and fine-grained categories that you wish for FineGAN to discover, in SUPER_CATEGORIES and FINE_GRAINED_CATEGORIES.

For first-stage training, run:
python train_first_stage.py output_name

For second-stage training, run:
python train_second_stage.py output_name path_to_pretrained_G path_to_pretrained_E
- NOTE: Output will be written to output/output_name.
- NOTE: path_to_pretrained_G will be output/output_name/Model/G_0.pth.
- NOTE: path_to_pretrained_E will be output/output_name/Model/E_0.pth.

For example:
python train_second_stage.py Second_stage ../output/output_name/Model/G_0.pth ../output/output_name/Model/E_0.pth
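
The path conventions in the notes above can be captured in a small helper. This is a sketch under stated assumptions: the function names and the output_root default of ../output (taken from the example command) are hypothetical, and only the Model/G_0.pth and Model/E_0.pth layout comes from this README.

```python
from pathlib import Path


def stage_one_checkpoints(output_name, output_root="../output"):
    """Return the (G, E) checkpoint paths described in the notes above."""
    model_dir = Path(output_root) / output_name / "Model"
    return model_dir / "G_0.pth", model_dir / "E_0.pth"


def second_stage_command(second_output_name, first_output_name, output_root="../output"):
    """Build the train_second_stage.py argument list from a first-stage run name."""
    g_path, e_path = stage_one_checkpoints(first_output_name, output_root)
    return [
        "python", "train_second_stage.py", second_output_name,
        g_path.as_posix(), e_path.as_posix(),
    ]
```

For instance, second_stage_command("Second_stage", "output_name") reproduces the arguments of the example command above.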

Results

1. Extracting all factors from different real images to synthesize a new image

2. Comparison between the feature and code mode

3. Manipulating real images by varying a single factor

4. Inferring style from unseen data

Cartoon -> image	Sketch -> image

5. Converting a reference image according to a reference video

Citation

If you find this useful in your research, consider citing our work:

@inproceedings{li-cvpr2020,
  title = {MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation},
  author = {Yuheng Li and Krishna Kumar Singh and Utkarsh Ojha and Yong Jae Lee},
  booktitle = {CVPR},
  year = {2020}
}

Comments

Bounding Box

What implementation did you use to get the bounding box in that format? I can see that (x, y, w, h) is the pixel distance from the top and left, I just don't know of a library that uses that format.
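
For context, the original CUB-200-2011 release lists each box as x, y, width, height, which matches the reading in the question. Whether the formatted data linked above keeps that order is an assumption. A minimal sketch of converting between that format and corner coordinates follows; the function names are hypothetical.

```python
def xywh_to_corners(box):
    """Convert (x, y, w, h) to (left, top, right, bottom) in pixels."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def corners_to_xywh(corners):
    """Convert (left, top, right, bottom) back to (x, y, w, h)."""
    left, top, right, bottom = corners
    return (left, top, right - left, bottom - top)
```

The corner form is what PIL's Image.crop expects, so boxes produced by a detector that reports corners can be converted with corners_to_xywh before being written out.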

opened by avilatner 3

Question about eval result!!

Hi, thanks for sharing! It is very interesting. I tested some bird pictures with the pre-trained model using eval.py, and I have a question.

I ran this command and got feature-2.png:
python eval.py --z pose/pose-2.png --b background/background-2.png --p shape/shape-2.png --c color/color-2.png --mode feature --models ../models/bird --out ./feature-2.png

Then I ran this command and got code-2.png:
python eval.py --z pose/pose-2.png --b background/background-2.png --p shape/shape-2.png --c color/color-2.png --mode code --models ../models/bird --out ./code-2.png

feature-2.png is the same as /code/result/0001.png. But when I compare the outputs against pose-2.png, background-2.png, shape-2.png and color-2.png (check image):

Conclusion:

1. feature-2.png contains:

- background of background-2.png (checked)
- shape of shape-2.png (checked)
- texture of color-2.png (checked)
- but different pose from pose-2.png (failed)

2. code-2.png contains:

- background of background-2.png (checked)
- shape of shape-2.png (maybe checked)
- texture of color-2.png (checked)
- but different pose from pose-2.png (checked)

My question is: why does this happen?

opened by Johnson-yue 3

Large amount of data is needed?

I took the top 6 categories in the CUB dataset (323 pictures) and only changed SUPER_CATEGORIES to 3 and FINE_GRAINED_CATEGORIES to 6 to run train_first_stage.py, but got bad results. In my opinion, a deep learning model will easily fit a dataset with a small amount of data, so it seems that your model needs a large amount of data to train? I look forward to your reply. Thanks.

opened by hfz223322 2

Where are the trained models?

As I understand it, to run eval.py and create a new image (from 4 images) I need the generator (G.pth) and the encoder (E.pth). After the first training stage I got a folder with a Model folder that contains the following models: BD_0, DO_O, D1_0, D2_0, E_0, G_0. After training the second stage I have EX_0.pth. My general question is: what should I do next to make a new image? I need paths to 4 images, which is fine, but I also need a path to a folder with 3 pretrained models, G.pth, E.pth and EX.pth. Where are they? Do I need to take E_0 and G_0 (from the first training stage) and EX_0 (from the second training stage), put them together in the same folder, and use that folder as the models path?

A few more questions:
- Is EX.pth, which comes from the second training stage, for feature mode?
- For eval.py, why do I need both E.pth and EX.pth?
- How many epochs are needed (first and second stage) to get the same results on the birds dataset? The paper says nothing about that.

Thanks

opened by orydatadudes 2

Question about eval result and re-train result

Hi, thanks for your help. I have finished re-training all stages. There are 6 result pictures:

Format

| - | - | - | - |
|:---: | :---: | :---: | :---: |
| pose_file | background_file | shape_file | color_file |
| retrain_gen_feature_mode | retrain_gen_code_mode | pretrain_gen_code_mode | pretrain_gen_feature_mode |

Pictures:

| pic name | pic |
|:--- | :---: |
| pic_1 (pose-1, background-1, shape-1, color-1) | |
| pic_2 (pose-2, background-2, shape-2, color-2) | |
| pic_3 (pose-3, background-3, shape-3, color-3) | |
| pic_4 (pose-4, background-4, shape-4, color-4) | |
| pic_5 (pose-5, background-5, shape-5, color-5) | |
| pic_6 (pose-6, background-6, shape-6, color-6) | |

My questions:

Q1: Why is the object in pic_1 out of the picture in code mode, in both retrain_gen_code_mode and pretrain_gen_code_mode?

Q2: My retrained model weights may have failed to capture the texture feature (see all retrain_gen pictures). Do you have any experience with this? Is the reason a different random seed or something else?

Q3: Nothing is generated except the background [such as pic_5 (retrain_gen_code_mode)], or the background is bad [such as pic_6 (pretrain_gen_feature_mode)].

opened by Johnson-yue 2

Parameter details

Hi,

What is the significance of the parameters SUPER_CATEGORIES and FINE_GRAINED_CATEGORIES?

Also, is there a minimum recommended number of images for training? Do you have an image preparation script that formats images as model input for training?

opened by sridhar21111976 1

Why not initialize the weights of the BD networks?

Hi, in the load_networks function only three networks are initialized: netG, netDs and encoder.

BD is the only one that is not initialized. Why? Is this based on your experimental performance?

opened by Johnson-yue 1

Wrong comment

background_stage network: this line comment

Because ngf = cfg.GAN_GF_DIM = 64, ngf*8 = 512. The output feature of self.fc is ngf*8*4*4 = 512*4*4, not 1024*4*4. So all the comments in this function are wrong.
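
The arithmetic in this comment can be checked directly. The snippet below only reproduces the numbers quoted above (ngf = 64 as stated by the commenter); it does not read the repository's config.

```python
# Value quoted in the comment above: ngf = cfg.GAN_GF_DIM = 64.
ngf = 64

channels = ngf * 8             # 512 channels
fc_out = channels * 4 * 4      # 8192 features out of self.fc
claimed = 1024 * 4 * 4         # 16384, the size the code comment states

print(channels, fc_out, claimed)
```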

opened by Johnson-yue 1

Change in output image size from 128 * 128 px to custom size

I am trying to train the model on a custom dataset. The training images are 400 * 400 px, but at inference the generated output is 128 * 128 px.

Is it possible to change the size of the output image? If yes, what modification is required in the code?

opened by imgugale 1

Converting a reference image according to a reference video

Does this use the video frames as the pose images, with the background, texture and shape all taken from the same image? Can you provide an example of converting a reference image according to a reference video with eval.py?

opened by ak9250 1

Mutual information discriminator

Hi, does the first p in the mutual information discriminator D(P|Pfm), used in the parent phase of FineGAN, refer to the latent code p or to the generated fake image p? I tried to read the FineGAN source code, but I couldn't understand how def train_Gnet in trainer.py implements the parent mutual information discriminator.

opened by Haosouth 0

What is the bounding box?

I finished reading the paper. It says that only a loose bounding box around the object is required to model the background, but the authors didn't explain this in detail, so I have some questions:
1. How do we define the bounding box, and what is it used for?
2. What should I do to get the bounding boxes if I want to train on my own dataset?

I'm looking forward to your reply.

opened by fanghao123-qw 3

Unable to set up the dataset

Hey guys, your work is so interesting that I want to set up this model, but I am having issues doing that. Can you please point me to some videos on how to set up the dataset for the model? I also want to contribute by training it on a different dataset. Can you please help me?

opened by Hani1-2 2

Unable to download pretrained models

Hey! Amazing work, guys! I am able to download the data, but when I try the pretrained models link it says Error 404 page not found. Have you moved the pretrained models from the given link?

opened by Prajje 1

Colab version please?

Hello. I love this repo and I'd like to use it. Colab notebooks tend to be easy to set up, but I tried setting one up just now and was unable to. Would you mind creating a Colab notebook version of this repo? Thank you!

opened by Tylersuard 2

Image resize

Hey,

Thanks for the great work and for releasing the code. I was wondering how the ratio (76/64) in the resize step here was chosen: https://github.com/Yuheng-Li/MixNMatch/blob/21095b3581c7d47f67ed1bb360ca8ac3db6c299f/code/datasets.py#L57

To extend the work to other image sizes for training, which resize ratio would you recommend?
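
As a rough illustration of what keeping the same 76/64 ratio would mean at other sizes, here is a minimal sketch. It assumes the ratio is used to scale an image slightly larger than the target size before cropping, which is a common augmentation pattern; whether the linked line works this way, and whether the ratio should stay fixed, are assumptions. The function name resize_target is hypothetical.

```python
def resize_target(imsize, ratio=76 / 64):
    """Size to scale to before cropping down to imsize, at the 76/64 ratio."""
    return int(imsize * ratio)


# Pre-crop sizes for a few target sizes, all at the same ratio.
sizes = {s: resize_target(s) for s in (64, 128, 256)}
print(sizes)
```

Under this assumption a 128 px target is scaled to 152 px first, which leaves the same relative margin for cropping as 76 px does for a 64 px target.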

Thanks!

opened by longw010 5

