DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

Overview

DatasetGAN

This is the official code and data release for:

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

Yuxuan Zhang*, Huan Ling*, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

CVPR'21, Oral [paper] [supplementary] [Project Page]

News

  • Benchmark Challenge - A benchmark with diverse testing images is coming soon -- stay tuned!

  • The generated dataset for downstream tasks is coming soon -- stay tuned!

License

Any code dependency related to StyleGAN is covered by the Creative Commons BY-NC 4.0 license by NVIDIA Corporation. To view a copy of this license, visit LICENSE.

The code of DatasetGAN is released under the MIT license. See LICENSE for additional details.

The dataset of DatasetGAN is released under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made.

Requirements

  • Python 3.6 and 3.7 are supported.
  • PyTorch 1.4.0 or later is recommended.
  • This code is tested with the CUDA 10.2 toolkit and cuDNN 7.5.
  • Please check the Python package requirements in requirements.txt and install them with
pip install -r requirements.txt

Download the dataset from Google Drive and put it in the folder ./datasetGAN/dataset_release. Please be aware that the dataset of DatasetGAN is released under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation.

Download the pretrained checkpoint from StyleGAN and convert the TensorFlow checkpoint to PyTorch. Put the checkpoints in the folder ./datasetGAN/dataset_release/stylegan_pretrain. Please be aware that any code dependency or checkpoint related to StyleGAN is licensed under the Creative Commons BY-NC 4.0 license by NVIDIA Corporation.

Note: a good example of converting a StyleGAN TensorFlow checkpoint to PyTorch is available at this link.
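As a quick sanity check that the conversion succeeded, you can inspect the converted checkpoint before training. This is a minimal sketch, not part of the official tooling; it assumes the converted file holds either a plain state dict or a full module, and the path follows the naming used in the experiment configs (relative to the datasetGAN folder).

import torch

# Sanity-check a converted StyleGAN checkpoint (sketch only, not official tooling).
ckpt_path = './dataset_release/stylegan_pretrain/karras2019stylegan-ffhq-1024x1024.for_g_all.pt'
obj = torch.load(ckpt_path, map_location='cpu')

# The conversion may have saved a bare state dict or a full nn.Module;
# normalize to a name -> tensor mapping either way.
state = obj if isinstance(obj, dict) else obj.state_dict()
print(len(state), 'tensors')
for name, value in list(state.items())[:8]:
    print(name, tuple(value.shape) if hasattr(value, 'shape') else type(value))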

Training

To reproduce the paper DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort:

cd datasetGAN
  1. Run Step 1: interpreter training.
  2. Run Step 2: GAN sampling, to generate a massive image-annotation dataset.
  3. Run Step 3: train the downstream task.

1. Interpreter Training

python train_interpreter.py --exp experiments/.json 

Note: Training on 16 images takes around one hour and requires roughly 160 GB of RAM. One can cache the data returned from the prepare_data function to disk, but this increases training time due to the I/O burden.
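The caching mentioned above is not implemented in the repo. A minimal sketch of one way to do it, assuming prepare_data returns (features, masks, num_data) as in the call quoted in the comments below; the wrapper name and cache path are placeholders:

import os
import torch

# from train_interpreter import prepare_data  # assumes the repo's function is importable

def prepare_data_cached(args, palette, cache_path='prepare_data_cache.pt'):
    # Hypothetical wrapper: compute the per-pixel features once, save them,
    # and reload from disk on later runs (trading RAM for I/O, as noted above).
    if os.path.exists(cache_path):
        return torch.load(cache_path)
    features, masks, num_data = prepare_data(args, palette)
    torch.save((features, masks, num_data), cache_path)
    return features, masks, num_data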

Example of the annotation schema for the face class. Please refer to the paper for other classes.


2. Run GAN Sampling

python train_interpreter.py \
--generate_data True --exp experiments/.json  \
--resume [path-to-trained-interpreter in step 1] \
--num_sample [num-samples]

To run sampling processes in parallel:

sh datasetGAN/script/generate_face_dataset.sh

Example of sampled images and annotations:


3. Train Downstream Task

python train_deeplab.py \
--data_path [path-to-generated-dataset in step 2] \
--exp experiments/.json
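Before launching downstream training, it can help to confirm that the sampling step actually produced data. A small sketch, assuming the loader globs *.pickle files from the data path (as the train_deeplab.py line quoted in the comments below suggests); the path is a placeholder:

import glob

# Placeholder path: point this at the dataset generated in step 2.
data_path = './generated_dataset'
pickles = sorted(glob.glob(data_path + '/*.pickle'))
print('found', len(pickles), 'pickle files under', data_path)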

Inference


python test_deeplab_cross_validation.py --exp experiments/face_34.json \
--resume [path-to-downstream-task checkpoint] --cross_validate True

June 21st update:

For interpreter training, we changed the upsampling method from nearest-neighbor to bilinear upsampling and updated the results in Table 1. The table reports mIoU.
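For reference, the two upsampling modes can be compared with torch.nn.functional.interpolate. This is an illustrative snippet, not the repo's exact call site; the feature-map shape is made up:

import torch
import torch.nn.functional as F

# One StyleGAN block's feature map, upsampled to the training resolution.
feat = torch.randn(1, 512, 32, 32)
up_nearest = F.interpolate(feat, size=(512, 512), mode='nearest')
up_bilinear = F.interpolate(feat, size=(512, 512), mode='bilinear', align_corners=False)
print(up_nearest.shape, up_bilinear.shape)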

Citations

Please use the following citation if you use our data or code:

@inproceedings{zhang2021datasetgan,
  title={Datasetgan: Efficient labeled data factory with minimal human effort},
  author={Zhang, Yuxuan and Ling, Huan and Gao, Jun and Yin, Kangxue and Lafleche, Jean-Francois and Barriuso, Adela and Torralba, Antonio and Fidler, Sanja},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10145--10155},
  year={2021}
}
Comments
  • Lacking documentation on how to create the average latent file

    The example configuration files (https://github.com/nv-tlabs/datasetGAN_release/tree/master/datasetGAN/experiments), e.g. "cat_16.json" or "car_20.json", contain fields named "average_latent" and "annotation_image_latent_path". Both of these fields are paths to .npy files.

    This repo does not describe how these files are generated. Even after one trains a StyleGAN model, one cannot then use DatasetGAN without these files. Please provide documentation on how to create this file for a custom dataset.

    I imagine that you'd want to provide examples using your up-to-date repos, ex. https://github.com/NVlabs/stylegan2-ada-pytorch and/or https://github.com/NVlabs/stylegan2-ada.

    opened by ericchansen 13
  • No detailed description of config is given.

    Hi. To run DatasetGAN, I need not only the model (the .pt file) but also the .npy binaries for average_latent and annotation_image_latent. However, the README only covers configurations based on the four pre-prepared config files. I understand that I need to write my own config and prepare the necessary data, but there is no explanation of how. Or is this code just a demo that is not supposed to work with other datasets? Thank you.

    opened by udemegane 11
  • stylegan2-ada-pytorch support

    Hello,

    Thanks for your great work. I just wanted to know how we can use StyleGAN2 pre-trained models for training the interpreter. Do you think you will support this in the future?

    I have trained my models with stylegan2-ada-pytorch from https://github.com/NVlabs/stylegan2-ada-pytorch

    opened by afotonower 7
  • Control graphics memory usage

    I got a CUDA OOM while running train_interpreter. I found a variable "batch_size" in train_interpreter.py, but changing it doesn't seem to help. Can I limit the memory usage (for example, from the JSON config)? I have a 2080 Ti board with 11 GB of GPU memory. Thank you.

    Opt {'exp_dir': 'model_dir/face_34', 'category': 'face', 'debug': False, 'dim': [512, 512, 5088], 'deeplab_res': 512, 'number_class': 34, 'testing_data_number_class': 34, 'max_training': 16, 'stylegan_ver': '1', 'annotation_data_from_w': False, 'annotation_mask_path': './dataset_release/annotation/training_data/face_processed', 'testing_path': './dataset_release/annotation/testing_data/face_34_class', 'average_latent': './dataset_release/training_latent/face_34/avg_latent_stylegan1.npy', 'annotation_image_latent_path': './dataset_release/training_latent/face_34/latent_stylegan1.npy', 'stylegan_checkpoint': './dataset_release/stylegan_pretrain/karras2019stylegan-ffhq-1024x1024.for_g_all.pt', 'model_num': 10, 'upsample_mode': 'bilinear'}

    /home/udemegane/anaconda3/envs/dataset/lib/python3.7/site-packages/torch/nn/functional.py:2973: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.

    Traceback (most recent call last):
      File "train_interpreter.py", line 578, in <module>
        main(opts)
      File "train_interpreter.py", line 446, in main
        all_feature_maps_train_all, all_mask_train_all, num_data = prepare_data(args, palette)
      File "train_interpreter.py", line 403, in prepare_data
        return_upsampled_layers=True, use_style_latents=args['annotation_data_from_w'])
      File "../utils/utils.py", line 94, in latent_to_image
        affine_layers_upsamples = torch.FloatTensor(1, number_feautre, dim, dim).cuda()
    RuntimeError: CUDA out of memory. Tried to allocate 4.97 GiB (GPU 0; 10.76 GiB total capacity; 7.61 GiB already allocated; 2.33 GiB free; 7.65 GiB reserved in total by PyTorch)

    opened by udemegane 7
  • Question about key point

    1. In the generate_data class, what is the purpose of using 10 models?
    2. I noticed that make_training_data.py seems to be used to generate data, but I want to train the first stage with my own labeled training data. How should I proceed?

    Thank you!

    opened by sky-fly97 6
  • Can I reduce the number of input features of the final classification layer

    At present, I use 256×256×3 images for training, and the final StyleGAN output is 256×256×4864. Due to memory limitations, I can only load the features of about 100 images at a time with 400 GB of RAM, and it takes dozens of minutes. Is there any way to use more images for training? For example, is it feasible to reduce the number of input features?

    opened by sky-fly97 4
  • ADE-Car-12 testing set and PASCAL-Car-5

    Hi, nice work! I am curious about how ADE-Car-12 and PASCAL-Car-5 were constructed. The paper says ADE-Car-20 contains 50 and 250 images for training and testing, respectively. I wonder what the exact images are (e.g., image indices). Are they preprocessed? Is there any instruction for constructing them? I also wonder which 900 images are taken from PASCAL-Part for PASCAL-Car-5 and how the 5 classes are defined. Thanks very much!

    opened by yangyu12 4
  • Confusion regarding checkpoints

    Can you please provide greater clarity on where to download the StyleGAN checkpoints?

    The linked repo has pickled models but not checkpoints, and certainly not .pt checkpoints. Also, it's a TensorFlow implementation, so I'm a little confused...

    Specifically, where can I find:

    • karras2019stylegan-cars-512x384.for_g_all.pt
    • karras2019stylegan-cats-256x256.for_g_all.pt
    • karras2019stylegan-celebahq-1024x1024.for_g_all.pt
    opened by bfialkoff 4
  • Missing pickle files for training deeplab

    Hello, thank you so much for providing the code for this paper :) I am trying to generate an annotated face dataset, but I am having some trouble with train_deeplab.py. At line 95 there is all_pickle = glob.glob(data_path + '/*.pickle'), which loads the 16 manually annotated images if I understood the docs correctly. However, I can't seem to find these .pickle files. I guess one can load the training/test data provided in dataset_release, but it would still be missing the uncertainty_score.

    opened by AbdouMechraoui 3
  • How to reduce the number of input features of pixel_classifier

    Hi,

    After training my own StyleGAN on 128×128 images in order to then use DatasetGAN, the code in "train_interpreter.py" indicates that the total number of features going into the pixel_classifier (the value dim[-1]) would be 79,691,776, which is far too large and not normal compared to the values in the configuration (.json) files.

    Is there a way to reduce or correct this value?

    Note: this value of 79,691,776 (= 128×128×4864) is the one given by the line feature_maps = feature_maps.reshape(-1, args["dim"][2]) in the prepare_data function.

    opened by PoissonChasseur 3
  • Training deeplab on higher resolution images

    Hi!

    Thanks once again for your responsiveness :)

    May I ask about the procedure used to train DeepLab-v3 on 1024 images? I noticed that the provided code samples with train_interpreter.py and trains train_deeplab.py on 512×512 images for the face_34 task.

    When doing the same for 1024×1024 images, the cross-validation script produces this: [image: trained_deeplab_1024_500]

    Could you provide more information on the procedure for training DeepLab on 1024×1024 images? How large was the training set you used? What number of epochs and batch size? I ask because I also ran into CUDA OOM errors when training on a 32 GB NVIDIA Tesla V100.

    opened by AbdouMechraoui 2
  • Cache Data to Decrease RAM

    Hi @arieling,

    Great job with this repository, it is awesome.

    On the README page you state that:

    "One can cache the data returned from the prepare_data function to disk, but this increases training time due to the I/O burden."

    How would I implement this?

    Thank You!

    opened by getshaun24 0
  • What is the exact PCK used for keypoint detection

    What is the exact PCK used for keypoint detection?

    To be more specific, PCK means the keypoint accuracy where a keypoint is counted as "correct" if it lies within a certain range of the GT keypoint. I cannot find the exact range definition in the paper.

    Thanks in advance.

    opened by xingzhehe 0
  • Using DatasetGAN under EditGAN

    Is it possible to use only the DatasetGAN part in the new version under EditGAN? I want to generate an annotated dataset based on StyleGAN2 but without the EditGAN app. Is that still possible?

    opened by nadavpo 0
  • RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x18 and 512x1024)

    Thank you for your code. I got this error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x18 and 512x1024)

    To fix it, I added a transpose on the latent variable before latent_to_image in make_training_data.py.

    opened by nadavpo 0
  • Does RTX3090 support training and inference of this network?

    The authors indicated that their graphics card was a Tesla V100, which is a relatively demanding requirement for a typical lab. Have the authors run training or inference on lower-end graphics cards, such as RTX 30-series or RTX 20-series cards?

    opened by KevinBanksB 1
  • Affine Layers extracted from StyleGAN1

    Thank you for open-sourcing this project! We are trying to implement the model and to better understand the affine_layers variable used to generate the synthetic images. What are these layers taken from the pre-trained StyleGAN1 (SG1)? Are they specific hidden layers? Also, if we were to swap SG1 for our own model, would the corresponding affine_layers be sufficient along with the randomly generated latent variable?

    Thank you for your time, we appreciate it!

    opened by SarthakJShetty 1
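Several comments above ask how the average_latent .npy files referenced in the experiment configs are produced; the repo does not document this. The sketch below follows the usual StyleGAN recipe of averaging many mapped latents. It is an assumption, not the authors' procedure: it presumes the converted generator exposes a g_mapping module, as common StyleGAN1 PyTorch ports do, and load_converted_generator is a hypothetical loader for the .for_g_all.pt checkpoint.

import numpy as np
import torch

# Assumption, not the authors' procedure: estimate the average W-space latent
# by mapping many random z vectors through the mapping network and averaging.
g_all = load_converted_generator()  # hypothetical loader for the .for_g_all.pt file
g_all.eval().cuda()

with torch.no_grad():
    z = torch.randn(10000, 512).cuda()        # StyleGAN1 uses 512-dim z latents
    w = g_all.g_mapping(z)                    # shape depends on the port, e.g. (N, 18, 512)
    avg_latent = w.mean(dim=0, keepdim=True)  # average over the N samples

np.save('avg_latent_stylegan1.npy', avg_latent.cpu().numpy())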