Generative Models as a Data Source for Multiview Representation Learning

Ali

Last update: Dec 3, 2022

GenRep

Generative Models as a Data Source for Multiview Representation Learning
Ali Jahanian, Xavier Puig, Yonglong Tian, Phillip Isola

Prerequisites

Linux
Python 3
CPU or NVIDIA GPU + CUDA CuDNN

Table of Contents:

Setup
Visualizations - plotting image panels, videos, and distributions
Training - pipeline for training your encoder
Testing - pipeline for testing/transfer learning your encoder
Notebooks - some Jupyter notebooks, a good place to start when trying your own dataset generations
Colab Demo - a Colab notebook demonstrating how the contrastive encoder training works

Setup

Clone this repo:

git clone https://github.com/ali-design/GenRep

Install dependencies:
- we provide a Conda environment.yml file listing the dependencies. You can create a Conda environment with the dependencies using:

conda env create -f environment.yml

Download resources:
- we provide a script for downloading associated resources. Fetch these by running:

bash resources/download_resources.sh

Visualizations

Plotting contrasting images:

Run simclr_views_paper_figure.ipynb and supcon_views_paper_figure.ipynb to get the anchors and their contrastive pairs shown in the paper.
To generate more images run biggan_generate_samples_paper_figure.py.
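
As a rough illustration of how an anchor and its contrastive pair can come from a generative model's latent space, the sketch below samples an anchor latent vector and a nearby view by adding small Gaussian noise. It is a minimal sketch, not the repository's code: the latent dimension, the noise scale, and the function names are assumptions, and the generator call that turns latents into images is left out.

```python
import random

def sample_anchor(dim, rng):
    """Draw an anchor latent vector z ~ N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def gaussian_view(z, std, rng):
    """Return a nearby latent z + std * eps, with eps ~ N(0, I)."""
    return [zi + std * rng.gauss(0.0, 1.0) for zi in z]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
z_anchor = sample_anchor(dim=128, rng=rng)              # hypothetical latent size
z_positive = gaussian_view(z_anchor, std=0.2, rng=rng)  # hypothetical noise scale

# Both latents would be passed through the same generator; the two resulting
# images are then treated as a positive pair during contrastive training.
```

A smaller std keeps the two views closer together in latent space, while a larger one makes the pair more diverse.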

Training encoders

The current implementation covers these variants:
- Contrastive (SimCLR and SupCon)
- Inverters
- Classifiers
Some examples of commands for training contrastive encoders:

# train a SimCLR encoder on an unconditional IGM dataset (e.g. your dataset is generated by a Gaussian walk, called my_gauss, in a GAN model)
CUDA_VISIBLE_DEVICES=0,1 python main_unified.py --method SimCLR --cosine \
	--dataset path_to_your_dataset --walk_method my_gauss \
	--cache_folder your_ckpts_path >> log_train_simclr.txt &

# train a SupCon encoder on a conditional IGM dataset (e.g. your dataset is generated by steering walks, called my_steer, in a GAN model)
CUDA_VISIBLE_DEVICES=0,1 python main_unified.py --method SupCon --cosine \
	--dataset path_to_your_dataset --walk_method my_steer \
	--cache_folder your_ckpts_path >> log_train_supcon.txt &

To find out more about the training configurations, see the yml file of each pretrained model in models_pretrained.
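
For readers unfamiliar with the contrastive objective, the following is a minimal pure-Python sketch of the NT-Xent loss used by SimCLR. It is an illustration under stated assumptions, not the loss implementation in this repository: the function names, the temperature value, and the convention that rows 2k and 2k+1 hold the two views of sample k are all choices made for this example.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def nt_xent(embeddings, temperature=0.1):
    """Mean NT-Xent loss; rows 2k and 2k+1 are the two views of sample k."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total = 0.0
    for i in range(n):
        pos = i ^ 1  # index of the paired view (0<->1, 2<->3, ...)
        logits = [sum(a * b for a, b in zip(z[i], z[k])) / temperature
                  for k in range(n) if k != i]
        pos_logit = sum(a * b for a, b in zip(z[i], z[pos])) / temperature
        # numerically stable log-sum-exp over every other embedding
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - pos_logit
    return total / n

# Views of the same sample sit next to each other and point the same way.
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Same vectors, but paired so that the "positives" are dissimilar.
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
loss_aligned = nt_xent(aligned)
loss_mixed = nt_xent(mixed)
```

The loss is low when each embedding is closest to its own pair and high when the pairs are mismatched, which is the signal the encoder is trained on.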

Testing encoders

You can currently test (i.e. transfer learn) your encoder on:
- ImageNet linear classification
- PASCAL classification
- PASCAL detection

ImageNet linear classification

Below is the command to train a linear classifier on top of the learned features:

# test your unconditional or conditional IGM trained model (i.e. the encoder you trained in the previous section) on ImageNet
CUDA_VISIBLE_DEVICES=0,1 python main_linear.py --learning_rate 0.3 \
	--ckpt path_to_your_encoder --data_folder path_to_imagenet \
	>> log_test_your_model_name.txt &
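
If you prefer to launch the linear evaluation from a Python script, for example to loop over several checkpoints, the command above can be assembled programmatically. This is only a sketch: the helper name `linear_eval_command` is hypothetical, and it uses nothing beyond the three flags shown in the command above.

```python
import shlex

def linear_eval_command(ckpt, data_folder, learning_rate=0.3):
    """Build the argument list for main_linear.py from the flags shown above."""
    return [
        "python", "main_linear.py",
        "--learning_rate", str(learning_rate),
        "--ckpt", ckpt,
        "--data_folder", data_folder,
    ]

cmd = linear_eval_command("path_to_your_encoder", "path_to_imagenet")
command_line = shlex.join(cmd)  # single shell-safe string, handy for logging

# To actually run it, one option is subprocess.run(cmd, env=..., stdout=log_file)
# from the repository root, with CUDA_VISIBLE_DEVICES set in env.
```

Passing the arguments as a list avoids shell quoting problems when checkpoint paths contain spaces.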

Pascal VOC2007 classification

To test classification on PascalVOC, you will extract features from a pretrained model and run an SVM on top of the features. You can do that by running the following code:

cd transfer_classification
./run_svm_voc.sh 0 path_to_your_encoder name_experiment path_to_pascal_voc

The code is based on the FAIR Self-Supervision Benchmark.

Pascal VOC2007 detection

To test transfer in detection experiments do the following:

Enter the transfer_detection directory.
Install detectron2, replacing the detectron2 folder.
Convert the checkpoint path_to_your_encoder to detectron2 format:

python convert_ckpt.py path_to_your_encoder output_ckpt.pth

Add symlinks to PascalVOC07 and PascalVOC12 in the datasets folder.
Train the detection model:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train_net.py \
      --num-gpus 8 \
      --config-file config/pascal_voc_R_50_C4_transfer.yaml \
      MODEL.WEIGHTS ckpts/${name}.pth \
      OUTPUT_DIR outputs/${name}

Notebooks

We provide some example Jupyter notebooks illustrating the full training pipeline. See notebooks.
If using the provided conda environment, you'll need to add it to the Jupyter kernels:

source activate genrep_env
python -m ipykernel install --user --name genrep_env

Colab

You can find a google colab notebook implementation here.

Acknowledgements

We thank the authors of these repositories:

Citation

If you use this code for your research, please cite our paper:

@article{jahanian2021generative, 
	title={Generative Models as a Data Source for Multiview Representation Learning}, 
	author={Jahanian, Ali and Puig, Xavier and Tian, Yonglong and Isola, Phillip}, 
	journal={arXiv preprint arXiv:2106.05258}, 
	year={2021} 
}

Comments

Why the image size is not 128?

Hi, the image size in main_unified is multiplied by 0.85. The paper says the training image size is 128 while the actual image size in this code is 128 * 0.85. Can you explain why?

opened by xiao7199 4
A question about the real data scale in experiments

Thanks for your great work. I have a small question about the real data scale in the comparative experiment in the paper. In the experiment setting, 'the real data encoders are trained on the ImageNet1000 dataset', the backbone is ResNet-50 with the SimCLR data augmentation, and the baseline result in Table 1 is 43.90. Is that because the experiment does not use all the training data in ImageNet1000? I didn't check the specific data, but training a 1000-class classifier with only ResNet-50 as the backbone should not give such a low accuracy. Or am I missing some important setting in the paper?

opened by wwq111111 1
Data Normalization for BigBiGAN

Hi, Please check if the normalization is correct for the BigBiGAN encoder. Based on the description of (https://github.com/ali-design/GenRep/blob/master/utils/utils_bigbigan.py#L115), the input data for BigBiGAN should be normalized between [-1,1]. But, at L144-145 of main_linear_bigbigan_encoder.py, you use ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) instead of ((0.5,0.5,0.5), (0.5,0.5,0.5)) which maps images to [-1,1]. I tried your default option for main_linear_bigbigan_encoder.py on ImageNet100 , and it gives 54 Top1 ACC, close to 55.7, as reported in Sec 4.2.
However, when I use ((0.5,0.5,0.5), (0.5,0.5,0.5)), it gives me over 72 Top1 ACC.

opened by xiao7199 0
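
The arithmetic behind the normalization question above can be checked directly. Per-channel normalization computes (x - mean) / std for pixel values x in [0, 1]. The sketch below compares the two settings quoted in the issue; the helper name is hypothetical, and the example only illustrates the resulting value ranges. It makes no claim about which setting the BigBiGAN encoder expects.

```python
def normalized_range(mean, std):
    """Output range of (x - mean) / std for x in [0, 1], per channel."""
    return [((0.0 - m) / s, (1.0 - m) / s) for m, s in zip(mean, std)]

half = normalized_range((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
imagenet = normalized_range((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))

for name, ranges in (("0.5 / 0.5", half), ("ImageNet stats", imagenet)):
    print(name, [(round(lo, 3), round(hi, 3)) for lo, hi in ranges])
```

With a mean and std of 0.5 every channel lands exactly in [-1, 1], whereas the ImageNet statistics produce ranges that extend beyond [-1, 1] on both sides.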
Questions about path_to_your_dataset
To reproduce your work, I followed these steps:

Install dependencies.

Use StyleGAN with your script in the "utils" directory to generate a dataset; call the path of the generated dataset "A".

In "Training encoders" there are two commands; taking the first one as an example, I replaced path_to_your_dataset with "A" and ran the command.

Testing.

Are these steps correct? Please point out any misunderstanding. Thanks a lot.
opened by ShaobinChen-AH 0
