Official PyTorch Implementation of GAN-Supervised Dense Visual Alignment

Overview

Paper | Project Page | Video

Teaser image

This repo contains training, evaluation and visualization code for the GANgealing algorithm from our GAN-Supervised Dense Visual Alignment paper.

GAN-Supervised Dense Visual Alignment
William Peebles, Jun-Yan Zhu, Richard Zhang, Antonio Torralba, Alexei Efros, Eli Shechtman
UC Berkeley, Carnegie Mellon University, Adobe Research, MIT CSAIL

GAN-Supervised Learning is a framework for learning discriminative models and their GAN-generated training data jointly end-to-end. We apply our framework to the dense visual alignment problem. Inspired by the classic Congealing method, our GANgealing algorithm trains a Spatial Transformer to warp random samples from a GAN trained on unaligned data to a common, jointly-learned target mode. The target mode is updated to make the Spatial Transformer's job "as easy as possible." The Spatial Transformer is trained exclusively on GAN images and generalizes to real images at test time automatically.
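
To make the setup concrete, here is a minimal, self-contained PyTorch sketch of the GAN-Supervised congealing objective. It is an illustration only: toy modules stand in for the frozen StyleGAN2 generator and the Spatial Transformer, MSE stands in for the perceptual loss, and the latent mixing that produces per-sample targets is simplified. See train.py for the actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyGenerator(nn.Module):  # stand-in for the frozen, pre-trained GAN generator
    def __init__(self, dim=64, size=32):
        super().__init__()
        self.fc = nn.Linear(dim, 3 * size * size)
        self.size = size

    def forward(self, w):
        return self.fc(w).view(-1, 3, self.size, self.size)

class ToySTN(nn.Module):  # stand-in for the similarity/flow Spatial Transformer
    def __init__(self):
        super().__init__()
        self.theta = nn.Parameter(torch.tensor([[1., 0., 0.], [0., 1., 0.]]).unsqueeze(0))

    def forward(self, x):
        grid = F.affine_grid(self.theta.expand(x.size(0), -1, -1), list(x.size()), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

dim = 64
G, stn = ToyGenerator(dim), ToySTN()
c = nn.Parameter(torch.zeros(1, dim))  # jointly-learned target mode in latent space
opt = torch.optim.Adam(list(stn.parameters()) + [c], lr=1e-3)

for _ in range(10):
    w = torch.randn(8, dim)            # random GAN samples act as the training data
    x = G(w).detach()                  # the generator itself stays frozen
    target = G(0.5 * w + 0.5 * c)      # simplified stand-in for per-sample targets mixed with c
    loss = F.mse_loss(stn(x), target)  # warp the sample toward its target; updates both stn and c
    opt.zero_grad(); loss.backward(); opt.step()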

Watch the video

This repository contains:

  • 🎱 Pre-trained GANgealing models for eight datasets, including both the Spatial Transformers and generators
  • 💥 Training code which fully supports Distributed Data Parallel
  • 🎥 Scripts for running our Mixed Reality application with pre-trained Spatial Transformers
  • A lightning-fast CUDA implementation of splatting to generate high quality warping visualizations
  • 🎆 Several additional evaluation and visualization scripts to reproduce results from our paper and website

This code base should be mostly ready to go, but we may make a few tweaks over December 2021 to smooth out any remaining wrinkles.

Setup

First, download the repo:

git clone git@github.com:wpeebles/gangealing.git
cd gangealing

We provide an environment.yml file that can be used to create a Conda environment:

conda env create -f environment.yml
conda activate gg

If you use your own environment, we recommend using the most current version of PyTorch.

Running Pre-Trained Models

The applications directory contains several files for evaluating and visualizing pre-trained GANgealing models.

We provide several pre-trained GANgealing models: bicycle, cat, celeba, cub, dog and tvmonitor. We also have pre-trained checkpoints for our car and horse clustering models. Calling any of the files in applications with the --ckpt argument will automatically download and cache the weights. As described in our paper, we highly recommend using --iters 3 for all LSUN models (and --iters 1 for In-The-Wild CelebA and CUB) to get the most accurate results; this argument controls the number of times the similarity Spatial Transformer is recursively evaluated. Finally, the --output_resolution argument controls the size of congealed images output by the Spatial Transformer. For the highest quality results, we recommend setting this equal to --real_size (default value is 128).
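
For example, assuming you have prepared the LSUN Cats LMDB as described in the next section, a typical (illustrative) invocation looks like:

python applications/vis_correspondence.py --ckpt cat --iters 3 --real_data_path data/lsun_cats --real_size 512 --output_resolution 512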

Preparing Real Data

We use LMDBs for storing data. You can use prepare_data.py to pre-process input datasets. Note that setting up real data is not required for training.

LSUN: Download and unzip the relevant category from here (e.g., cat). You can pre-process the data with the following command:

python prepare_data.py --input_is_lmdb --path path_to_downloaded_folder --out data/lsun_cats --pad center --size 512

Image Folders: For any dataset where you have all images in a single folder, you can pre-process them with:

python prepare_data.py --path folder_of_images --out data/my_new_dataset --pad [center/border/zero] --size S

where S is the square resolution the images will be resized to.
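
For example, to build a 256x256 center-padded LMDB from a hypothetical folder named my_images:

python prepare_data.py --path my_images --out data/my_new_dataset --pad center --size 256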

SPair-71K: You can download and prepare SPair for PCK evaluation (e.g., for Cats) with:

python prepare_data.py --spair_category cat --spair_split test --out data/spair_cats_test --size 256

CUB: We closely follow the pre-processing steps used by ACSM for CUB PCK evaluation. You can download and prepare the CUB validation split with:

python prepare_data.py --cub_acsm --out data/cub_val --size 256

Congealing and Dense Correspondence Visualization

Teaser image

vis_correspondence.py produces a video depicting real images being gradually aligned with our Spatial Transformer network. It can also be used to visualize label/object propagation:

python applications/vis_correspondence.py --ckpt cat --iters 3 --real_data_path data/lsun_cats --vis_in_stages --real_size 512 --output_resolution 512 --resolution 512 --label_path assets/masks/cat_mask.png --dset_indices 1922 2363 8558 7401 9750 7432 2105 53 1946

Mixed Reality (Object Lenses)

Teaser image

mixed_reality.py applies a pre-trained Spatial Transformer per-frame to an input video. We include several objects and masks you can propagate in the assets folder.

The first step is to prepare the video dataset. If you have the video saved as an image folder (with filenames in order based on timestamp), you can run:

python prepare_data.py --path folder_of_frames --out data/my_video_dataset --pad center --size 1024

This command will pre-process the images to square with center-cropping and resize them to 1024x1024 resolution (you can use any square resolution you like). Alternatively, specify --pad border to perform border padding instead of cropping.

If your video is saved in mp4, mov, etc. format, we provide a script that will convert it into frames via FFmpeg:

./process_video.sh path_to_video

This will save a folder of frames in the data/video folder, which you can then run prepare_data.py on as described above.
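
If you would rather call FFmpeg directly, the extraction step is roughly equivalent to the following (input filename illustrative):

mkdir -p data/video
ffmpeg -i my_video.mp4 data/video/frame_%05d.png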

Now that the data is set up, we can run GANgealing on the video. For example, this will propagate a cartoon face via our LSUN Cats model:

python -m torch.distributed.launch --nproc_per_node=NUM_GPUS --master_port=6085 applications/mixed_reality.py --ckpt cat --iters 3 --objects --label_path assets/objects/cat/cat_cartoon.png --sigma 0.3 --opacity 1 --real_size 1024 --resolution 8192 --out video_materials_full/cats --real_data_path path_to_my_video --no_flip_inference

This will efficiently parallelize evaluation of the video over NUM_GPUS. If you are propagating to a long video or are running out of memory, you can add the --save_frames argument, which should use significantly less memory (at the cost of speed). The --objects argument pulls propagated RGB values from the RGBA image --label_path points to. If you omit --objects, only the alpha channel of --label_path will be used and a colorscale will be created (useful for visualizing tracking when propagating masks). For models that do not benefit much from flipping (e.g., LSUN Cats, TVs and CelebA), we recommend using the --no_flip_inference argument to disable unnecessary flipping.

Creating New Object Lenses

To propagate your own custom object, you need to create a new RGBA image saved as a png. You can take the pre-computed average congealed image for your model of interest (located in assets/averages) and load it into an image editor like Photoshop. Then, overlay your object of interest on the template and export the object as an RGBA png image. Pass your new object with the --label_path argument like above.

We recommend saving the object at a high resolution for the highest quality results (e.g., 4K resolution or higher if you are propagating to a 1K resolution video).
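
If you prefer to composite programmatically instead of using an image editor, here is a small Pillow sketch. File names are illustrative (the average congealed templates referenced above live in assets/averages), and in practice you would work at a higher resolution per the note above:

from PIL import Image

template = Image.open('assets/averages/cat.png').convert('RGBA')  # average congealed template (path illustrative)
obj = Image.open('my_object.png').convert('RGBA')                 # your custom RGBA object
obj = obj.resize(template.size)

Image.alpha_composite(template, obj).save('preview.png')          # sanity-check placement against the template
obj.save('assets/objects/cat/my_object.png')                      # pass this file via --label_path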

PCK-Transfer Evaluation

Our repo includes a fast implementation of PCK-Transfer in pck.py that supports multi-GPU evaluation. First, make sure you've setup either SPair-71K or CUB as described earlier. You can evaluate PCK-Transfer as follows:

To evaluate SPair-71K (e.g., cats category):

python -m torch.distributed.launch --nproc_per_node=NUM_GPUS --master_port=6085 applications/pck.py --ckpt cat --iters 3 --real_data_path data/spair_cats_test --real_size 256

To evaluate PCK on CUB:

python -m torch.distributed.launch --nproc_per_node=NUM_GPUS --master_port=6085 applications/pck.py --ckpt cub --real_data_path data/cub_val --real_size 256 --num_pck_pairs 10000 --transfer_both_ways

You can also add the --vis_transfer argument to save a visualization of keypoint transfers.

Note that different methods compute PCK in slightly different ways depending on the dataset. For CUB, the protocol used by past methods is to sample 10,000 random pairs from the validation set and evaluate bidirectional transfers. For SPair, fixed pairs are always used and the transfers are one-way. Our implementation of PCK supports both of these protocols to ensure accurate comparisons against baselines.
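
For reference, a minimal sketch of the PCK@alpha criterion itself (not the repository's pck.py, which handles pair sampling and both protocols): a transferred keypoint counts as correct if it lands within alpha times the maximum dimension of the reference of the ground-truth location, where the reference dimension depends on the protocol (e.g., the object bounding box for SPair).

import numpy as np

def pck(pred_kps, gt_kps, ref_h, ref_w, alpha=0.1):
    # pred_kps, gt_kps: (N, 2) arrays of keypoints in pixel coordinates
    dists = np.linalg.norm(pred_kps - gt_kps, axis=1)      # per-keypoint transfer error
    return float((dists <= alpha * max(ref_h, ref_w)).mean())

# Example: two of three transferred keypoints fall within the threshold on a 256x256 reference
print(pck(np.array([[10., 12.], [100., 98.], [200., 240.]]),
          np.array([[12., 12.], [150., 98.], [205., 242.]]), 256., 256.))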

Learned Pre-Processing of Datasets

Finally, we also include a script that applies a pre-trained Spatial Transformer to align and filter a dataset (e.g., for downstream GAN training).

To do this, you will need two versions of your dataset: (1) a pre-processed version (via prepare_data.py as described above) which will be used to quickly compute flow smoothness scores, and (2) a raw, unprocessed version of the dataset stored in LMDB format. We'll explain how to create this second unprocessed copy below.

The first recommended step is to compute flow smoothness scores for each image in the dataset. As described in our paper, these scores do a good job at identifying (1) images the Spatial Transformer fails on and (2) images that are impossible to align to the learned target mode. The scores can be computed as follows:

python -m torch.distributed.launch --nproc_per_node=NUM_GPUS --master_port=6085 applications/flow_scores.py --ckpt cat --iters 3 --real_data_path my_dataset --real_size S --no_flip_inference

where my_dataset should be created with our prepare_data.py script as described above. This will cache a tensor of flow scores at my_dataset/flow_scores.pt.

Next is the alignment step. Create an LMDB of the raw, unprocessed images in your unaligned dataset using the --pad none argument:

python prepare_data.py --path folder_of_frames --out data/new_lmdb_data --pad none --size 0

Finally, you can generate a new, aligned and filtered dataset:

python -m torch.distributed.launch --nproc_per_node=NUM_GPUS --master_port=6085 applications/congeal_dataset.py --ckpt cat --iters 3 --real_data_path data/new_lmdb_data --out data/my_new_aligned_dataset --real_size 0 --flow_scores my_dataset/flow_scores.pt --fraction_retained 0.25 --output_resolution S

where S is the desired output resolution of the dataset and the --fraction_retained argument controls the percentage of images that will be retained based on flow scores. There are some other arguments you can adjust; see the documentation in congeal_dataset.py for details.
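
As a rough illustration of what the filtering amounts to (the exact ranking logic lives in congeal_dataset.py and may differ in detail), keeping the smoothest 25% of images from the cached scores could look like:

import torch

scores = torch.load('my_dataset/flow_scores.pt')   # one smoothness score per image (cached by flow_scores.py)
fraction_retained = 0.25
k = int(fraction_retained * scores.numel())
keep = torch.argsort(scores)[:k]                    # assumption: lower score = smoother, easier-to-align image
print(f'Retaining {k} of {scores.numel()} images')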

Using Pre-Trained Clustering Models

The clustering models are usable in most places the unimodal models are (with a few current exceptions, such as flow_scores.py and congeal_dataset.py). To load the clustering models, add --num_heads K (for our pre-trained models, K=4). There are also several files that let you propagate from a chosen cluster with the --cluster <cluster_index> argument (e.g., mixed_reality.py and vis_correspondence.py). Please refer to the documentation in these files for details.
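
For example, an illustrative command for propagating from cluster 1 of the four-cluster LSUN Cars model (dataset path hypothetical; remaining arguments as in the unimodal examples above):

python applications/vis_correspondence.py --ckpt car --num_heads 4 --cluster 1 --iters 3 --real_data_path data/lsun_cars --real_size 512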

Training

(We will add additional training scripts in the coming days!)

To train new GANgealing models, you will need pre-trained StyleGAN2(-ADA) generator weights from the rosinality repo. We also include generator checkpoints in all of our pre-trained GANgealing weights. Please refer to the scripts folder for examples of training commands, and see train.py for details.
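
For illustration, a typical multi-GPU LSUN Cats launch looks like the following (one representative configuration; see the scripts folder for the exact commands, and adjust the GPU count and experiment name for your setup):

python -m torch.distributed.launch --nproc_per_node=8 train.py --ckpt cat --load_G_only --padding_mode border --vis_every 5000 --ckpt_every 50000 --iter 1500000 --tv_weight 1000 --loss_fn vgg_ssl --exp-name lsun_cats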

When training a clustering model (--num_heads > 1), you will need to train a cluster classifier network to use the model on real images. This is done with train_cluster_classifier.py; see example commands in scripts.

Note that for the majority of experiments in our paper, we trained using 8 GPUs and a per-GPU batch size of 5.

Citation

If our code or models aided your research, please cite our paper:

@article{peebles2021gansupervised,
title={GAN-Supervised Dense Visual Alignment},
author={William Peebles and Jun-Yan Zhu and Richard Zhang and Antonio Torralba and Alexei Efros and Eli Shechtman},
year={2021},
journal={arXiv preprint arXiv:2112.05143},
}

Acknowledgments

We thank Tim Brooks for his antialiased sampling code and helpful discussions. We thank Tete Xiao, Ilija Radosavovic, Taesung Park, Assaf Shocher, Phillip Isola, Angjoo Kanazawa, Shubham Goel, Allan Jabri, Shubham Tulsiani and Dave Epstein for helpful discussions. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE 2146752. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. Additional funding provided by Adobe and Berkeley Deep Drive.

This repository is built on top of rosinality's excellent PyTorch re-implementation of StyleGAN2.

Comments
  • could anyone set this up on google colab?

    It would be awesome to try out this project on Google Colab.

    Also, would this kind of inference be possible on-device (say, on a MediaTek Dimensity 1200 chip)?

    opened by GeorvityLabs 12
  • model synthesizes features to fool the loss

    Hello:

    I've been using the model on my own custom dataset for a while. When I visualize the congealing process on the test set and the propagated dense tracking, I noticed:

    1. the congealing process synthesizes features (e.g., it creates the animal's head on its tail side) instead of rotating or flipping the image to achieve the correct alignment
    2. in the dense tracking, the color scale flips (e.g., head flips to tail on animals), which I think corresponds to the point above

    I read in the paper about using flow smoothness and flipping to avoid this issue, and I understand it can occur a lot. How exactly do flipping and flow smoothness help avoid this issue? What parameters can I adjust to make my model more robust? Does the model improve on this issue after 1 million epochs? For resource reasons I haven't been able to run to 1 million epochs yet, but I read in another issue that you mention it usually takes that long for the model to improve.

    I have tried the default setting script for the 1-, 2- and 4-head configurations. I also tried increasing the inject and flow_size parameters, and turning on the sample_from_full_resolution option, but haven't made much progress from these trials yet.

    Thanks in advance, and it's really appreciated that you're consistently helping out : )

    opened by petercmh01 10
  • loss is always 0 && "transformed_sample_0******.png" become one-color and only 6kb after 105000 iters

    Question one: during training, the loss is always 0. Question two: "transformed_sample_0******.png" becomes one-color and only 6 KB after 105,000 iters.

    Why did I fail to train the model? Thanks!


    I didn't change the code. The weights were downloaded automatically.

    script:

    CUDA_VISIBLE_DEVICES=7 python train.py \
    --ckpt cat --load_G_only --padding_mode border --vis_every 5000 --ckpt_every 50000 \
    --iter 1500000 --tv_weight 1000 --loss_fn vgg_ssl --exp-name debug
    

    out:

    Setting up [baseline] perceptual loss: trunk [vgg], v[0.1], spatial [off]
    Loading VGG with pretrained=False
    Loaded custom VGG weights.
    Loading model from cat
    Only G_EMA has been loaded from checkpoint. Other nets are random!
    Fitting PCA...
    Learning Rate Cycles: [149999, 187499, 262499, 412499, 712499, 1312499]
    perceptual loss: 0.1561; tv loss: 0.000000; identity loss: 0.0000; psi: 1.0000:
    
    opened by wangherr 10
  • IndexError: list index out of range

    Hi, whenever I try to run any command I get this error:

    
    No CUDA runtime is found, using CUDA_HOME='/cm/shared/apps/cuda100/10.0.130'
    Traceback (most recent call last):
      File "applications/propagate_to_images.py", line 20, in <module>
        from applications import base_eval_argparse, load_stn, determine_flips
      File "/home/user/nvidia_code/gangealing/applications/__init__.py", line 3, in <module>
        from models import get_stn, ResnetClassifier
      File "/home/user/nvidia_code/gangealing/models/__init__.py", line 3, in <module>
        from models.spatial_transformers.spatial_transformer import get_stn, ComposedSTN, SpatialTransformer
      File "/home/user/nvidia_code/gangealing/models/spatial_transformers/spatial_transformer.py", line 5, in <module>
        from models.stylegan2.networks import EqualLinear, ConvLayer, ResBlock
      File "/home/user/nvidia_code/gangealing/models/stylegan2/networks.py", line 6, in <module>
        from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d, conv2d_gradfix
      File "/home/user/nvidia_code/gangealing/models/stylegan2/op/__init__.py", line 1, in <module>
        from .fused_act import FusedLeakyReLU, fused_leaky_relu
      File "/home/user/nvidia_code/gangealing/models/stylegan2/op/fused_act.py", line 11, in <module>
        fused = load(
      File "/home/user/anaconda2/envs/gg/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1144, in load
        return _jit_compile(
      File "/home/user/anaconda2/envs/gg/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1357, in _jit_compile
        _write_ninja_file_and_build_library(
      File "/home/user/anaconda2/envs/gg/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1456, in _write_ninja_file_and_build_library
        _write_ninja_file_to_build_library(
      File "/home/user/anaconda2/envs/gg/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1857, in _write_ninja_file_to_build_library
        cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
      File "/home/user/anaconda2/envs/gg/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1626, in _get_cuda_arch_flags
        arch_list[-1] += '+PTX'
    IndexError: list index out of range
    

    The command I run is: python applications/propagate_to_images.py --ckpt cat --real_data_path data/lsun_cats --real_size 512 --dset_indices 1922 2363 8558 7401 9750 7432 2105 53 1946

    Any help would be appreciated.

    opened by tankche1 9
  • Is there anyway that I get the predicted transformation of an image?

    Hi: I'm quite new to the STN model. Just like the title says, for every image that I align using a trained STN, I want to get its predicted transformation (i.e., the predicted set of transformation matrices). Is there any way I can extract that from the STN model?

    For some reason, I'm trying to apply the predicted transformation of one image to another, different image. I saw in the forward function of the STN there are return_flow and return_wrap options that can produce a matrix, but I'm not sure which I should take.

    Thanks in advance : )

    opened by petercmh01 7
  • get dense corresponded / tracking coordination

    Hello: thanks for the awesome work and for consistently helping with the issues!

    I'm running visualization on a model that I trained on my own dataset. I was able to reproduce the dense-correspondence mixed reality on video and the propagation to individual frame images with a mask that I created myself.

    I wonder, is there any way I can get the mask position / coordinates in the image from the model after applying the dense tracking?

    thanks in advance!

    opened by petercmh01 5
  • Adding masks to the upper forehead/lower chin

    Hello!

    I was wondering if it was possible to add masks on the upper forehead/lower chin. The congealed images are cut off at both the forehead and the chin (see here). I was hoping to apply a mask on the full face including these regions.

    Thanks and very cool work!

    opened by codyreading 4
  • No module named 'fused'

    Hello, I have a problem!

    I run this command: python -m torch.distributed.launch --nproc_per_node=8 train.py --ckpt cat --load_G_only --padding_mode border --vis_every 5000 --ckpt_every 50000 --iter 1500000 --tv_weight 1000 --loss_fn vgg_ssl --exp-name lsun_cats

    [screenshots attached]

    thanks!

    opened by Dong09 4
  • about StyleGan2 ADA training process

    Hello wpeebles, I tried to train the StyleGAN net on my own dataset. I used a 2080Ti 12G GPU with batch size 4 and trained for 4 days; it returned this, and I want to know whether I simply didn't train long enough or whether training failed.

    [screenshot attached]

    opened by Dong09 3
  • loss going up

    I run the following script:

    torchrun --nproc_per_node=4 train.py \
    --ckpt cat --load_G_only --padding_mode border --vis_every 5000 --ckpt_every 50000 \
    --iter 1500000 --tv_weight 1000 --loss_fn vgg_ssl --exp-name lsun_cats --batch 10
    
    

    and found that the loss goes up and the transformed images learn almost nothing. Also, it takes 1.85 s/iter and needs 1,500,000 iters, which would cost ~220 hours. Is that normal?

    [screenshot attached]
    opened by tankche1 3
  • Single GPU training

    Did anyone manage to run training on a single GPU? I keep getting a torch.distributed.elastic.multiprocessing.errors.ChildFailedError when using one GPU to train with the following training script:

    torchrun --nproc_per_node=1 /path/train.py
    --ckpt cat --load_G_only --padding_mode border --vis_every 500 --ckpt_every 10000
    --iter 100000 --tv_weight 2500 --loss_fn lpips --exp-name lsun_cats_lpips --real_data_path /my/train_data/

    Thanks in advance!

    Edit: I tried turning off distributed mode, but a math ValueError pops up here: https://github.com/wpeebles/gangealing/blob/739da2a25de62702d54d83fad6b644646512039c/utils/annealing.py#L42

    It says ValueError: math domain error.

    opened by petercmh01 3
  • How can I use dense_correspondences in downstream tasks?

    Let's say I have a dataset which I've preprocessed with gangealing and I have saved the dense_correspondences.py in a file.

    How can I map the congealed frame pixels back to the original image using the dense_correspondences?

    opened by gessha 0
  • failed to creat process

    Error report:

    1. OS version: Windows Server 2019
    2. The env has been activated, and I have completed all the installation steps and installed the required dependencies with no error warnings.
    3. Detailed error: (gg) C:\Users\Administrator\gangealing>torchrun --nproc_per_node=1 applications/mixed_reality.py --ckpt cat --objects --label_path assets/objects/cat/cat_cartoon.png --sigma 0.3 --opacity 1 --real_size 512 --resolution 8192 --real_data_path data\video_frames\white_cat-PNG --no_flip_inferencee failed to create process.

    I don't know why this error occurred or how to fix it.

    opened by Pythonpa 0
  • Add a CPU version

    • [x] converted the splat function from C++ to Python. Able to run the unit test of splat_2d on CPU. https://github.com/Xiaoyang-Rebecca/gangealing/blob/cpu/unit_tests/test_splat.py

    • [x] added a --device "cpu" option and updated the other places calling device='cuda'

    • [ ] [TODO] Need to support StyleGAN inference on CPU, but stylegan2/op only supports CUDA. Any suggestions? @wpeebles

    nohup: ignoring input
    No CUDA runtime is found, using CUDA_HOME='/usr'
    Traceback (most recent call last):
      File "applications/propagate_to_images.py", line 20, in <module>
        from applications import base_eval_argparse, load_stn, determine_flips
      File "project-py-GANgealing/applications/__init__.py", line 3, in <module>
        from models import get_stn, ResnetClassifier
      File "project-py-GANgealing/models/__init__.py", line 3, in <module>
        from models.spatial_transformers.spatial_transformer import get_stn, ComposedSTN, SpatialTransformer
      File "project-py-GANgealing/models/spatial_transformers/spatial_transformer.py", line 5, in <module>
        from models.stylegan2.networks import EqualLinear, ConvLayer, ResBlock
      File "project-py-GANgealing/models/stylegan2/networks.py", line 6, in <module>
        from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d, conv2d_gradfix
      File "project-py-GANgealing/models/stylegan2/op/__init__.py", line 1, in <module>
        from .fused_act import FusedLeakyReLU, fused_leaky_relu
      File "project-py-GANgealing/models/stylegan2/op/fused_act.py", line 15, in <module>
        os.path.join(module_path, "fused_bias_act_kernel.cu"),
      File "python3.7/site-packages/torch/utils/cpp_extension.py", line 1156, in load
        keep_intermediates=keep_intermediates)
      File "python3.7/site-packages/torch/utils/cpp_extension.py", line 1367, in _jit_compile
        is_standalone=is_standalone)
      File "python3.7/site-packages/torch/utils/cpp_extension.py", line 1465, in _write_ninja_file_and_build_library
        is_standalone=is_standalone)
      File "python3.7/site-packages/torch/utils/cpp_extension.py", line 1857, in _write_ninja_file_to_build_library
        cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
      File "python3.7/site-packages/torch/utils/cpp_extension.py", line 1626, in _get_cuda_arch_flags
        arch_list[-1] += '+PTX'
    IndexError: list index out of range
    
    opened by Xiaoyang-Rebecca 0
  • add Gradio Web Demo to cvpr 2022 organization

    Hi, would you be interested in adding gangealing to Hugging Face as a Gradio Web Demo for CVPR 2022? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models/datasets/spaces (web demos) can be added to a user account or organization, similar to GitHub.

    For more info, see the CVPR organization on Hugging Face: https://huggingface.co/CVPR

    Here is an example Gradio Demo for the CVPR org: https://huggingface.co/spaces/CVPR/ml-talking-face

    And here is a guide for adding a web demo to the organization: https://huggingface.co/blog/gradio-spaces

    Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

    opened by AK391 2