Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)

Overview

StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (ICCV 2021 Oral)

Run this model on Replicate

Optimization: Open In Colab Global directions: Open In Colab Mapper: Open In Colab

Check our full demo video here:

StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik*, Zongze Wu*, Eli Shechtman, Daniel Cohen-Or, Dani Lischinski
*Equal contribution, ordered alphabetically
https://arxiv.org/abs/2103.17249

Abstract: Inspired by the ability of StyleGAN to generate highly realistic images in a variety of domains, much recent work has focused on understanding how to use the latent spaces of StyleGAN to manipulate generated and real images. However, discovering semantically meaningful latent manipulations typically involves painstaking human examination of the many degrees of freedom, or an annotated collection of images for each desired manipulation. In this work, we explore leveraging the power of recently introduced Contrastive Language-Image Pre-training (CLIP) models in order to develop a text-based interface for StyleGAN image manipulation that does not require such manual effort. We first introduce an optimization scheme that utilizes a CLIP-based loss to modify an input latent vector in response to a user-provided text prompt. Next, we describe a latent mapper that infers a text-guided latent manipulation step for a given input image, allowing faster and more stable text-based manipulation. Finally, we present a method for mapping text prompts to input-agnostic directions in StyleGAN’s style space, enabling interactive text-driven image manipulation. Extensive results and comparisons demonstrate the effectiveness of our approaches.

Description

Official Implementation of StyleCLIP, a method to manipulate images using a driving text. Our method uses the generative power of a pretrained StyleGAN generator, and the visual-language power of CLIP. In the paper we present three methods:

  • Latent vector optimization.
  • Latent mapper, trained to manipulate latent vectors according to a specific text description.
  • Global directions in the StyleSpace.

Updates

15/8/2021 Add support for StyleSpace in optimization and latent mapper methods

6/4/2021 Add mapper training and inference (including a jupyter notebook) code

6/4/2021 Add support for custom StyleGAN2 and StyleGAN2-ada models, and also custom images

2/4/2021 Add the global directions code (a local GUI and a colab notebook)

31/3/2021 Upload paper to arxiv, and video to YouTube

14/2/2021 Initial version

Setup (for all three methods)

For all the methods described in the paper, it is required to have CLIP installed; specific requirements for each method are described in its section. To install CLIP and its dependencies, please run the following commands:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git

Editing via Latent Vector Optimization

Setup

Here, the code relies on the Rosinality PyTorch implementation of StyleGAN2. Some parts of the StyleGAN implementation were modified so that the whole implementation is native PyTorch.

In addition to the requirements mentioned before, a pretrained StyleGAN2 generator is required. The code attempts to download it automatically; alternatively, it can be downloaded manually from here.

Usage

Given a textual description, one can either edit a given image or generate a random image that best fits the description. Both operations can be done through the main.py script, or the optimization_playground.ipynb notebook (Open In Colab).

Editing

To edit an image, set --mode=edit. Editing can be performed either on a provided latent vector or on a random latent vector drawn from StyleGAN's latent space. It is recommended to adjust --l2_lambda according to the desired edit.
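For example, a minimal editing invocation might look as follows (the description and the --l2_lambda value are illustrative; consult main.py for the full argument list and defaults):

python main.py --mode=edit --description "a person with blue hair" --l2_lambda 0.008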

Generating Free-style Images

To generate a free-style image, set --mode=free_generation.
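For example (the prompt is illustrative):

python main.py --mode=free_generation --description "a smiling face with glasses"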

Editing via Latent Mapper

Here, we provide the code for the latent mapper. The mapper is trained to predict a residual for a given latent vector, according to the driving text. The code for the mapper is in mapper/.
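As described in the paper, the mapper's output is a residual that is added to the input latent vector, so the edited latent is

w_edit = w + M_t(w)

where M_t denotes the mapper trained for the driving text t.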

Setup

As in the optimization, the code relies on the Rosinality PyTorch implementation of StyleGAN2. In addition to the StyleGAN weights, it is necessary to have weights for the facial recognition network used in the ID loss. The weights can be downloaded from here.

The mapper is trained on latent vectors, and it is recommended to train on latent vectors of inverted real images. To this end, we provide the CelebA-HQ dataset inverted by e4e: train set, test set.

Usage

Training

  • The main training script is placed in mapper/scripts/train.py.
  • Training arguments can be found at mapper/options/train_options.py.
  • Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs. Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs.
  • To resume a training, please provide --checkpoint_path.
  • --description is where you provide the driving text.
  • If you perform an edit that is not supposed to change "colors" in the image, it is recommended to use the flag --no_fine_mapper.

Example for training a mapper for the mohawk hairstyle:

cd mapper
python train.py --exp_dir ../results/mohawk_hairstyle --no_fine_mapper --description "mohawk hairstyle"

All configurations used for the examples shown in the paper are provided in the training options (mapper/options/train_options.py).

Inference

  • The main inference script is placed in mapper/scripts/inference.py.
  • Inference arguments can be found at mapper/options/test_options.py.
  • Adding the flag --couple_outputs will save an image containing the input and output images side-by-side.

Pretrained models for various edits are provided. Please refer to utils.py for the complete list of links.
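For example, a sketch of an inference run with a downloaded pretrained mapper (the paths below are illustrative, and additional arguments, such as the path to the test latents, are listed in mapper/options/test_options.py):

python mapper/scripts/inference.py --exp_dir results/mohawk_inference --checkpoint_path pretrained/mohawk.pt --couple_outputs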

We also provide a notebook for performing inference with the mapper (Mapper notebook: Open In Colab).

Editing via Global Direction

Here we provide a GUI for editing images with the global directions. We provide both a Jupyter notebook (Open In Colab) and the GUI used in the video. For both, the linear directions are computed in real time. The code is located at global_directions/.

Setup

Here, we rely on the official TensorFlow implementation of StyleGAN2.

It is required to have TensorFlow, version 1.14 or 1.15 (conda install -c anaconda tensorflow-gpu==1.14).

Usage

Local GUI

To start the local GUI please run the following commands:

cd global_directions

# input dataset name
dataset_name='ffhq'

# a pretrained StyleGAN2 model from the standard NVlabs implementation (https://github.com/NVlabs/stylegan2) will be downloaded automatically.
# a pretrained StyleGAN2-ada model can be downloaded from https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada/pretrained/ .
# for a custom StyleGAN2 or StyleGAN2-ada model, please place the model under the ./StyleCLIP/global_directions/model/ folder.


# prepare the input data
python GetCode.py --dataset_name $dataset_name --code_type 'w'
python GetCode.py --dataset_name $dataset_name --code_type 's'
python GetCode.py --dataset_name $dataset_name --code_type 's_mean_std'

# preprocess (this may take a few hours).
# we precompute the results for StyleGAN2 on ffhq, and StyleGAN2-ada on afhqdog and afhqcat. For these models, the preprocess step can be skipped.
python SingleChannel.py --dataset_name $dataset_name

# generate the images to be manipulated
# this operation will generate and replace the w_plus.npy and .jpg images in the './data/dataset_name/' folder.
# if you want to keep the original data, please rename the original folder.
# to use custom images, please use the e4e encoder to generate latents.pt, place it in the './data/dataset_name/' folder, and add the --real flag when running this script.
# you may skip this step if you want to manipulate the real human faces we prepare in the ./data/ffhq/ folder.
python GetGUIData.py --dataset_name $dataset_name

# interactive manipulation
python PlayInteractively.py --dataset_name $dataset_name

As shown in the video, editing an image requires writing a neutral text and a target text. To operate the GUI, please do the following:

  • Maximize the window size
  • Double click on the left square to choose an image. The images are taken from global_directions/data/ffhq, and the corresponding latent vectors are in global_directions/data/ffhq/w_plus.npy.
  • Type a neutral text, then press enter
  • Modify the target text so that it will contain the target edit, then press enter.

You can now play with:

  • Manipulation strength - positive values correspond to moving along the target direction.
  • Disentanglement threshold - a large value yields a more disentangled edit: only a few channels are manipulated, so only the target attribute changes (for example, grey hair). A small value yields a less disentangled edit: a large number of channels are manipulated, so related attributes (such as wrinkles, skin color, glasses) change as well.
Examples:

Edit          Neutral Text     Target Text
Smile         face             smiling face
Gender        female face      male face
Blonde hair   face with hair   face with blonde hair
Hi-top fade   face with hair   face with Hi-top fade hair
Blue eyes     face with eyes   face with blue eyes

More examples can be found in the video and in the paper.

Practice Tips:

In the terminal, for every manipulation, the number of channels being manipulated is printed (this number is determined by the attribute (neutral, target) and the disentanglement threshold).

  1. For a color transformation, 10-20 channels are usually enough. For a large structural change (for example, Hi-top fade), 100-200 channels are usually required.
  2. For an attribute (neutral, target), if you set a low disentanglement threshold and only a few channels (<20) are manipulated, this is usually not enough to perform the desired edit.

Notebook

Open the notebook in colab and run all the cells. In the last cell you can play with the image.

beta corresponds to the disentanglement threshold, and alpha to the manipulation strength.
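Roughly, in the paper's notation, the target text defines a CLIP-space direction Δt, each style channel c has an associated CLIP-space direction Δi_c, and the edit keeps only the channels that are sufficiently relevant to the text:

Δs_c = Δi_c · Δt   if |Δi_c · Δt| ≥ β, otherwise 0
s_edit = s + α · Δs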

After you set the desired parameters, please run the last cell again to generate the image.

Editing Examples

In the following, we show some results obtained with our methods. All images are real, and were inverted into StyleGAN's latent space using e4e. The driving text used for each edit appears below or above each image.

Latent Optimization

Latent Mapper

Global Directions

Related Works

The global directions we find for editing are directions in the S space, which was introduced and analyzed in StyleSpace (Wu et al.).

To edit real images, we invert them into StyleGAN's latent space using e4e (Tov et al.).

The code structure of the mapper is heavily based on pSp.

Citation

If you use this code for your research, please cite our paper:

@misc{patashnik2021styleclip,
      title={StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery}, 
      author={Or Patashnik and Zongze Wu and Eli Shechtman and Daniel Cohen-Or and Dani Lischinski},
      year={2021},
      eprint={2103.17249},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Comments
  • Difficulty producing convincing results.

    I got this result with:

    python3 main.py --description "Blue Hair" --mode=edit
    

    image

    Do I need to change something in the invocation to get better results?

    opened by stolk 8
  • [Errno 2] No such file or directory: 'stylegan2-ffhq-config-f.pt'

    error in colab

    [Errno 2] No such file or directory: 'stylegan2-ffhq-config-f.pt'

    from main import main
    from argparse import Namespace
    result = main(Namespace(**args))
    
    
    ---------------------------------------------------------------------------
    FileNotFoundError                         Traceback (most recent call last)
    <ipython-input-4-fbc1a59be2c9> in <module>()
          1 from main import main
          2 from argparse import Namespace
    ----> 3 result = main(Namespace(**args))
    
    3 frames
    /usr/local/lib/python3.6/dist-packages/torch/serialization.py in __init__(self, name, mode)
        209 class _open_file(_opener):
        210     def __init__(self, name, mode):
    --> 211         super(_open_file, self).__init__(open(name, mode))
        212 
        213     def __exit__(self, *args):
    
    FileNotFoundError: [Errno 2] No such file or directory: 'stylegan2-ffhq-config-f.pt'
    
    opened by molo32 8
  • How to invert and edit an image

    Hello,

    I looked at the codes but couldn't locate the code that inverts the image so it can be edited. You said you used e4e, but I couldn't find the relevant codes.

    Thank you.

    opened by ramtiin 6
  • CUDA 11 environment installation config?

    Hey there, been having some trouble installing on RTX 30 series (which require CUDA 11), and was hoping I could get some tips on how to get it working (either from the developers or other users who've encountered the same problem?)

    Not sure if this is being run on a CUDA 11 environment in your group at present (if so please let me know what I've missed!), I've not been able to run the global/ subdirectory code, beginning with GetCode.py on either Python 3.6 or 3.7 (while 3.8 and above are incompatible, they require TensorFlow 2.2).

    I installed via conda as the pip installed TensorFlow was built against CUDA 10, and raised errors about missing *.so.10 libraries as a result, which disappeared when using the conda-forge package.

    After getting an error that "Setting up TensorFlow plugin "fused_bias_act.cu": Failed!" I tried some advice on an NVIDIA forum post for StyleGAN2, to change line 135 of global/dnnlib/tflib/custom_ops.py to

    compile_opts += f' --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=1\''
    

    However this had no effect: there still seems to be a failure to register the GPU.

    To check whether I can use the environment I'm running

    StyleCLIP/global $ python GetCode.py --code_type "w"
    

    My environment setups (3.6 and 3.7 respectively) are as follows (after each is the error output for that environment)

    Click to show setup for Python 3.6, CUDA 11.0.221, PyTorch 1.7.1, TensorFlow 1.14.0

    conda create -n styleclip4
    conda activate styleclip4
    conda install -y "python<3.7" -c conda-forge # Python 3.6.13 (restricted by TensorFlow 1.x dependency)
    conda install -y pytorch==1.7.1 torchvision "cudatoolkit<11.2" -c pytorch
    # PyPi tensorflow-gpu package is built for CUDA 10, incompatible with 11, use conda-forge community package
    conda install "tensorflow-gpu<2" -c conda-forge # 1.14.0
    pip install git+https://github.com/openai/CLIP.git # forces pytorch 1.7.1 install
    pip install pandas requests opencv-python matplotlib scikit-learn gdown
    gdown https://drive.google.com/u/0/uc?id=1EM87UquaoQmk17Q8d5kYIAHqu0dkYqdT&export=download
    git clone https://github.com/omertov/encoder4editing.git
    

    Gives:

    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint8 = np.dtype([("qint8", np.int8, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint16 = np.dtype([("qint16", np.int16, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint32 = np.dtype([("qint32", np.int32, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      np_resource = np.dtype([("resource", np.ubyte, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint8 = np.dtype([("qint8", np.int8, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint16 = np.dtype([("qint16", np.int16, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint32 = np.dtype([("qint32", np.int32, 1)])
    /home/louis/miniconda3/envs/styleclip4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      np_resource = np.dtype([("resource", np.ubyte, 1)])
    Setting up TensorFlow plugin "fused_bias_act.cu": Failed!
    Traceback (most recent call last):
      File "GetCode.py", line 284, in <module>
        GetCode(Gs,random_state,num_img,num_once,dataset_name)
      File "GetCode.py", line 109, in GetCode
        dlatent_avg=Gs.get_var('dlatent_avg')
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 396, in get_var
        return self.find_var(var_or_local_name).eval()
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 391, in find_var
        return self._get_vars()[var_or_local_name] if isinstance(var_or_local_name, str) else var_or_local_name
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 297, in _get_vars
        self._vars = OrderedDict(self._get_own_vars())
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 286, in _get_own_vars
        self._init_graph()
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 151, in _init_graph
        out_expr = self._build_func(*self._input_templates, **build_kwargs)
      File "<string>", line 187, in G_main
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 232, in input_shape
        return self.input_shapes[0]
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 219, in input_shapes
        self._input_shapes = [t.shape.as_list() for t in self.input_templates]
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 267, in input_templates
        self._init_graph()
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 151, in _init_graph
        out_expr = self._build_func(*self._input_templates, **build_kwargs)
      File "<string>", line 491, in G_synthesis_stylegan2
      File "<string>", line 455, in layer
      File "<string>", line 99, in modulated_conv2d_layer
      File "<string>", line 68, in apply_bias_act
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/ops/fused_bias_act.py", line 72, in fused_bias_act
        return impl_dict[impl](x=x, b=b, axis=axis, act=act, alpha=alpha, gain=gain, clamp=clamp)
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/ops/fused_bias_act.py", line 132, in _fused_bias_act_cuda
        cuda_op = _get_plugin().fused_bias_act
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/ops/fused_bias_act.py", line 18, in _get_plugin
        return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/custom_ops.py", line 139, in get_plugin
        compile_opts += f' --gpu-architecture={_get_cuda_gpu_arch_string()}'
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/custom_ops.py", line 60, in _get_cuda_gpu_arch_string
        raise RuntimeError('No GPU devices found')
    RuntimeError: No GPU devices found
    

    Click to show setup for Python 3.7, CUDA 11.0.221, PyTorch 1.7.1, TensorFlow 1.14.0

    conda create -n styleclip3
    conda activate styleclip3
    conda install -y "python<3.8" -c conda-forge # Python 3.7.10 (restricted by TensorFlow 1.x dependency)
    conda install -y pytorch==1.7.1 torchvision "cudatoolkit<11.2" -c pytorch
    # PyPi tensorflow-gpu package is built for CUDA 10, incompatible with 11, use conda-forge community package
    conda install "tensorflow-gpu<2" -c conda-forge # 1.14.0
    pip install git+https://github.com/openai/CLIP.git # forces pytorch 1.7.1 install
    pip install pandas requests opencv-python matplotlib scikit-learn gdown
    gdown https://drive.google.com/u/0/uc?id=1EM87UquaoQmk17Q8d5kYIAHqu0dkYqdT&export=download
    git clone https://github.com/omertov/encoder4editing.git
    
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint8 = np.dtype([("qint8", np.int8, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint16 = np.dtype([("qint16", np.int16, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint32 = np.dtype([("qint32", np.int32, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      np_resource = np.dtype([("resource", np.ubyte, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint8 = np.dtype([("qint8", np.int8, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint16 = np.dtype([("qint16", np.int16, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      _np_qint32 = np.dtype([("qint32", np.int32, 1)])
    /home/louis/miniconda3/envs/styleclip3/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
      np_resource = np.dtype([("resource", np.ubyte, 1)])
    Setting up TensorFlow plugin "fused_bias_act.cu": Failed!
    Traceback (most recent call last):
      File "GetCode.py", line 284, in <module>
        GetCode(Gs,random_state,num_img,num_once,dataset_name)
      File "GetCode.py", line 109, in GetCode
        dlatent_avg=Gs.get_var('dlatent_avg')
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 396, in get_var
        return self.find_var(var_or_local_name).eval()
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 391, in find_var
        return self._get_vars()[var_or_local_name] if isinstance(var_or_local_name, str) else var_or_local_name
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 297, in _get_vars
        self._vars = OrderedDict(self._get_own_vars())
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 286, in _get_own_vars
        self._init_graph()
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 151, in _init_graph
        out_expr = self._build_func(*self._input_templates, **build_kwargs)
      File "<string>", line 187, in G_main
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 232, in input_shape
        return self.input_shapes[0]
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 219, in input_shapes
        self._input_shapes = [t.shape.as_list() for t in self.input_templates]
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 267, in input_templates
        self._init_graph()
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/network.py", line 151, in _init_graph
        out_expr = self._build_func(*self._input_templates, **build_kwargs)
      File "<string>", line 491, in G_synthesis_stylegan2
      File "<string>", line 455, in layer
      File "<string>", line 99, in modulated_conv2d_layer
      File "<string>", line 68, in apply_bias_act
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/ops/fused_bias_act.py", line 72, in fused_bias_act
        return impl_dict[impl](x=x, b=b, axis=axis, act=act, alpha=alpha, gain=gain, clamp=clamp)
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/ops/fused_bias_act.py", line 132, in _fused_bias_act_cuda
        cuda_op = _get_plugin().fused_bias_act
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/ops/fused_bias_act.py", line 18, in _get_plugin
        return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/custom_ops.py", line 139, in get_plugin
        compile_opts += f' --gpu-architecture={_get_cuda_gpu_arch_string()}'
      File "/home/louis/dev/cv/StyleCLIP/global/dnnlib/tflib/custom_ops.py", line 60, in _get_cuda_gpu_arch_string
        raise RuntimeError('No GPU devices found')
    RuntimeError: No GPU devices found
    

    Requiring packages to come from the anaconda channel instead for some reason enforces TensorFlow 1.10.0, which (once you get past some interface changes) still leads to the same GPU registration problem.

    Click to show setup for Python 3.6, CUDA 11.0.221, PyTorch 1.7.1, TensorFlow 1.10.0 (anaconda channel)

    conda create -n styleclip6
    conda activate styleclip6
    conda install -y "python<3.7" -c anaconda # Python 3.6.13 (restricted by TensorFlow 1.x dependency)
    conda install -y pytorch==1.7.1 torchvision "cudatoolkit>=11,<11.2" -c pytorch
    # PyPi tensorflow-gpu package is built for CUDA 10, incompatible with 11, use conda-forge community package
    conda install "tensorflow-gpu<2" -c anaconda # 1.10.0
    pip install git+https://github.com/openai/CLIP.git # forces pytorch 1.7.1 install
    pip install pandas requests opencv-python matplotlib scikit-learn gdown
    gdown https://drive.google.com/u/0/uc?id=1EM87UquaoQmk17Q8d5kYIAHqu0dkYqdT&export=download
    git clone https://github.com/omertov/encoder4editing.git
    

    CUDA can also be downloaded from conda-forge but upon installing pytorch, the package is superseded by the higher-priority cudatoolkit package in that channel (making it equivalent to the attempt above which failed)

    I'm out of ideas so giving up at this point, please let me know if there's a solution!

    opened by lmmx 5
  • Training on my own dataset?

    Hello team, thank you for this amazing project. I'd like to train StyleCLIP on my own dataset. I have the images already. What other data do I have to prepare? Could you please let me know the detailed steps to train it?

    opened by tamnguyenvan 4
  • Load other images, but return ValueError

    I tried it with my own pictures but they all failed, returning: ValueError: operands could not be broadcast together with shapes (1,1,0) (1,1,512) (1,1,0). Are my pictures different from the demo images? My two color pictures contain a cat and a human face, both the same size, 1024x1024.

    opened by zhangjingcode 4
  • Using a user-specified image for the Global Directions Colab Notebook

    The Colab notebook for e2e creates a (1, 18, 512) latent vector which works with the Global Optimization network. However the Global Directions Notebook wants a 26-element list with 15 512-vectors, 3 256-vectors, 3 128-vectors, 3 64-vectors, and 2 32-vectors.

    How do you get a compatible vector from e2e for custom images to global directions?

    opened by minimaxir 4
  • Speeding up trainer - avenue for exploring

    I watched some youtube videos and came across pytorch lightning, which supposedly simplifies distributed training: https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html (I don't want to wait 10 hours for training to finish, mainly because of the power usage on my gpu)

    opened by johndpope 3
  • What is the meaning of fs3?

    Hi,I am following your work and thank you for your excellent paper.

    However, I have some confusion, please help me.

    1. What is the meaning of fs3 in the code, I don't know what that means or what it does.
    2. What do you estimate with 200 images, which I don't really understand in the paper.
    3. How to get delta_i or delta_ic in the paper?

    Looking forward to your responses, thank you very much!!!

    opened by sunpeng1996 3
  • Utility method to download weights #17

    Automatically downloads weights from Google Drive using 'gdown' if not found. Updates readme. The dictionary in utils.py can be updated with new filenames and matching URLs if weights are added. @orpatashnik @Skylion007

    opened by lukestanley 3
  • How to encode existing image?

    The README suggests using e2e for encoding existing face image, but this repo seems to be empty. I tried using stylegan-encoder instead, but it is based on Tensorflow and I got stuck with downloading the model with "Google Drive quota exceeded" error. Any suggestions for what to use for StyleGAN encoding?

    opened by tambetm 3
  • Your latent mapper implementation is weird and possibly wrong

    As is stated in your paper and also a common practice, the latent mapper should learn a residual of the input code,

    w_edit = w_input + M(w_input)

    However, the residual step is not implemented in your latent_mapper and styleclip_mappper. And in your trainer, you choose to ignore the forward function of styleclip_mapper and directly access styleclip_mapper.mapper, which is quite weird.

    opened by johannwyh 0
  • shape_predictor_68_face_landmarks

    what is the purpose of "shape_predictor_68_face_landmarks"? If I want to implement the dataset of LSUN (outdoor church), what kind of file that I have to choose to replace the "shape_predictor_68_face_landmarks"? Would you please release the file?

    The Image is from the "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery". This is the style that I want to try to implement. image

    opened by snow1929 0
  • PT File Format

    Hi,

    I would like to test training/testing using the provided dataset, but I found those files were provided with *.pt file format. The information that I reached on the internet told me this extension meant "Panther Project File" and it can be opened using Panther software: https://prolifics.com/panther-support/ .

    The website provides the information about free trial by registering emails that is fine. Is there any other free software available publicly?

    Best regards,

    opened by tmoriyam 0
  • How long should it take to preprocess for global direction?

    I think in the paper it is mentioned that it takes ~4 hours to preprocess the global direction for FFHQ. However, when I tried to preprocess for the same FFHQ model, it is taking ~24 hours (using a V100 GPU). I did not change any settings. Is this expected? Or am I missing something?

    opened by Mehrab-Tanjim 10