Styled text-to-drawing synthesis method. Featured at the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design

Overview

StyleCLIPDraw

Peter Schaldenbrand, Zhixuan Liu, Jean Oh September 2021

To be featured in the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design

StyleCLIPDraw adds a style loss to the CLIPDraw (Frans et al. 2021) (code) text-to-drawing synthesis model to allow artistic control of the synthesized drawings in addition to control of the content via text. Whereas performing decoupled style transfer on a generated image only affects the texture, our proposed coupled approach is able to capture a style in both texture and shape, suggesting that the style of the drawing is coupled with the drawing process itself.

Check out our code on Colab

Method

Unlike most other image generation models, CLIPDraw produces drawings consisting of a series of Bezier curves, each defined by a list of coordinates, a color, and an opacity. The drawing begins as randomized Bezier curves on a canvas and is optimized to fit the given style and text. The StyleCLIPDraw model architecture is shown above. The brush strokes are rendered into a raster image via a differentiable model. StyleCLIPDraw has two losses, one for each input. The text input and the augmented raster drawing are fed to the CLIP model, and the two embeddings are compared using cosine distance to compute a loss that encourages the drawing to fit the text input. The image is augmented to avoid finding shallow solutions when optimizing through the CLIP model. The raster image and the style image are fed through early layers of the VGG-16 model, and the difference in extracted features forms the loss that encourages the drawing to fit the style of the style image.
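The stroke representation and the two losses described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code from the repository: the names `Stroke`, `random_stroke`, `bezier_point`, `cosine_distance`, `feature_difference`, and `total_loss` are hypothetical, the curves are assumed to be cubic, the embeddings and features are plain lists of floats standing in for CLIP and VGG-16 outputs, and a mean squared difference stands in for whatever feature comparison the actual style loss uses.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Stroke:
    """One brush stroke: Bezier control points, an RGB color, and an opacity."""
    control_points: List[Point]        # four (x, y) points, assuming cubic curves
    color: Tuple[float, float, float]  # RGB, each channel in [0, 1]
    opacity: float                     # in [0, 1]


def random_stroke(rng: random.Random, canvas_size: float = 224.0) -> Stroke:
    """Draw a randomized stroke, mirroring the random initialization of the canvas."""
    points = [(rng.uniform(0.0, canvas_size), rng.uniform(0.0, canvas_size))
              for _ in range(4)]
    color = (rng.random(), rng.random(), rng.random())
    return Stroke(points, color, rng.random())


def bezier_point(stroke: Stroke, t: float) -> Point:
    """Evaluate the cubic Bezier curve at t in [0, 1] using the Bernstein form."""
    u = 1.0 - t
    weights = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    x = sum(w * p[0] for w, p in zip(weights, stroke.control_points))
    y = sum(w * p[1] for w, p in zip(weights, stroke.control_points))
    return (x, y)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; small when the two embeddings point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def feature_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between feature vectors (assumed stand-in for the style loss)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(text_emb: Sequence[float], drawing_emb: Sequence[float],
               drawing_feats: Sequence[float], style_feats: Sequence[float],
               style_weight: float = 1.0) -> float:
    """Content loss from the text input plus a weighted style loss from the style image."""
    content = cosine_distance(text_emb, drawing_emb)
    style = feature_difference(drawing_feats, style_feats)
    return content + style_weight * style
```

In the real pipeline both terms would be computed on tensors so that gradients can flow back through the differentiable renderer to the control points, colors, and opacities. The `style_weight` parameter is an assumption added here to show how the two terms could be balanced.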

Results

StyleCLIPDraw vs. CLIPDraw then Style Transfer

Comments
  • RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDeviceFunction: invalid device function

    Hi, this research is really fantastic and exciting! Following the Colab instructions, when I run

    img = style_clip_draw('A man is watching TV', 'https://raw.githubusercontent.com/pschaldenbrand/StyleCLIPDraw/master/images/fruit.jpg',\
                              num_iter=1000, style_opt_freq=5, style_opt_iter=50)
    show_img(img)

    the runtime returns

    Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /root/.cache/torch/hub/checkpoints/vgg16-397923af.pth
    100% 528M/528M [00:05<00:00, 103MB/s]
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-6-c7386fef0808> in <module>()
    ----> 1 img = style_clip_draw('A man is watching TV', 'https://raw.githubusercontent.com/pschaldenbrand/StyleCLIPDraw/master/images/fruit.jpg', num_iter=1000, style_opt_freq=5, style_opt_iter=50)
          2 show_img(img)

    4 frames
    /usr/local/lib/python3.7/dist-packages/diffvg-0.0.1-py3.7-linux-x86_64.egg/pydiffvg/render_pytorch.py in backward(ctx, grad_img)
        707     use_prefiltering,
        708     diffvg.float_ptr(eval_positions.data_ptr()),
    --> 709     eval_positions.shape[0])
        710     time_elapsed = time.time() - start
        711     global print_timing

    RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDeviceFunction: invalid device function

    Rerunning two or three times shows the same error. Could someone familiar with this work help me? Thanks!

    opened by lizekui 5
  • AttributeError: module 'diffvg' has no attribute 'FilterType'

    Hi, thanks for sharing the project, idea, and code. I tried to run the Colab notebook and it stops at the pydiffvg import, showing the following error:

    AttributeError: module 'diffvg' has no attribute 'FilterType'

    I searched for the error and tried installing diffvg manually from GitHub, but still no luck. Do you have any idea how I could make it work? Thanks for your help!

    opened by chikiuso 4
  • Colab StyleCLIPDraw error

    Using Colab pro with a GPU 0: Tesla P100-PCIE-16GB

    Executing:

    img = style_clip_draw('A man is watching TV', 'https://raw.githubusercontent.com/pschaldenbrand/StyleCLIPDraw/master/images/fruit.jpg',\
                              num_iter=1000, style_opt_freq=5, style_opt_iter=50, debug=True) 
    show_img(img)
    

    I got this error

    /usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
        154     Variable._execution_engine.run_backward(
        155         tensors, grad_tensors_, retain_graph, create_graph, inputs,
    --> 156         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
        157 
        158 
    RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
    
    opened by mcanet 3
  • Error on colab

    Hey guys,

    Thank you for sharing the code :) I am trying to use it in colab but after the installation cell is done I got the following error:

    AttributeError: module 'diffvg' has no attribute 'FilterType'
    
    opened by FrancescoSaverioZuppichini 1
  • use compatible cuda version; yield progressive outputs

    Hi Peter! I was running into issues with Torch so I upgraded to 1.10.0 and it worked locally. However, when pushing to Replicate I needed to set CUDA==10.2 because the NVIDIA driver version on Replicate nodes is less than required by CUDA 11.3.

    opened by andreasjansson 0
Owner
Peter Schaldenbrand
Research programmer and machine learning graduate student at Carnegie Mellon University. Pitt and Central Catholic graduate. Okay at volleyball and crayons.