Homepage of paper: Paint Transformer: Feed Forward Neural Painting with Stroke Prediction, ICCV 2021.

Overview

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

[Paper] [PaddlePaddle Implementation]

Songhua Liu*, Tianwei Lin*, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang (* indicates equal contribution)

ICCV 2021 (Oral)

[Images: Input | Animated | Output]

App

Citation

  • If you find these ideas useful for your research, please consider citing:

    @inproceedings{liu2021paint,
      title={Paint Transformer: Feed Forward Neural Painting with Stroke Prediction},
      author={Liu, Songhua and Lin, Tianwei and He, Dongliang and Li, Fu and Deng, Ruifeng and Li, Xin and Ding, Errui and Wang, Hao},
      booktitle={Proceedings of the IEEE International Conference on Computer Vision},
      year={2021}
    }
    
Comments
  • Why does grid need to be divided by 2 and have 0.25 added?

    From the code, grid appears to hold normalized position information in the range 0-1. How should dividing it by 2 and then adding 0.25 be understood? The regressed width and height are also divided by 2.

    https://github.com/Huage001/PaintTransformer/blob/7960b8eef2f6af7171af01933a3d8dbc73190a8d/inference/inference.py#L476
    https://github.com/Huage001/PaintTransformer/blob/7960b8eef2f6af7171af01933a3d8dbc73190a8d/inference/inference.py#L477

    opened by junedgar 6
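
As a quick check of the arithmetic asked about above, the sketch below applies the same affine map to a normalized coordinate. It only shows that x / 2 + 0.25 sends the range [0, 1] to [0.25, 0.75], the central half of a unit interval, and that lengths shrink by the same factor of 1/2. The function names are hypothetical, and the snippet does not claim to reproduce the authors' reasoning.

```python
def remap_position(x: float) -> float:
    """Map a normalized coordinate in [0, 1] to [0.25, 0.75]."""
    return x / 2 + 0.25


def remap_size(s: float) -> float:
    """Scale a normalized width or height by the same factor of 1/2."""
    return s / 2


# The endpoints and midpoint of the unit interval after remapping.
for x in (0.0, 0.5, 1.0):
    print(x, "->", remap_position(x))
```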
  • Inference seems to take way more memory than expected

    Hi, I wanted to play with the project, so I cloned it locally and ran inference, but it does not fit in GPU memory. It should be fine, since the model asks for about 3.5 GB and I have an RTX 2070 with 8 GB of memory, all of it free.

    python inference.py
    Traceback (most recent call last):
      File "inference.py", line 490, in <module>
        main(input_path='input/chicago.jpg',
      File "inference.py", line 445, in main
        final_result = param2img_parallel(param, decision, meta_brushes, final_result)
      File "inference.py", line 263, in param2img_parallel
        valid_foregrounds, valid_alphas = param2stroke(param[decision, :], patch_size_y, patch_size_x, meta_brushes)
      File "inference.py", line 74, in param2stroke
        foreground = morphology.dilation(foreground)
      File "/home/jirka/Workingspace/PaintTransformer/inference/morphology.py", line 49, in dilation
        channel = nn.functional.unfold(x_pad, 2 * m + 1, padding=0, stride=1).view(b, c, -1, h, w)
      File "/home/jirka/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 4472, in unfold
        return torch._C._nn.im2col(input, _pair(kernel_size), _pair(dilation), _pair(padding), _pair(stride))
    RuntimeError: CUDA out of memory. Tried to allocate 3.38 GiB (GPU 0; 7.80 GiB total capacity; 3.39 GiB already allocated; 3.25 GiB free; 3.41 GiB reserved in total by PyTorch)
    

    It seems to need twice that amount, so could CPU inference be allowed as well?

    opened by Borda 3
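
The traceback points at the nn.functional.unfold call in morphology.py. unfold (im2col) materializes every k x k window, so for a padded input that yields h * w windows its output holds b * c * k * k * h * w elements, roughly k * k times the input. The back-of-the-envelope estimate below assumes float32 tensors (4 bytes per element). The function name and the example shape are hypothetical, not values taken from the repository.

```python
def unfold_bytes(b: int, c: int, h: int, w: int, m: int,
                 bytes_per_elem: int = 4) -> int:
    """Estimate the size of unfold's output for a (2m+1) x (2m+1) window.

    Assumes the input was padded so that there are exactly h * w windows,
    matching the .view(b, c, -1, h, w) seen in the traceback.
    """
    k = 2 * m + 1
    return b * c * k * k * h * w * bytes_per_elem


# Hypothetical shape: 1000 strokes, 3 channels, 128 x 128 patches, m = 1.
size = unfold_bytes(b=1000, c=3, h=128, w=128, m=1)
print(f"{size / 2**30:.2f} GiB")
```

Since the estimate is linear in b, one possible mitigation is to process the strokes in smaller chunks along the batch dimension, which lowers the peak allocation proportionally.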
  • I don't have access to pre-trained models stored in Google Drive

    Thank you for your wonderful work.

    I don't have access to the Google Drive folder. Will the pretrained model be published later, or do I need to request access?

    opened by Kazuhito00 2
  • Define the script arguments from the console

    Firstly, I am very impressed with your project. Great work!

    Current behavior

    Currently, the user has to open the inference.py file and manually change the location of the input file and other necessary parameters.

    Expected behavior

    The user can provide the script arguments from the console and does not need to edit the inference.py file.

    For example:

    python inference.py serial=True
    

    I am happy to contribute and add this change if you see value in it. :)

    opened by SkalskiP 1
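
The request above could be met with the standard-library argparse module. The following is a minimal sketch, not the repository's actual interface: --input_path mirrors the input_path keyword visible in the main(...) call of inference.py, while the --serial flag and the build_parser helper are hypothetical names chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser for a hypothetical inference entry point."""
    parser = argparse.ArgumentParser(description="Run Paint Transformer inference.")
    # Mirrors the input_path keyword passed to main(...) in inference.py.
    parser.add_argument("--input_path", default="input/chicago.jpg",
                        help="path of the image to paint")
    # Hypothetical switch standing in for the serial=True example above.
    parser.add_argument("--serial", action="store_true",
                        help="enable serial inference")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = build_parser().parse_args(["--input_path", "input/chicago.jpg", "--serial"])
print(args.input_path, args.serial)
```

With argparse the call would read `python inference.py --serial` rather than `python inference.py serial=True`, which is the conventional flag syntax.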
  • Python requirements

    Firstly, I am very impressed with your project. Great work!

    Current behavior

    Currently, a potential user of the repository has to install each necessary library separately.

    Expected behavior

    The required libraries would be listed in a requirements.txt file, so that the necessary Python environment could be installed with a single command:

    pip install -r requirements.txt
    

    I am happy to contribute and add this change if you see value in it. :)

    opened by SkalskiP 1
  • How is the stylized painting generated?

    Hi, great work! I am wondering how the stylized paintings are generated. The paper simply says to "utilize existing style transfer methods". Does that mean you first stylize the original picture into a "stylized picture" and then convert it to a painting with your method, or does it have something to do with your stylized renderer?

    If it is the former, I suggest putting the stylized picture alongside the final result to avoid confusion. If it is the latter, could you share more details?

    Thanks,

    opened by qianyizhang 1
  • A possible bug in inference.py

    Line 473 of inference.py reads:

    decision = stroke_decision.view(1, h, w, stroke_num).contiguous().bool()
    

    However, this float-to-bool conversion seems to behave incorrectly: it almost always yields all-True decisions. (I could not find documentation that specifies the conversion rule in PyTorch, but in other languages non-zero floats are treated as true.) Consequently, this causes so-called lattice artifacts in the final result.

    [image]

    The following code fixes this issue. :hugs:

    decision = stroke_decision.view(1, h, w, stroke_num).contiguous() > 0
    
    [image]
    opened by CWHer 0
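
The difference between the two conversions can be illustrated without PyTorch. Casting a float tensor with .bool() marks every non-zero element as True, the same rule Python applies to a single float, whereas > 0 keeps only the positive elements. The sketch below mimics both rules on plain Python floats; the sample scores are made up for illustration and are not taken from the model.

```python
# Made-up decision scores; real values would come from the network.
scores = [-1.5, 0.7, -0.2, 0.0]

# Non-zero rule, as applied by a float-to-bool cast.
cast_to_bool = [bool(s) for s in scores]

# Sign rule, as used in the proposed fix.
thresholded = [s > 0 for s in scores]

print(cast_to_bool)   # [True, True, True, False]
print(thresholded)    # [False, True, False, False]
```

Under the non-zero rule, negative scores count as accepted strokes, which is consistent with the "almost always all true" behavior described in the report.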
Stroke-predictions-ml-model - Machine learning model to predict an individual's chances of having a stroke

stroke-predictions-ml-model machine learning model to predict individuals chance

Alex Volchek 1 Jan 3, 2022
Official code release for ICCV 2021 paper SNARF: Differentiable Forward Skinning for Animating Non-rigid Neural Implicit Shapes.

null 235 Dec 26, 2022
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt

Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt. This is done by

Mehdi Cherti 135 Dec 30, 2022
Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Zhengxia Zou 1.5k Dec 28, 2022
Official PaddlePaddle implementation of Paint Transformer

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction [Paper] [Paddle Implementation] Update We have optimized the serial inference p

TianweiLin 284 Dec 31, 2022
The official homepage of the COCO-Stuff dataset.

The COCO-Stuff dataset Holger Caesar, Jasper Uijlings, Vittorio Ferrari Welcome to official homepage of the COCO-Stuff [1] dataset. COCO-Stuff augment

Holger Caesar 715 Dec 31, 2022
The official homepage of the (outdated) COCO-Stuff 10K dataset.

COCO-Stuff 10K dataset v1.1 (outdated) Holger Caesar, Jasper Uijlings, Vittorio Ferrari Overview Welcome to official homepage of the COCO-Stuff [1] da

Holger Caesar 263 Dec 11, 2022
The official code of Anisotropic Stroke Control for Multiple Artists Style Transfer

ASMA-GAN Anisotropic Stroke Control for Multiple Artists Style Transfer Proceedings of the 28th ACM International Conference on Multimedia The officia

Six_God 146 Nov 21, 2022
Joint parameterization and fitting of stroke clusters

StrokeStrip: Joint Parameterization and Fitting of Stroke Clusters Dave Pagurek van Mossel1, Chenxi Liu1, Nicholas Vining1,2, Mikhail Bessmeltsev3, Al

Dave Pagurek 44 Dec 1, 2022
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI 2022)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
Painting app using Python machine learning and vision technology.

AI Painting App We are making an app that will track our hand and helps us to draw from that. We will be using the advance knowledge of Machine Learni

Badsha Laskar 3 Oct 3, 2022
VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

VSR-Transformer By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool This paper proposes a new Transformer for video super-resolution (called VSR-Transf

Jiezhang Cao 225 Nov 13, 2022
Pytorch implementation of "Forward Thinking: Building and Training Neural Networks One Layer at a Time"

forward-thinking-pytorch Pytorch implementation of Forward Thinking: Building and Training Neural Networks One Layer at a Time Requirements Python 2.7

Kim Heecheol 65 Oct 6, 2022
Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Vision Transformer with Progressive Sampling This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

yuexy 123 Jan 1, 2023
Code release for ICCV 2021 paper "Anticipative Video Transformer"

Anticipative Video Transformer Ranked first in the Action Anticipation task of the CVPR 2021 EPIC-Kitchens Challenge! (entry: AVT-FB-UT) [project page

Facebook Research 123 Dec 13, 2022
Code for the ICCV 2021 Workshop paper: A Unified Efficient Pyramid Transformer for Semantic Segmentation.

Unified-EPT Code for the ICCV 2021 Workshop paper: A Unified Efficient Pyramid Transformer for Semantic Segmentation. Installation Linux, CUDA>=10.0,

null 29 Aug 23, 2022
Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

ACTOR Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021. Please visit our we

Mathis Petrovich 248 Dec 23, 2022
Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021).

STAR-pytorch Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021). CVF (pdf) STAR-DC

null 43 Dec 21, 2022