3D mesh stylization driven by a text input in PyTorch

Overview

Text2Mesh [Project Page]

Text2Mesh is a method for text-driven stylization of a 3D mesh, as described in "Text2Mesh: Text-Driven Neural Stylization for Meshes" (forthcoming).

Getting Started

Installation

Note: The installation below will fail if run on a machine without a CUDA GPU.

conda env create --file text2mesh.yml
conda activate text2mesh
System Requirements

- Python 3.7
- CUDA 10.2
- GPU with a minimum of 8 GB RAM
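
Before running the demos, it can help to confirm that PyTorch in the activated environment actually sees a CUDA GPU. A minimal sanity check (nothing here is specific to Text2Mesh):

# check that the text2mesh environment has a usable CUDA GPU
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA version:", torch.version.cuda)
    print("GPU:", torch.cuda.get_device_name(0))
    print("GPU memory (GB):", torch.cuda.get_device_properties(0).total_memory / 1e9)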

Run examples

Call the below shell scripts to generate example styles.

# cobblestone alien
./demo/run_alien_cobble.sh
# shoe made of cactus 
./demo/run_shoe.sh
# lamp made of brick
./demo/run_lamp.sh
# ...

The outputs will be saved to results/demo, with the stylized .obj files, colored and uncolored render views, and screenshots during training.
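
If you want to inspect a stylized mesh programmatically, something like the following works. This is only a sketch: it assumes the trimesh package is installed separately (it is not part of text2mesh.yml), and the output path below is a placeholder for whichever .obj a demo run produced.

# load a stylized output mesh for a quick look (path is a placeholder)
import trimesh

mesh = trimesh.load("results/demo/<your_run>/stylized.obj")
print("vertices:", mesh.vertices.shape, "faces:", mesh.faces.shape)
mesh.show()  # opens an interactive viewer if a display backend is available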

Outputs

Example outputs (renders omitted): alien, candle, ninja, shoe, vase, lamp, and horse, each shown as the source mesh, its stylized geometry, and the full style.

Citation

@article{text2mesh,
    author = {Michel, Oscar
              and Bar-On, Roi
              and Liu, Richard
              and Benaim, Sagie
              and Hanocka, Rana
              },
    title = {{Text2Mesh: Text-Driven Neural Stylization for Meshes}},
    journal = {TODO: ARXIV},
    year  = {2021}
}
Comments
  • Two Intro Questions


    I did not see a Discussion section, so I am asking here in Issues: (1) has anyone published a Google Colab notebook, and (2) how was the animation of the vase created?

    For (1), I can whip one up, assuming the CUDA version matches what Colab provides. Otherwise, this needs tweaking.

    For (2), it looks like you are varying the seed and/or the text prompt and then perhaps creating keyframes which are then interpolated? It is a cool effect. Another direction would be to explore 3D printing of things like the "cactus shoe".

    opened by metaphorz 1
  • How to run on GPU with cuda 11.3 or 11.2


    Hi! Do you know, maybe, how to run the code on a GPU with CUDA 11.3 or 11.2? AFAIU, one would need PyTorch 1.10 for this, but then kaolin asks for PyTorch <= 1.9.
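
    A quick diagnostic sketch for checking which combination is actually installed (it only assumes both packages import at all):

    # print installed torch / kaolin / CUDA versions to spot mismatches
    import torch
    import kaolin

    print("torch:", torch.__version__)
    print("torch built against CUDA:", torch.version.cuda)
    print("kaolin:", kaolin.__version__)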

    opened by gexahedron 1
  • Fix typo in main.py (introduced in #24)


    Running the demo scripts on a fresh clone of the repo throws an error:

    (text2mesh) cpacker@...:text2mesh$ ./demo/run_alien_cobble.sh
    Traceback (most recent call last):
      File "text2mesh/main.py", line 499, in <module>
        run_branched(args)
      File "text2mesh/main.py", line 34, in run_branched
        clip_model, preprocess = clip.load(args.clipmodel, device, jit=args.jit)
      File "anaconda3/envs/text2mesh/lib/python3.9/site-packages/clip/clip.py", line 124, in load
        raise RuntimeError(f"Model {name} not found; available models = {available_models()}")
    RuntimeError: Model VIT-B/32 not found; available models = ['RN50', 'RN101', 'RN50x4', 'RN50x16', 'RN50x64', 'ViT-B/32', 'ViT-B/16', 'ViT-L/14', 'ViT-L/14@336px']
    

    Seems like this is due to a typo here, VIT should be ViT.
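
    For reference, the corrected call looks roughly like this (just a sketch; clip.available_models() lists the valid, case-sensitive names):

    import clip
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(clip.available_models())          # includes 'ViT-B/32'
    clip_model, preprocess = clip.load("ViT-B/32", device, jit=False)  # 'ViT', not 'VIT'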

    Same command post-fix:

    (text2mesh) cpacker@...:text2mesh$ ./demo/run_alien_cobble.sh
    ModuleList(
      (0): FourierFeatureTransform()
      (1): ProgressiveEncoding()
      (2): Linear(in_features=515, out_features=256, bias=True)
      (3): ReLU()
      (4): Linear(in_features=256, out_features=256, bias=True)
      (5): ReLU()
      (6): Linear(in_features=256, out_features=256, bias=True)
      (7): ReLU()
      (8): Linear(in_features=256, out_features=256, bias=True)
      (9): ReLU()
      (10): Linear(in_features=256, out_features=256, bias=True)
      (11): ReLU()
    )
    ModuleList(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): ReLU()
      (2): Linear(in_features=256, out_features=256, bias=True)
      (3): ReLU()
      (4): Linear(in_features=256, out_features=3, bias=True)
    )
    ModuleList(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): ReLU()
      (2): Linear(in_features=256, out_features=256, bias=True)
      (3): ReLU()
      (4): Linear(in_features=256, out_features=1, bias=True)
    )
      0%|                                                                                                                                           | 0/1500 [00:00<?, ?it/s]
    
    opened by cpacker 0
  • Load different clip models, jit option


    Update the parser options to take in any CLIP model, plus a JIT option. The render image resolution is updated according to the model specifications. Note: as new CLIP models become public, for the sake of efficiency it will probably be good to just replace the render preprocessing (e.g. setting the render resolution) with the preprocess function returned when loading the CLIP model. We can just set the default render resolution to the current maximum across the models to prevent too much blurring from upsampling.
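
    A rough sketch of how the render resolution can be tied to the loaded model (visual.input_resolution is the square input size each CLIP model expects):

    import clip
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    clip_model, preprocess = clip.load("RN50x16", device, jit=False)

    # each CLIP model expects a specific square input resolution (224, 288, 384, ...)
    res = clip_model.visual.input_resolution
    render_res = max(res, 224)  # render at no less than the model's native resolution
    print("CLIP input resolution:", res, "-> render resolution:", render_res)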

    Results shown on the ninja with each model, using the same options and seed: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT-B/32, ViT-B/16, ViT-L/14, ViT-L/14@336px (screenshots omitted).

    opened by factoryofthesun 0
  • About clip and kaolin problem


    Instead of using the pip installation from the yml file, install CLIP and kaolin following the instructions on their respective homepages; just make sure that clip 1.0 and kaolin 0.12.0 appear in the conda list.

    1. Install clip 1.0:

       pip install git+https://github.com/openai/CLIP.git

    2. Install kaolin 0.12.0:

       vi ~/.bashrc
       export CUDA_HOME=/usr/local/cuda
       source ~/.bashrc
       conda activate text2mesh
       git clone --recursive https://github.com/NVIDIAGameWorks/kaolin
       cd kaolin
       git checkout v0.12.0
       python setup.py develop

    Please refer to the clip 1.0 and kaolin 0.12.0 release pages. For more details, please refer to my blog.

    opened by Doggerlas 3
  • clip and kaolin have not been imported in the conda env


    Hello,

    I get no errors when I run the conda env create --file text2mesh.yml and conda activate text2mesh commands.

    But when I execute the ./demo/run_candle.sh file, I get this error:

    Traceback (most recent call last):
      File "C:\Users\matth\text2mesh\main.py", line 3, in <module>
        import kaolin.ops.mesh
    ModuleNotFoundError: No module named 'kaolin'

    I also get the same error with the clip module.

    I am supposed to be in the env with everything perfectly installed.

    Please help me

    opened by mattmaxXXX 4
  • How do I provide an argument to use an image target rather than a text prompt?


    I removed --prompt from the settings of run_shoe.sh and set values for --no_prompt and --image, and the quality of the results was really bad. Details are as follows:

    case1

    • run_shoe.sh python main.py --run branch --obj_path data/source_meshes/shoe.obj --output_dir results/demo/shoe/texture/brick --no_prompt --image data/target_texture/brick_texture.jpg --sigma 5.0 --clamp tanh --n_normaugs 4 --n_augs 1 --normmincrop 0.1 --normmaxcrop 0.1 --geoloss --colordepth 2 --normdepth 2 --frontview --frontview_std 4 --clipavg view --lr_decay 0.9 --clamp tanh --normclamp tanh --maxcrop 1.0 --save_render --seed 11 --n_iter 1500 --learning_rate 0.0005 --normal_learning_rate 0.0005 --background 1 1 1 --frontview_center 0.5 0.6283

    • brick_texture.jpg image

    • result image

    case2

    • run_shoe.sh python main.py --run branch --obj_path data/source_meshes/shoe.obj --output_dir results/demo/shoe/texture2/cactus --no_prompt --image data/target_texture/cactus_texture.jpg --sigma 5.0 --clamp tanh --n_normaugs 4 --n_augs 1 --normmincrop 0.1 --normmaxcrop 0.1 --geoloss --colordepth 2 --normdepth 2 --frontview --frontview_std 4 --clipavg view --lr_decay 0.9 --clamp tanh --normclamp tanh --maxcrop 1.0 --save_render --seed 11 --n_iter 1500 --learning_rate 0.0005 --normal_learning_rate 0.0005 --background 1 1 1 --frontview_center 0.5 0.6283

    • cactus_texture.jpg image

    • result image

    Did I set the parameter wrong? Or is there something in main.py that needs to be modified? Or is it a randomness issue in optimization?

    When --no_prompt is set to True and --image is set to an image path string in main.py, I don't understand the loss code corresponding to 'local to global' and 'local to displacement' in the paper. Should I change this part?
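
    For what it's worth, the general form of an image-target CLIP loss (a minimal sketch of the idea, not the exact code in main.py) is to embed the rendered views and the target image with CLIP and maximize their cosine similarity:

    import clip
    import torch
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    clip_model, preprocess = clip.load("ViT-B/32", device, jit=False)

    # the target embedding is computed once, without gradients
    target = preprocess(Image.open("data/target_texture/brick_texture.jpg")).unsqueeze(0).to(device)
    with torch.no_grad():
        target_feat = clip_model.encode_image(target)
        target_feat = target_feat / target_feat.norm(dim=-1, keepdim=True)

    def clip_image_loss(rendered_views):
        # rendered_views: (B, 3, H, W) augmented renders, normalized the way CLIP expects
        feats = clip_model.encode_image(rendered_views)
        feats = feats / feats.norm(dim=-1, keepdim=True)
        return 1.0 - (feats * target_feat).sum(dim=-1).mean()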

    opened by sj978 0
  • What is the function of progressive encoding?


    Thanks for your excellent work! When I run your demo, I notice that the input points are sent to progressive encoding. If I disable this module, the final result doesn't change a lot, so why do we need this module? Here are two pictures of results (prompt: an image of a shoe made of cactus, obj: shoe); the first one is WITH progressive encoding, the second one is WITHOUT progressive encoding.

    PE

    NO_PE
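
    For context, progressive (coarse-to-fine) positional encoding usually works by gradually unmasking the higher-frequency Fourier bands as training proceeds, so fine detail is only fitted once the coarse style has settled; on some meshes disabling it changes little. A minimal sketch of the idea (not the exact ProgressiveEncoding module in this repo):

    import math
    import torch
    import torch.nn as nn

    class ProgressiveFourierEncoding(nn.Module):
        """Fourier features whose high-frequency bands fade in over training."""
        def __init__(self, num_bands=6, warmup_iters=1000):
            super().__init__()
            self.num_bands = num_bands
            self.warmup_iters = warmup_iters
            self.register_buffer("freqs", (2.0 ** torch.arange(num_bands)) * math.pi)
            self.step = 0

        def forward(self, x):  # x: (N, 3) vertex positions
            alpha = self.num_bands * min(self.step / self.warmup_iters, 1.0)
            band = torch.arange(self.num_bands, device=x.device)
            weight = (alpha - band).clamp(0.0, 1.0)                    # per-band gate in [0, 1]
            proj = x[..., None] * self.freqs                           # (N, 3, num_bands)
            feats = torch.cat([torch.sin(proj), torch.cos(proj)], -1)  # (N, 3, 2*num_bands)
            feats = feats * weight.repeat(2)                           # fade high frequencies in
            self.step += 1
            return torch.cat([x, feats.flatten(1)], dim=-1)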

    opened by cyw-3d 1
Owner
Threedle (University of Chicago)