SMPLpix: Neural Avatars from 3D Human Models

Overview
subject0_validation_poses.mp4

Left: SMPL-X human mesh registered with SMPLify-X, middle: SMPLpix render, right: ground truth video.


The SMPLpix neural rendering framework combines deformable 3D models such as SMPL-X with the power of image-to-image translation frameworks (a.k.a. pix2pix models).

Please check our WACV 2021 paper or a 5-minute explanatory video for more details on the framework.

Important note: this repository is a re-implementation of the original framework, made by the same author after the end of the internship. It does not contain the original Amazon multi-subject, multi-view training data and code, and it uses full mesh rasterizations as inputs rather than point projections (as described here).
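In code terms, the core idea is an image-to-image translation network that maps a rasterized SMPL-X render to a photorealistic frame. The sketch below is purely illustrative and is not the actual SMPLpix architecture; all layer widths and names are placeholders.

import torch
import torch.nn as nn

# Illustrative stand-in for a pix2pix-style generator: it takes a
# rasterized mesh image and outputs an RGB frame of the same size.
class TinyTranslator(nn.Module):
    def __init__(self, in_ch=3, out_ch=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, width, 4, stride=2, padding=1),   # downsample
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(width, out_ch, 4, stride=2, padding=1),  # upsample
            nn.Sigmoid(),  # RGB output in [0, 1]
        )

    def forward(self, mesh_render):
        return self.net(mesh_render)

# mesh render in, photo-like frame out (spatial size preserved)
x = torch.rand(1, 3, 256, 256)
assert TinyTranslator()(x).shape == x.shape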

Demo

Process a video into a SMPLpix dataset: Open In Colab
Train SMPLpix: Open In Colab

Prepare the data

demo_openpose_simplifyx

We provide a Colab notebook for preparing a SMPLpix training dataset. This will allow you to create your own neural avatar given a monocular video of a human moving in front of the camera.
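Under the hood, the first step of dataset preparation flattens the input video into individual frames. A minimal sketch of that step, assuming ffmpeg is installed (the paths are placeholders, and the exact notebook command may differ):

import os
import subprocess

video_path = "video.mp4"   # placeholder: your monocular input video
frames_dir = "frames"      # placeholder: output directory for frames
os.makedirs(frames_dir, exist_ok=True)

# extract frames as zero-padded PNGs (00001.png, 00002.png, ...)
subprocess.run(
    ["ffmpeg", "-i", video_path, os.path.join(frames_dir, "%05d.png")],
    check=True,
)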

Run demo training

We provide some preprocessed data which allows you to run and test the training pipeline right away:

git clone https://github.com/sergeyprokudin/smplpix
cd smplpix
python setup.py install
python smplpix/train.py --workdir='/content/smplpix_logs/' \
                        --data_url='https://www.dropbox.com/s/coapl05ahqalh09/smplpix_data_test_final.zip?dl=0'

Train on your own data

You can train SMPLpix on your own data by specifying the path to the root directory with data:

python smplpix/train.py --workdir='/content/smplpix_logs/' \
                        --data_dir='/path/to/data'

The directory should contain train, validation and test folders, each of which should contain input and output folders. Check the structure of the demo dataset for reference.
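A minimal sketch that creates this layout (the root path is a placeholder, and pairing input/output images by file name is an assumption based on the demo dataset):

import os

data_root = "/path/to/data"  # placeholder
for split in ("train", "validation", "test"):
    for sub in ("input", "output"):
        os.makedirs(os.path.join(data_root, split, sub), exist_ok=True)

# input/ holds the rendered mesh images; output/ holds the matching
# ground-truth frames.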

You can also specify various training parameters via the command line. E.g., to reproduce the results of the demo video:

python smplpix/train.py --workdir='/content/smplpix_logs/' \
                        --data_url='https://www.dropbox.com/s/coapl05ahqalh09/smplpix_data_test_final.zip?dl=0' \
                        --downsample_factor=2 \
                        --n_epochs=500 \
                        --sched_patience=2 \
                        --batch_size=4 \
                        --n_unet_blocks=5 \
                        --n_input_channels=3 \
                        --n_output_channels=3 \
                        --eval_every_nth_epoch=10

Check args.py for the full list of parameters.

More examples

Animating with novel poses

subject0_test_poses.mp4

Left: poses from the test video sequence, right: SMPLpix renders.

Rendering faces

deca_smplpix_test_renders.mp4

Left: FLAME face model inferred with DECA, middle: ground truth test video, right: SMPLpix render.

Thanks to Maria Paola Forte for providing the sequence.

Few-shot artistic neural style transfer

kabarov_animations.mp4

Left: rendered AMASS motion sequence, right: generated SMPLpix animations. See the explanatory video for details.

Credits to Alexander Kabarov for providing the training sketches.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{prokudin2021smplpix,
  title={SMPLpix: Neural Avatars from 3D Human Models},
  author={Prokudin, Sergey and Black, Michael J and Romero, Javier},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={1810--1819},
  year={2021}
}

License

See the LICENSE file.

Comments
  • Use MediaPipe pose instead of OpenPose

    Use MediaPipe pose instead of OpenPose, because building OpenPose throws an error:

    https://colab.research.google.com/drive/1uCuA6We9T5r0WljspEHWPHXCT_2bMKUy

    https://google.github.io/mediapipe/solutions/holistic

    opened by 1kaiser 10
  • 'Flatten the video into frames' error message

    Hello Sergey and thanks for the amazing work you did.

    I was trying to run your Colab and was able to upload my video (from Google Drive), but when I got to the second part ("Flatten the video into frames") it fails with the error message below. Thanks in advance for your help:

    ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
      built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
      configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libavresample   3.  7.  0 /  3.  7.  0
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    : No such file or directory


    IndexError                                Traceback (most recent call last)
    <ipython-input> in <module>()
         23   return np.asarray(Image.open(img_path))/255
         24
    ---> 25 test_img_path = os.path.join(FRAMES_DIR, os.listdir(FRAMES_DIR)[0])
         26
         27 test_img = load_img(test_img_path)

    IndexError: list index out of range

    opened by arnaudskyvr 6
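    For context, the traceback above is the downstream symptom of the ffmpeg step producing no frames: listing an empty FRAMES_DIR and indexing the first entry raises exactly this IndexError. A minimal self-contained repro:

    import os
    import tempfile

    FRAMES_DIR = tempfile.mkdtemp()  # stands in for an empty frames directory
    try:
        first_frame = os.listdir(FRAMES_DIR)[0]
    except IndexError as err:
        print(err)  # list index out of range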
  • Get 3D textured model

    How do I get the corresponding SMPL textured model for my video input, with vertex colors or a UV map? As in #16, where do I get smplx_verts_colors.txt and the output?

    opened by kashyappiyush1998 5
  • IndexError when flattening the video into frames

    Hi @sergeyprokudin. This is a similar error, but it should come from a different source. I checked in the previous cell that the video uploaded correctly. Any advice on this?
    `ffmpeg version 3.4.11-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
      built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
      configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libavresample   3.  7.  0 /  3.  7.  0
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/Salvadorian_1.mp4':
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: mp42mp41
        creation_time   : 2021-08-25T23:28:15.000000Z
      Duration: 00:00:31.51, start: 0.000000, bitrate: 10955 kb/s
        Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 10646 kb/s, 59.94 fps, 59.94 tbr, 60k tbn, 119.88 tbc (default)
        Metadata:
          creation_time   : 2021-08-25T23:28:15.000000Z
          handler_name    : Alias Data Handler
          encoder         : AVC Coding
        Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
        Metadata:
          creation_time   : 2021-08-25T23:28:15.000000Z
          handler_name    : Alias Data Handler
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> png (native))
    Press [q] to stop, [?] for help
    Output #0, image2, to '$FRAMES_DIR/%05d.png':
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: mp42mp41
        encoder         : Lavf57.83.100
        Stream #0:0(eng): Video: png, rgb24, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 1 fps, 1 tbn, 1 tbc (default)
        Metadata:
          creation_time   : 2021-08-25T23:28:15.000000Z
          handler_name    : Alias Data Handler
          encoder         : Lavc57.107.100 png
    [image2 @ 0x55a7d606be00] Could not open file : $FRAMES_DIR/00001.png
    av_interleaved_write_frame(): Input/output error
    frame=    4 fps=0.0 q=-0.0 Lsize=N/A time=00:00:01.00 bitrate=N/A speed=1.64x    
    video:1475kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
    Conversion failed!
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    [<ipython-input-5-73a0be7e8d8e>](https://localhost:8080/#) in <module>
         23   return np.asarray(Image.open(img_path))/255
         24 
    ---> 25 test_img_path = os.path.join(FRAMES_DIR, os.listdir(FRAMES_DIR)[0])
         26 
         27 test_img = load_img(test_img_path)
    
    IndexError: list index out of range`
    

    Originally posted by @ss8319 in https://github.com/sergeyprokudin/smplpix/issues/15#issuecomment-1321855384

    opened by ss8319 4
  • Runtime error while training on a custom video

    Hey, I was trying to train the model using a custom dataset generated from my video, and I'm getting this runtime error:

    starting training...
    /usr/local/lib/python3.7/dist-packages/torchvision/models/_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
      f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
    /usr/local/lib/python3.7/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG16_Weights.IMAGENET1K_V1. You can also use weights=VGG16_Weights.DEFAULT to get the most up-to-date weights.
      warnings.warn(msg)
    Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /root/.cache/torch/hub/checkpoints/vgg16-397923af.pth
    100% 528M/528M [00:55<00:00, 9.98MB/s]
      0% 0/50 [00:08<?, ?it/s]
    Traceback (most recent call last):
      File "smplpix/train.py", line 129, in <module>
        main()
      File "smplpix/train.py", line 112, in main
        init_lr=args.learning_rate)
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/training.py", line 30, in train
        for batch_idx, (x, ytrue, img_names) in enumerate(train_dataloader):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 652, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 692, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
        return self.collate_fn(data)
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 175, in default_collate
        return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 175, in <listcomp>
        return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 141, in default_collate
        return torch.stack(batch, 0, out=out)
    RuntimeError: stack expects each tensor to be equal size, but got [3, 270, 300] at entry 0 and [3, 149, 84] at entry 1

    Any specific reason behind it?

    opened by Alwinseb01 4
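    For context, PyTorch's default_collate stacks each batch with torch.stack, which requires every image tensor in the batch to have the same shape. A minimal repro of that constraint, using the two shapes from the traceback:

    import torch

    a = torch.zeros(3, 270, 300)
    b = torch.zeros(3, 149, 84)
    try:
        torch.stack([a, b], 0)
    except RuntimeError as err:
        print(err)  # stack expects each tensor to be equal size ...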
  • Unable to upload video in Colab demo

    @sergeyprokudin Hi, thanks for releasing the code. I tried to use the Colab to get an understanding of how the code works, but it seems I am unable to upload a YouTube video. When I pass the YouTube URL https://www.youtube.com/watch?v=gFr0_ywVdhY, it gives me an error at YOTUBE_VIDEO_URL = #@param. The same holds when I select Google Drive / upload my own video. Any idea why this is happening?

    Also, as mentioned in issue #4, is it possible to use MediaPipe instead of OpenPose? If yes, I would really appreciate suggestions on how to use the MediaPipe results, because the outputs of MediaPipe and OpenPose are different. Thanks!

    opened by sparshgarg23 4
  • Colorful mesh input

    Hi, I'm having a hard time understanding how to export the colored mesh from SMPLify-X. I succeeded in exporting the .obj file and the mesh overlay image, but not the colored mesh that SMPLpix trains on. Can you shed light on how to do it?

    opened by korenleven 3
  • Render the SCALE output

    Hi, I have a sequence of .ply files of dense point sets (the output of the SCALE paper). How can I render them with SMPLpix to produce real images? The paper mentions that this is possible, so any help would be appreciated.

    opened by omarmohamed101 3
  • Value error while training the model

    Hey there! While running the script to train the model on the demo dataset, I'm facing a value error:

    Traceback (most recent call last):
      File "smplpix/train.py", line 129, in <module>
        main()
      File "smplpix/train.py", line 112, in main
        init_lr=args.learning_rate)
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/training.py", line 30, in train
        for batch_idx, (x, ytrue, img_names) in enumerate(train_dataloader):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 652, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 692, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/dataset.py", line 96, in __getitem__
        x, y = self._augment_images(x, y)
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/dataset.py", line 77, in _augment_images
        shear=0, fill=self.input_fill_color)
      File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional.py", line 1204, in affine
        return F_pil.affine(img, matrix=matrix, interpolation=pil_interpolation, fill=fill)
      File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_pil.py", line 326, in affine
        opts = _parse_fill(fill, img)
      File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_pil.py", line 309, in _parse_fill
        fill = int(fill)
    ValueError: invalid literal for int() with base 10: 'white'

    Any help regarding the same would be highly appreciated. Thanks!

    opened by Alwinseb01 2
  • Training error

    Traceback (most recent call last):
      File "D:\Users\17718\Desktop\smplpix-main\build\lib\smplpix\train.py", line 129, in <module>
        main()
      File "D:\Users\17718\Desktop\smplpix-main\build\lib\smplpix\train.py", line 64, in main
        train_dir = os.path.join(args.data_dir, 'train')
      File "C:\Users\17718\.conda\envs\py\lib\ntpath.py", line 78, in join
        path = os.fspath(path)
    TypeError: expected str, bytes or os.PathLike object, not NoneType

    opened by szh-1598 2
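    For context, this traceback is what os.path.join produces when train.py is launched without --data_dir, so args.data_dir is None. A minimal repro:

    import os

    try:
        os.path.join(None, "train")
    except TypeError as err:
        print(err)  # expected str, bytes or os.PathLike object, not NoneType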
  • Some problems with custom data!

    Thank you for your great work! When I train on my own data, I find that there are always color/darkness inconsistencies between the rendered images and the ground truth (see attached frame 000537). I also trained on the demo data you provided and met the same problem (attached frame 00000).

    opened by JanaldoChen 2
  • Rendering meshes

    Hey, first of all, great work!

    I'm getting issues when trying to run SMPLify-X: the program processes the input, but the output is only the mesh file, not an image render that can be used for training. How do I solve this?

    opened by AIMads 2