SMPLpix: Neural Avatars from 3D Human Models

Overview
subject0_validation_poses.mp4

Left: SMPL-X human mesh registered with SMPLify-X, middle: SMPLpix render, right: ground truth video.


The SMPLpix neural rendering framework combines deformable 3D models such as SMPL-X with the power of image-to-image translation frameworks (a.k.a. pix2pix models).

Please check our WACV 2021 paper or a 5-minute explanatory video for more details on the framework.

Important note: this repository is a re-implementation of the original framework, made by the same author after the end of the internship. It does not contain the original Amazon multi-subject, multi-view training data and code, and it uses full mesh rasterizations as inputs rather than point projections (as described here).
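In code terms, the core idea is an image-to-image translation network that maps a rasterized SMPL-X render to a photorealistic frame. The sketch below is purely illustrative and is not the actual SMPLpix architecture; all layer widths and names are placeholders.

import torch
import torch.nn as nn

# Illustrative stand-in for a pix2pix-style generator: it takes a
# rasterized mesh image and outputs an RGB frame of the same size.
class TinyTranslator(nn.Module):
    def __init__(self, in_ch=3, out_ch=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, width, 4, stride=2, padding=1),   # downsample
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(width, out_ch, 4, stride=2, padding=1),  # upsample
            nn.Sigmoid(),  # RGB output in [0, 1]
        )

    def forward(self, mesh_render):
        return self.net(mesh_render)

# mesh render in, photo-like frame out (spatial size preserved)
x = torch.rand(1, 3, 256, 256)
assert TinyTranslator()(x).shape == x.shape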

Demo

Process a video into a SMPLpix dataset: Open In Colab
Train SMPLpix: Open In Colab

Prepare the data

demo_openpose_simplifyx

We provide a Colab notebook for preparing a SMPLpix training dataset. This will allow you to create your own neural avatar given a monocular video of a human moving in front of the camera.
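Under the hood, the first step of dataset preparation flattens the input video into individual frames. A minimal sketch of that step, assuming ffmpeg is installed (the paths are placeholders, and the exact notebook command may differ):

import os
import subprocess

video_path = "video.mp4"   # placeholder: your monocular input video
frames_dir = "frames"      # placeholder: output directory for frames
os.makedirs(frames_dir, exist_ok=True)

# extract frames as zero-padded PNGs (00001.png, 00002.png, ...)
subprocess.run(
    ["ffmpeg", "-i", video_path, os.path.join(frames_dir, "%05d.png")],
    check=True,
)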

Run demo training

We provide some preprocessed data which allows you to run and test the training pipeline right away:

git clone https://github.com/sergeyprokudin/smplpix
cd smplpix
python setup.py install
python smplpix/train.py --workdir='/content/smplpix_logs/' \
                        --data_url='https://www.dropbox.com/s/coapl05ahqalh09/smplpix_data_test_final.zip?dl=0'

Train on your own data

You can train SMPLpix on your own data by specifying the path to the root directory with data:

python smplpix/train.py --workdir='/content/smplpix_logs/' \
                        --data_dir='/path/to/data'

The directory should contain train, validation and test folders, each of which should contain input and output folders. Check the structure of the demo dataset for reference.
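A minimal sketch that creates this layout (the root path is a placeholder, and pairing input/output images by file name is an assumption based on the demo dataset):

import os

data_root = "/path/to/data"  # placeholder
for split in ("train", "validation", "test"):
    for sub in ("input", "output"):
        os.makedirs(os.path.join(data_root, split, sub), exist_ok=True)

# input/ holds the rendered mesh images; output/ holds the matching
# ground-truth frames.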

You can also specify various training parameters via the command line. E.g., to reproduce the results of the demo video:

python smplpix/train.py --workdir='/content/smplpix_logs/' \
                        --data_url='https://www.dropbox.com/s/coapl05ahqalh09/smplpix_data_test_final.zip?dl=0' \
                        --downsample_factor=2 \
                        --n_epochs=500 \
                        --sched_patience=2 \
                        --batch_size=4 \
                        --n_unet_blocks=5 \
                        --n_input_channels=3 \
                        --n_output_channels=3 \
                        --eval_every_nth_epoch=10

Check args.py for the full list of parameters.

More examples

Animating with novel poses

subject0_test_poses.mp4

Left: poses from the test video sequence, right: SMPLpix renders.

Rendering faces

deca_smplpix_test_renders.mp4

Left: FLAME face model inferred with DECA, middle: ground truth test video, right: SMPLpix render.

Thanks to Maria Paola Forte for providing the sequence.

Few-shot artistic neural style transfer

kabarov_animations.mp4

Left: rendered AMASS motion sequence, right: generated SMPLpix animations. See the explanatory video for details.

Credits to Alexander Kabarov for providing the training sketches.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{prokudin2021smplpix,
  title={SMPLpix: Neural Avatars from 3D Human Models},
  author={Prokudin, Sergey and Black, Michael J and Romero, Javier},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={1810--1819},
  year={2021}
}

License

See the LICENSE file.

Comments
  • Use MediaPipe pose instead of OpenPose

    Use MediaPipe pose instead of OpenPose, because building OpenPose throws an error:

    https://colab.research.google.com/drive/1uCuA6We9T5r0WljspEHWPHXCT_2bMKUy

    https://google.github.io/mediapipe/solutions/holistic

    opened by 1kaiser 10
  • 'Flatten the video into frames' error message

    Hello Sergey and thanks for the amazing work you did.

    I was trying to run your Colab and was able to upload my video (from Google Drive), but when I got to the second part ("Flatten the video into frames") it fails with the error message below. Thanks in advance for your help:

    ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
      built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
      configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libavresample   3.  7.  0 /  3.  7.  0
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    : No such file or directory


    IndexError                                Traceback (most recent call last)
    <ipython-input> in <module>()
         23   return np.asarray(Image.open(img_path))/255
         24
    ---> 25 test_img_path = os.path.join(FRAMES_DIR, os.listdir(FRAMES_DIR)[0])
         26
         27 test_img = load_img(test_img_path)

    IndexError: list index out of range

    opened by arnaudskyvr 6
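    For context, the traceback above is the downstream symptom of the ffmpeg step producing no frames: listing an empty FRAMES_DIR and indexing the first entry raises exactly this IndexError. A minimal self-contained repro:

    import os
    import tempfile

    FRAMES_DIR = tempfile.mkdtemp()  # stands in for an empty frames directory
    try:
        first_frame = os.listdir(FRAMES_DIR)[0]
    except IndexError as err:
        print(err)  # list index out of range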
  • Get 3D textured model

    How do I get the corresponding SMPL textured model for my video input, with vertex colors or a UV map? As in #16, where do I get smplx_verts_colors.txt and the output?

    opened by kashyappiyush1998 5
  • IndexError when flattening the video into frames

    Hi @sergeyprokudin. This is a similar error, but it should come from a different source. I checked in the previous cell that the video uploaded correctly. Any advice on this?
    `ffmpeg version 3.4.11-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
      built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
      configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libavresample   3.  7.  0 /  3.  7.  0
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/Salvadorian_1.mp4':
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: mp42mp41
        creation_time   : 2021-08-25T23:28:15.000000Z
      Duration: 00:00:31.51, start: 0.000000, bitrate: 10955 kb/s
        Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 10646 kb/s, 59.94 fps, 59.94 tbr, 60k tbn, 119.88 tbc (default)
        Metadata:
          creation_time   : 2021-08-25T23:28:15.000000Z
          handler_name    : Alias Data Handler
          encoder         : AVC Coding
        Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
        Metadata:
          creation_time   : 2021-08-25T23:28:15.000000Z
          handler_name    : Alias Data Handler
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> png (native))
    Press [q] to stop, [?] for help
    Output #0, image2, to '$FRAMES_DIR/%05d.png':
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: mp42mp41
        encoder         : Lavf57.83.100
        Stream #0:0(eng): Video: png, rgb24, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 1 fps, 1 tbn, 1 tbc (default)
        Metadata:
          creation_time   : 2021-08-25T23:28:15.000000Z
          handler_name    : Alias Data Handler
          encoder         : Lavc57.107.100 png
    [image2 @ 0x55a7d606be00] Could not open file : $FRAMES_DIR/00001.png
    av_interleaved_write_frame(): Input/output error
    frame=    4 fps=0.0 q=-0.0 Lsize=N/A time=00:00:01.00 bitrate=N/A speed=1.64x    
    video:1475kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
    Conversion failed!
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    [<ipython-input-5-73a0be7e8d8e>](https://localhost:8080/#) in <module>
         23   return np.asarray(Image.open(img_path))/255
         24 
    ---> 25 test_img_path = os.path.join(FRAMES_DIR, os.listdir(FRAMES_DIR)[0])
         26 
         27 test_img = load_img(test_img_path)
    
    IndexError: list index out of range`
    

    Originally posted by @ss8319 in https://github.com/sergeyprokudin/smplpix/issues/15#issuecomment-1321855384

    opened by ss8319 4
  • Runtime error while training on a custom video

    Hey, I was trying to train the model using a custom dataset generated from my video, and I'm getting this runtime error:

    starting training...
    /usr/local/lib/python3.7/dist-packages/torchvision/models/_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
      f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
    /usr/local/lib/python3.7/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG16_Weights.IMAGENET1K_V1. You can also use weights=VGG16_Weights.DEFAULT to get the most up-to-date weights.
      warnings.warn(msg)
    Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /root/.cache/torch/hub/checkpoints/vgg16-397923af.pth
    100% 528M/528M [00:55<00:00, 9.98MB/s]
      0% 0/50 [00:08<?, ?it/s]
    Traceback (most recent call last):
      File "smplpix/train.py", line 129, in <module>
        main()
      File "smplpix/train.py", line 112, in main
        init_lr=args.learning_rate)
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/training.py", line 30, in train
        for batch_idx, (x, ytrue, img_names) in enumerate(train_dataloader):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 652, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 692, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
        return self.collate_fn(data)
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 175, in default_collate
        return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 175, in <listcomp>
        return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 141, in default_collate
        return torch.stack(batch, 0, out=out)
    RuntimeError: stack expects each tensor to be equal size, but got [3, 270, 300] at entry 0 and [3, 149, 84] at entry 1

    Any specific reason behind it?

    opened by Alwinseb01 4
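    For context, PyTorch's default_collate stacks each batch with torch.stack, which requires every image tensor in the batch to have the same shape. A minimal repro of that constraint, using the two shapes from the traceback:

    import torch

    a = torch.zeros(3, 270, 300)
    b = torch.zeros(3, 149, 84)
    try:
        torch.stack([a, b], 0)
    except RuntimeError as err:
        print(err)  # stack expects each tensor to be equal size ...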
  • Unable to upload video in Colab demo

    @sergeyprokudin Hi, thanks for releasing the code. I tried to use the Colab to get an understanding of how the code works, but it seems I am unable to upload a YouTube video. When I pass the YouTube URL https://www.youtube.com/watch?v=gFr0_ywVdhY, it gives me an error at YOTUBE_VIDEO_URL = #@param. The same holds when I select Google Drive / upload my own video. Any idea why this is happening?

    Also, as mentioned in issue #4, is it possible to use MediaPipe instead of OpenPose? If yes, I would really appreciate suggestions on how to use the MediaPipe results, because the outputs of MediaPipe and OpenPose are different. Thanks!

    opened by sparshgarg23 4
  • Colorful mesh input

    Hi, I'm having a hard time understanding how to export the colored mesh from SMPLify-X. I succeeded in exporting the .obj file and the mesh overlay image, but not the colored mesh that SMPLpix trains on. Can you shed light on how to do it?

    opened by korenleven 3
  • Render the SCALE output

    Hi, I have a sequence of .ply files of dense point sets (the output of the SCALE paper). How can I render them with SMPLpix to produce real images? The paper mentions that this is possible, so any help would be appreciated.

    opened by omarmohamed101 3
  • Value error while training the model

    Hey there! While running the script to train the model on the demo dataset, I'm facing a value error:

    Traceback (most recent call last):
      File "smplpix/train.py", line 129, in <module>
        main()
      File "smplpix/train.py", line 112, in main
        init_lr=args.learning_rate)
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/training.py", line 30, in train
        for batch_idx, (x, ytrue, img_names) in enumerate(train_dataloader):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 652, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 692, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/dataset.py", line 96, in __getitem__
        x, y = self._augment_images(x, y)
      File "/usr/local/lib/python3.7/dist-packages/smplpix-1.0-py3.7.egg/smplpix/dataset.py", line 77, in _augment_images
        shear=0, fill=self.input_fill_color)
      File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional.py", line 1204, in affine
        return F_pil.affine(img, matrix=matrix, interpolation=pil_interpolation, fill=fill)
      File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_pil.py", line 326, in affine
        opts = _parse_fill(fill, img)
      File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_pil.py", line 309, in _parse_fill
        fill = int(fill)
    ValueError: invalid literal for int() with base 10: 'white'

    Any help regarding the same would be highly appreciated. Thanks!

    opened by Alwinseb01 2
  • Training error

    Traceback (most recent call last):
      File "D:\Users\17718\Desktop\smplpix-main\build\lib\smplpix\train.py", line 129, in <module>
        main()
      File "D:\Users\17718\Desktop\smplpix-main\build\lib\smplpix\train.py", line 64, in main
        train_dir = os.path.join(args.data_dir, 'train')
      File "C:\Users\17718\.conda\envs\py\lib\ntpath.py", line 78, in join
        path = os.fspath(path)
    TypeError: expected str, bytes or os.PathLike object, not NoneType

    opened by szh-1598 2
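    For context, this traceback is what os.path.join produces when train.py is launched without --data_dir, so args.data_dir is None. A minimal repro:

    import os

    try:
        os.path.join(None, "train")
    except TypeError as err:
        print(err)  # expected str, bytes or os.PathLike object, not NoneType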
  • Some problems with custom data!

    Thank you for your great work! When I train on my own data, I find that there are always color/darkness inconsistencies between the rendered images and the ground truth (see attached frame 000537). I also trained on the demo data you provided and met the same problem (attached frame 00000).

    opened by JanaldoChen 2
  • Rendering meshes

    Hey, first of all, great work!

    I'm getting issues when trying to run SMPLify-X: the program processes the input, but the output is only the mesh file, not an image render that can be used for training. How do I solve this?

    opened by AIMads 2