Transfer Learning for Pose Estimation of Illustrated Characters

Shuhong Chen

Last update: Dec 28, 2022

Related tags

Deep Learning bizarre-pose-estimator

Overview

bizarre-pose-estimator

Transfer Learning for Pose Estimation of Illustrated Characters
Shuhong Chen *, Matthias Zwicker *
WACV2022
[arxiv] [video] [poster] [github]

Human pose information is a critical component in many downstream image processing tasks, such as activity recognition and motion tracking. Likewise, a pose estimator for the illustrated character domain would provide a valuable prior for assistive content creation tasks, such as reference pose retrieval and automatic character animation. But while modern data-driven techniques have substantially improved pose estimation performance on natural images, little work has been done for illustrations. In our work, we bridge this domain gap by efficiently transfer-learning from both domain-specific and task-specific source models. Additionally, we upgrade and expand an existing illustrated pose estimation dataset, and introduce two new datasets for classification and segmentation subtasks. We then apply the resultant state-of-the-art character pose estimator to solve the novel task of pose-guided illustration retrieval. All data, models, and code will be made publicly available.

download

Downloads can be found in this drive folder: wacv2022_bizarre_pose_estimator_release

Download bizarre_pose_models.zip and extract to the root project directory; the extracted file structure should merge with the ones in this repo.
Download bizarre_pose_dataset.zip and extract to ./_data. The images and annotations should be at ./_data/bizarre_pose_dataset/raw.
Download character_bg_seg_data.zip and extract to ./_data. Under ./_data/character_bg_seg, there are bg and fg folders. All foregrounds come from danbooru, and are indexed by the provided csv. While some backgrounds come from danbooru, we use several from jerryli27/pixiv_dataset; these are somewhat hard to download, so we provide the raw pixiv images in the zip.
Please refer to Gwern's Danbooru dataset to download danbooru images by ID.

Warning: While NSFW art was filtered out from these data by tag, it was not possible to manually inspect all the data for mislabeled safety ratings. Please use this data at your own risk.

setup

Make a copy of ./_env/machine_config.bashrc.template to ./_env/machine_config.bashrc, and set $PROJECT_DN to the absolute path of this repository folder. The other variables are optional.

This project requires docker with a GPU. Run these lines from the project directory to pull the image and enter a container; note these are bash scripts inside the ./make folder, not make commands. Alternatively, you can build the docker image yourself.

make/docker_pull
make/shell_docker
# OR
make/docker_build
make/shell_docker

danbooru tagging

The danbooru subset used to train the tagger and custom tag rulebook can be found under ./_data/danbooru/_filters. Run this line to tag a sample image:

python3 -m _scripts.danbooru_tagger ./_samples/megumin.png

character background segmentation

Run this line to segment a sample image and extract the bounding box:

python3 -m _scripts.character_segmenter ./_samples/megumin.png

pose estimation

There are several models available in ./_train/character_pose_estim/runs, corresponding to our models at the top of Table 1 in the paper. Run this line to estimate the pose of a sample image, using one of those models:

python3 -m _scripts.pose_estimator \
    ./_samples/megumin.png \
    ./_train/character_pose_estim/runs/feat_concat+data.ckpt

pose-based retrieval

Run this line to estimate the pose of a sample image, and get links to danbooru posts with similar poses:

python3 -m _scripts.pose_retrieval ./_samples/megumin.png

faq

Does this work for multiple characters in an image, or images that aren't full-body? Sorry but no, this project is focused just on single full-body characters; however we may release our instance-based models separately.
Can I do this without docker? Please use docker, it is very good. If you can't use docker, you can try to replicate the environment from ./_env/Dockerfile, but this is untested.
What does bn mean in the files/code? It's sort for "basename", or an ID for a single data sample.
What is the sauce for the artwork in ./_samples? Full artist attributions are in the supplementary of our paper, Tables 2 and 3; the retrieval figure is the first two rows of Fig. 2, and Megumin is entry (1,0) of Fig. 3.
Which part is best? Part 4.

Comments

Confusion

Not familiar with docker. Do I need to install docker first before following all the setup steps?

BTW, how to run this program on servers without root access, I tried the code: make/docker_pull make/shell_docker but permission denied

thanks : )

opened by zm-bisp 10
Question about generating the pose_descriptors for support set

Hi, thanks for your great work! The detection results are really amazing:smile: I have a question about the pose retrieval code. When you retrieval on the support set, the code load a pickle file. May I ask how did you generate this pickle file? Would you mind releasing the code for generating this pickle file?

https://github.com/ShuhongChen/bizarre-pose-estimator/blob/dace2253ee27ffcefbe7fa444dd88cc894cafd8e/_scripts/pose_retrieval.py#L126

Besides, I try to use real human pose to retrieval the support set. However, the detection result only has 17 keypoints that do not match the training data dimension(with 25 keypoint). Would you mind releasing the raw detection result for support set so that I can generate pose_descriptors with 17 keypoint by myself?

opened by mrbulb 3
annotations for custom training ?

Hello again ! I've been busy since my last message, on other projects (i'm still new in this field and python programming so everything takes forever), i'm now back to trying to make my ACGAN to generate manga pictures. (i'm pretty much a newbie in this field, i hope i won't ask stupid questions ! ) Your pre-trained model works wonderfully in most cases, but i'm getting a bit lost in your code, is there a fonction to record .json with keypoints after a detection somewhere ? Also, your model is trained on 25 keypoints right ? so there is no easy way (like a script already in the project ?) to actually generate results that could be manually corrected if a few keypoints are wrong and then used to re-train the model ? (your pre-trained model struggle on partial characters, like bust shots, and with foreshortening/strong perspectives, i guess it's not because of the architecture but just because of the dataset, so i'd like to train it a bit more with an extended dataset) Thank you !

opened by DHG-Dav 2
[Question] The license of the datasets

Hi!

Thank you for sharing impressive model.

I'm curious about the license of the datasets you provides. Is it AGPL3.0 too, or other license? For example, can I add any license to the models trained with your datasets?

Thank you!

opened by xiong-jie-y 2
When will the code & dataset be released ?

Hello, i've been hyped by your paper months ago, i'm working on a GAN to generate pictures based on tags/boxes/pose skeletons/segmentation and extra context, i was hoping to use your code / dataset rather than create one from scratch, but the time pass and i see no update, so i was wondering if you still plan to release these, and when ? Thank you !

opened by DHG-Dav 2

Owner

Shuhong Chen

GitHub

Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Code repository for the paper: PoseAug: A Differentiable Pose Augme

328 Dec 17, 2022

This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

SO-Pose This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation This paper is basically an

52 Nov 25, 2022

Python scripts for performing 3D human pose estimation using the Mobile Human Pose model in ONNX.

99 Dec 31, 2022

Web service for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation based on OpenFace 2.0

OpenGaze: Web Service for OpenFace Facial Behaviour Analysis Toolkit Overview OpenFace is a fantastic tool intended for computer vision and machine le

4 Nov 3, 2022

OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

5.8k Dec 31, 2022

Re-implementation of the Noise Contrastive Estimation algorithm for pyTorch, following "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models." (Gutmann and Hyvarinen, AISTATS 2010)

Noise Contrastive Estimation for pyTorch Overview This repository contains a re-implementation of the Noise Contrastive Estimation algorithm, implemen

42 Nov 24, 2022

Transfer-Learn is an open-source and well-documented library for Transfer Learning.

Transfer-Learn is an open-source and well-documented library for Transfer Learning. It is based on pure PyTorch with high performance and friendly API. Our code is pythonic, and the design is consistent with torchvision. You can easily develop new algorithms, or readily apply existing algorithms.

2.2k Jan 3, 2023

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019) News [2020/07/05] A very nice blog from Towards Data Science introd

3.9k Jan 5, 2023

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

78 Dec 1, 2022

Deep Learning Head Pose Estimation using PyTorch.

Hopenet is an accurate and easy to use head pose estimation network. Models have been trained on the 300W-LP dataset and have been tested on real data with good qualitative performance.

1.3k Dec 26, 2022

Deep High-Resolution Representation Learning for Human Pose Estimation

Deep High-Resolution Representation Learning for Human Pose Estimation (accepted to CVPR2019) News If you are interested in internship or research pos

167 Dec 27, 2022

《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

42 Nov 24, 2022

code for our paper "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer"

SHOT++ Code for our TPAMI submission "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer" that is ext

75 Dec 16, 2022

(ICCV 2021) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing."

Dressing in Order (DiOr) ?? [Paper] ?? [Webpage] ?? [Running this code] The official implementation of "Dressing in Order: Recurrent Person Image Gene

277 Dec 28, 2022

Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

Transfer Style API It's an API to use with Tranfer Style App, where you can use

1 Feb 13, 2022

Applicator Kit for Modo allow you to apply Apple ARKit Face Tracking data from your iPhone or iPad to your characters in Modo.

Applicator Kit for Modo Applicator Kit for Modo allow you to apply Apple ARKit Face Tracking data from your iPhone or iPad with a TrueDepth camera to

3 Aug 24, 2021

Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters"

Manga Character Screentone Synthesis Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters" presented in IEEE ISM 2

2 Nov 20, 2021

The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021] Release Notes The offical PyTorch implementation of NeMo, p

76 Nov 23, 2022

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

519 Dec 29, 2022

Transfer Learning for Pose Estimation of Illustrated Characters

Related tags

Overview

bizarre-pose-estimator

download

setup

danbooru tagging

character background segmentation

pose estimation

pose-based retrieval

faq

Comments

Confusion

Question about generating the pose_descriptors for support set

annotations for custom training ?

[Question] The license of the datasets

When will the code & dataset be released ?

Owner

Shuhong Chen

Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Python scripts for performing 3D human pose estimation using the Mobile Human Pose model in ONNX.

Web service for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation based on OpenFace 2.0

OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

Re-implementation of the Noise Contrastive Estimation algorithm for pyTorch, following "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models." (Gutmann and Hyvarinen, AISTATS 2010)

Transfer-Learn is an open-source and well-documented library for Transfer Learning.

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Deep Learning Head Pose Estimation using PyTorch.

Deep High-Resolution Representation Learning for Human Pose Estimation

《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

code for our paper "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer"

(ICCV 2021) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing."

Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

Applicator Kit for Modo allow you to apply Apple ARKit Face Tracking data from your iPhone or iPad to your characters in Modo.

Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters"

The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation