Deep Learning Head Pose Estimation using PyTorch.

Overview

Hopenet



Hopenet is an accurate and easy-to-use head pose estimation network. Models have been trained on the 300W-LP dataset and have been tested on real data with good qualitative performance.

For details about the method and quantitative results please check the CVPR Workshop paper.



new GoT trailer example video

new Conan-Cruise-Car example video

To use, please install PyTorch and OpenCV (for video); I believe that's all you need apart from the usual libraries such as numpy. You currently need a GPU to run Hopenet.
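If you are starting from scratch, something along these lines should cover it (package names are the standard PyPI ones, not pinned by this repo; pin versions as needed):

pip install numpy opencv-python torch torchvision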

To test on a video using dlib face detections (center of head will be jumpy):

python code/test_on_video_dlib.py --snapshot PATH_OF_SNAPSHOT --face_model PATH_OF_DLIB_MODEL --video PATH_OF_VIDEO --output_string STRING_TO_APPEND_TO_OUTPUT --n_frames N_OF_FRAMES_TO_PROCESS --fps FPS_OF_SOURCE_VIDEO

To test on a video using your own face detections (we recommend using dockerface, center of head will be smoother):

python code/test_on_video_dockerface.py --snapshot PATH_OF_SNAPSHOT --video PATH_OF_VIDEO --bboxes FACE_BOUNDING_BOX_ANNOTATIONS --output_string STRING_TO_APPEND_TO_OUTPUT --n_frames N_OF_FRAMES_TO_PROCESS --fps FPS_OF_SOURCE_VIDEO

Face bounding box annotations should be in Dockerface format (n_frame x_min y_min x_max y_max confidence).
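For reference, a single annotation line in that format might look like this (illustrative values only):

1 226.0 143.0 398.0 315.0 0.999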

Pre-trained models:

300W-LP, alpha 1

300W-LP, alpha 2

300W-LP, alpha 1, robust to image quality

For more information on what alpha stands for, please read the paper. The first two models are for validating the paper's results; for real data we suggest using the last model, as it is more robust to image quality and blur and gives good results on video.

Please open an issue if you have a problem.

Some very cool implementations of this work on other platforms by some cool people:

Gluon

MXNet

TensorFlow with Keras

A really cool lightweight version of HopeNet:

Deep Head Pose Light

If you find Hopenet useful in your research please cite:

@InProceedings{Ruiz_2018_CVPR_Workshops,
author = {Ruiz, Nataniel and Chong, Eunji and Rehg, James M.},
title = {Fine-Grained Head Pose Estimation Without Keypoints},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}

Nataniel Ruiz, Eunji Chong, James M. Rehg

Georgia Institute of Technology

Comments
  • The loss cannot get decreased during the training

    Hi natanielruiz:

    I was trying to reproduce your paper's results recently, but I found that the loss does not decrease when I train your model on the 300W_LP dataset. I used the same parameters you provided in your paper:

    alpha = 1, lr = 1e-5, and default parameters for the Adam optimizer.

    I ran your network for 25 epochs and the yaw loss oscillates around 3000, which means the MSE loss is still far too large for the yaw angle.

    Do you have any idea how to debug the network or solve this issue? Thank you very much for your help!

    opened by developer-mayuan 15
  • Having runtime error when train your Hopenet

    Hi natanielruiz:

    Firstly, I want to say thank you for your great work! I tested your pretrained model on my own dataset and it works great; the results are accurate and robust. I would now like to fine-tune your network with my own dataset, however, I found I cannot do it.

    I prepared the 300W_LP dataset and generated the file list based on the input expected by your code. (By the way, maybe you could provide the file-list generation code in your repository, which would make it self-contained.)

    Then, when I ran your train_hopenet.py code, I could sometimes get results for 1 or 2 epochs; however, it would always eventually give me the following error message:

    Loading data.
    Ready to train network.
    Epoch [1/5], Iter [100/7653] Losses: Yaw 4.5354, Pitch 4.0671, Roll 4.2844
    /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THCUNN/ClassNLLCriterion.cu:57: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.
    THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THCUNN/generic/ClassNLLCriterion.cu line=87 error=59 : device-side assert triggered
    Traceback (most recent call last):
      File "/home/foo/Academy/deep-head-pose/code/train_hopenet.py", line 166, in <module>
        alpha = args.alpha
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/modules/loss.py", line 482, in forward
        self.ignore_index)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/functional.py", line 746, in cross_entropy
        return nll_loss(log_softmax(input), target, weight, size_average, ignore_index)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/functional.py", line 672, in nll_loss
        return _functions.thnn.NLLLoss.apply(input, target, weight, size_average, ignore_index)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/_functions/thnn/auto.py", line 47, in forward
        output, *ctx.additional_args)
    RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THCUNN/generic/ClassNLLCriterion.cu:87
    

    I did some searching, and the most promising answer is at the following link: https://discuss.pytorch.org/t/runtimeerror-cuda-runtime-error-59-device-side-assert-triggered-at-opt-conda-conda-bld-pytorch-1503970438496-work-torch-lib-thc-generic-thcstorage-c-32/9669/5

    It seems like in some cases the target labels are out of bounds for the network's output classes. The following is my running environment:

    Python 2.7.14 (with Anaconda), using a conda virtual environment
    pytorch 0.2.0 py27hc03bea1_4cu80 [cuda80] soumith
    torchvision 0.1.9 py27hdb88a65_1 soumith

    I would like to know if you have met this kind of problem before, and whether you can give me some ideas about how to solve it? Thank you very much for your help!
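    A minimal sanity check for this failure mode (a sketch based on the binning in datasets.py, not code from the repo): the assert t >= 0 && t < n_classes fires when a binned label falls outside [0, 66), which happens for pose angles outside roughly [-99, 99] degrees.

    import numpy as np

    bins = np.array(range(-99, 102, 3))  # same bin edges as datasets.py

    def pose_in_range(yaw, pitch, roll):
        # Angles below -99 degrees digitize to bin -1, and angles at or above
        # 99 degrees to bin 66; both crash a 66-class nn.CrossEntropyLoss.
        labels = np.digitize([yaw, pitch, roll], bins) - 1
        return bool(np.all((labels >= 0) & (labels < 66)))

    Filtering the training file list so that every sample passes a check like this avoids the device-side assert.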

    opened by developer-mayuan 14
  • torch.autograd.backward(loss_seq, grad_seq)      RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]

    Hi: When I run train_hopenet.py, I get the error "RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]". Can you tell me how to solve it? Thanks very much!
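    A sketch of the fix commonly reported for this error (it appears on newer PyTorch versions, where losses are 0-dim tensors, so the seed gradients passed to backward must be 0-dim as well; names mirror train_hopenet.py):

    # torch.tensor(1.0) has shape [] and matches the 0-dim losses,
    # unlike torch.Tensor(1), which has shape [1].
    loss_seq = [loss_yaw, loss_pitch, loss_roll]
    grad_seq = [torch.tensor(1.0).cuda(gpu) for _ in loss_seq]
    torch.autograd.backward(loss_seq, grad_seq)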

    opened by gnnbest 12
  • Different results from the paper (AFW dataset)

    Hi, thank you for your great work. I checked the performance of the pretrained models (300W-LP, alpha 1 and 300W-LP, alpha 2); the results on AFLW2000 are the same as in the paper. But when I test the models on the AFW dataset, the results are very different. I wrote my own code to compute the discrete predictions, which rounds to the nearest 15 degrees; the yaw accuracy is only 21.15%, and the mean absolute error is 53.3674. So even if I made some mistake calculating the discrete predictions, the MAE of yaw seems too large. I am wondering which step I am missing to reproduce the result in the paper. (I have made sure the input format is the same as the one required by datasets.py.)
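    For reference, one straightforward way to compute the discrete accuracy described above (a sketch, not the repo's evaluation code; pred and gt are continuous angles in degrees):

    import numpy as np

    def discrete_hit(pred, gt, bin_width=15.0):
        # Round both angles to the nearest 15-degree bin and count a hit
        # when prediction and ground truth land in the same bin.
        return np.round(np.asarray(pred) / bin_width) == np.round(np.asarray(gt) / bin_width)

    # accuracy = discrete_hit(pred_yaws, gt_yaws).mean()  # hypothetical arrays of yaw angles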

    opened by invigen 12
  • FaceDetection+CustomTraining+Validation +Performance validation

    @natanielruiz Hi, thanks for the awesome work and for sharing. I just had a few queries:

    1. For face detection, should we always use the FRCNN, or is there any possibility of using some other detection technique?
    2. How can we validate the output parameters, i.e. the pitch, roll and yaw values obtained from the algorithm?
    3. Are the values of pitch, roll and yaw generated based on the center of the face detection?
    4. Can we use your algorithm for training on a custom dataset, or can the shared model only be used on generic datasets?
    5. For custom training, can you provide some reference steps to achieve this?
    6. I ran your code on the CPU and it is very slow. Is there any provision in the future for running it on the CPU?

    Thanks again for the awesome work and for sharing.

    opened by abhigoku10 9
  • The performance of the pretrained model you provided is somewhat different

    Hi, I tested the pretrained model (300W-LP, alpha 2) you provided on AFLW2000 and found that the performance is somewhat different from what is shown in the paper. Is this model the best one from your algorithm?

    opened by tfygg 9
  • test on AFLW2000 error is so large

    Hi, first of all, thanks very much for your great work! I have some questions about it:

    1. I implemented and trained a TensorFlow model following this work. When I test on the training dataset (300W-LP) the error is yaw=2.1, pitch=1.9, roll=1.7, but when I test the model on the test dataset (AFLW2000) the error is very large: yaw=35.3, pitch=11.4, roll=12.5. I can't find where the problem is.
    2. In the AFLW2000 dataset, some face points have coordinates of -1. How do you crop the face ROI used to test head pose?
    3. In 300W-LP, the large-pose faces have erroneous face points. How do you process this dataset?

    Thanks very much!

    opened by flyduck 8
  • why combined loss is better?

    I would like to get an answer to this question, but I can't find one in the article. In my opinion, the answer may be that the combined loss is better suited for training the model. Many comments point out that the regression loss is hard to train with, while the classification loss is easier. What do you think?

    opened by xubaoquan33 8
  • I made an error when I was executing Python

    File "code/test_on_video_dlib.py", line 7, in import torch File "/usr/local/lib/python2.7/dist-packages/torch/init.py", line 53, in from torch._C import * ImportError: dlopen: cannot load any more object with static TLS

    opened by ytgcljj 6
  • Format, PEPify and improve Python 2/3 compatibility

    This patch does not introduce any functionality changes. Instead, it focuses on PEP 8 compliance (mostly fixing line-width violations and unused imports) and Python 2/3 compatibility (mostly wrapping print statements in parentheses and using Python 3's range).

    opened by kirillbobyrev 5
  • SegmentationFault in GPU

    @natanielruiz Hi, I ran your code on the GPU but I get:

    Ready to test network. 1 Segmentation fault (core dumped)

    I am using the following command:

    python code/test_on_video_dlib.py --snapshot "/home/teai/abhilashsk/hpe/snapshot/hopenet_alpha1.pkl" --face_model "/home/teai/abhilashsk/hpe/model/mmod_human_face_detector.dat" --video "/home/teai/abhilashsk/hpe/Data/1-FemaleNoGlasses-Normal.mp4" --output_string "output" --n_frames 300 --fps 20

    The code is also not fully using the GPU. I think some code changes are needed for it to work on the GPU; can you please let me know what has to be done?

    opened by mayanks888 5
  • i have created a colab file to test it out quickly

    Hi,

    I request you to add this to the repo if you feel it is relevant. It uses MTCNN for face detection.

    https://colab.research.google.com/drive/1vvntbLyVxxBHoVN0e6-pfs7gB3pp-VUS?usp=sharing

    Thanks!

    opened by maylad31 0
  • AFLW labels are broken?

    I am completely confused. It seems like my labels are either -1 or bigger than num_classes, but I don't know why. The label files look correct, but everything breaks when the labels are computed during binning.

    I would be pleased to get some help. Thank you.

    opened by ilyii 1
  • softmax

    code/test_on_video_dlib.py:146: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      yaw_predicted = F.softmax(yaw)
    code/test_on_video_dlib.py:147: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      pitch_predicted = F.softmax(pitch)
    code/test_on_video_dlib.py:148: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      roll_predicted = F.softmax(roll)

    Should dim be 0, 1, or 2?
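    Assuming the (batch_size, 66) output shape discussed elsewhere in these issues, the bin dimension is 1, so a fix along these lines (a suggestion, not a change from the repository) should silence the warning:

    yaw_predicted = F.softmax(yaw, dim=1)
    pitch_predicted = F.softmax(pitch, dim=1)
    roll_predicted = F.softmax(roll, dim=1)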

    opened by song6cy 0
  • input sizes mismatch for nn.CrossEntropyLoss()

    Hi @natanielruiz, thanks for sharing your work.

    While trying to adapt it for a project, I stumbled upon a problem. In lines 160 to 161 of your train_hopenet.py file we have:

    # Forward pass
    yaw, pitch, roll = model(images)
    

    If I am not mistaken, the size of each angle predicted by the model is (batch_size, num_bins), for example (128, 66), which makes perfect sense because the fully connected layer has output size 66.

    While investigating the data handling in datasets.py, there is the following code block:

    # We get the pose in radians
    pose = utils.get_ypr_from_mat(mat_path)
    # And convert to degrees.
    pitch = pose[0] * 180 / np.pi
    yaw = pose[1] * 180 / np.pi
    roll = pose[2] * 180 / np.pi
    # Bin values
    bins = np.array(range(-99, 102, 3))
    labels = torch.LongTensor(np.digitize([yaw, pitch, roll], bins) - 1)
    

    Assuming that the head pose has 3 values, one for each angle, I would then get the bin of each angle in the labels variable, e.g. [30, 33, 33].

    The first code block is then combined with:

    label_yaw = Variable(labels[:,0]).cuda(gpu)
    label_pitch = Variable(labels[:,1]).cuda(gpu)
    label_roll = Variable(labels[:,2]).cuda(gpu)
    
    # Continuous labels
    label_yaw_cont = Variable(cont_labels[:,0]).cuda(gpu)
    label_pitch_cont = Variable(cont_labels[:,1]).cuda(gpu)
    label_roll_cont = Variable(cont_labels[:,2]).cuda(gpu)
    
    # Cross entropy loss
    loss_yaw = criterion(yaw, label_yaw)
    loss_pitch = criterion(pitch, label_pitch)
    loss_roll = criterion(roll, label_roll)
    

    with the criterion being nn.CrossEntropyLoss().cuda(gpu).

    This is where I get confused, because the sizes of the inputs do not seem to match: we have yaw with shape (128, 66), but label_yaw is of size (128, 1).

    Could you please tell me where I am going wrong? Any help is appreciated.
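    For what it's worth, nn.CrossEntropyLoss expects an input of shape (N, C) and a 1-D target of shape (N,) holding class indices, and labels[:, 0] produces exactly that 1-D (N,) tensor rather than (N, 1), so the shapes do line up. A minimal standalone check (not code from the repo):

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()
    yaw = torch.randn(128, 66)                 # (N, C) logits, one score per bin
    label_yaw = torch.randint(0, 66, (128,))   # (N,) class indices
    print(criterion(yaw, label_yaw))           # scalar loss, no size mismatch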

    Kind regards

    opened by ghost 1
  • rationale behind learning rate

    I was able to train your model on my own machine and get robust webcam estimations. I wonder what the rationale is behind disabling training (setting the learning rate to 0) for the first conv and bn of your ResNet backbone, and giving a 5x learning rate to the three fc layers? Also, would you suggest more epochs for a smaller model? I need to make this work on a peripheral device for work.
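    The pattern being asked about looks roughly like the per-parameter-group learning rates below (a sketch using a plain torchvision ResNet-50 as a stand-in for the Hopenet backbone; the repo splits parameters with its own helper functions):

    import torch
    import torchvision

    lr = 1e-5
    model = torchvision.models.resnet50(num_classes=66)

    # The ImageNet-pretrained stem (conv1 + bn1) is effectively frozen with
    # lr 0, the freshly initialized fc head learns 5x faster, and the rest
    # of the backbone trains at the base rate.
    stem = list(model.conv1.parameters()) + list(model.bn1.parameters())
    head = list(model.fc.parameters())
    excluded = {id(p) for p in stem} | {id(p) for p in head}
    body = [p for p in model.parameters() if id(p) not in excluded]

    optimizer = torch.optim.Adam([
        {'params': stem, 'lr': 0.0},
        {'params': body, 'lr': lr},
        {'params': head, 'lr': lr * 5},
    ], lr=lr)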

    Thanks very much. Great work.

    opened by simin75simin 0