Deep Learning Head Pose Estimation using PyTorch.

Overview

Hopenet



Hopenet is an accurate and easy-to-use head pose estimation network. Models have been trained on the 300W-LP dataset and tested on real data, with good qualitative performance.

For details about the method and quantitative results please check the CVPR Workshop paper.



new GoT trailer example video

new Conan-Cruise-Car example video

To use Hopenet, install PyTorch and OpenCV (for video). Apart from common libraries such as numpy, I believe that is all you need. A GPU is currently required to run Hopenet.

To test on a video using dlib face detections (center of head will be jumpy):

python code/test_on_video_dlib.py --snapshot PATH_OF_SNAPSHOT --face_model PATH_OF_DLIB_MODEL --video PATH_OF_VIDEO --output_string STRING_TO_APPEND_TO_OUTPUT --n_frames N_OF_FRAMES_TO_PROCESS --fps FPS_OF_SOURCE_VIDEO

To test on a video using your own face detections (we recommend dockerface; the center of the head will be smoother):

python code/test_on_video_dockerface.py --snapshot PATH_OF_SNAPSHOT --video PATH_OF_VIDEO --bboxes FACE_BOUNDING_BOX_ANNOTATIONS --output_string STRING_TO_APPEND_TO_OUTPUT --n_frames N_OF_FRAMES_TO_PROCESS --fps FPS_OF_SOURCE_VIDEO

Face bounding box annotations should be in Dockerface format (n_frame x_min y_min x_max y_max confidence).
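
As a minimal sketch of reading such a file, the snippet below groups detections by frame. It assumes one detection per line with whitespace-separated fields in the order given above; the function name `parse_bboxes` and the `min_confidence` parameter are hypothetical and not part of this repository.

```python
from collections import defaultdict


def parse_bboxes(lines, min_confidence=0.0):
    """Group 'n_frame x_min y_min x_max y_max confidence' rows by frame number.

    Assumes whitespace-separated fields, one detection per line.
    """
    boxes_by_frame = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed rows
        frame = int(fields[0])
        x_min, y_min, x_max, y_max, confidence = map(float, fields[1:])
        if confidence >= min_confidence:
            boxes_by_frame[frame].append((x_min, y_min, x_max, y_max, confidence))
    return dict(boxes_by_frame)


if __name__ == "__main__":
    sample = ["1 10 20 110 140 0.99", "1 300 40 380 150 0.42", "2 12 22 112 142 0.97"]
    for frame, boxes in sorted(parse_bboxes(sample, min_confidence=0.5).items()):
        print(frame, boxes)
```

Filtering on the confidence column is optional; it is shown here only because low-confidence detections tend to be the ones worth dropping before pose estimation.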

Pre-trained models:

300W-LP, alpha 1

300W-LP, alpha 2

300W-LP, alpha 1, robust to image quality

For more information on what alpha stands for, please read the paper. The first two models are for validating the paper's results. For real data we suggest the last model, as it is more robust to image quality and blur and gives good results on video.
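
As a rough illustration of where alpha enters, the sketch below assumes, following the cited paper, that each angle is predicted as a classification over angle bins and that alpha weights a squared-error term on the expected angle. The bin layout (66 bins of 3 degrees starting at -99) and all function names are assumptions made for illustration; they are not stated in this README.

```python
import math

# Assumed binning (not stated in this README): 66 bins, 3 degrees wide, from -99.
NUM_BINS = 66
BIN_WIDTH = 3.0
ANGLE_MIN = -99.0


def log_sum_exp(logits):
    """Numerically stable log(sum(exp(logits)))."""
    peak = max(logits)
    return peak + math.log(sum(math.exp(v - peak) for v in logits))


def expected_angle(logits):
    """Continuous angle in degrees: softmax expectation of the bin index."""
    norm = log_sum_exp(logits)
    mean_index = sum(i * math.exp(v - norm) for i, v in enumerate(logits))
    return mean_index * BIN_WIDTH + ANGLE_MIN


def combined_loss(logits, true_angle, alpha):
    """Cross-entropy on the true bin plus alpha times the squared angle error."""
    true_bin = int((true_angle - ANGLE_MIN) // BIN_WIDTH)
    if not 0 <= true_bin < NUM_BINS:
        raise ValueError("angle outside the binned range")
    cross_entropy = log_sum_exp(logits) - logits[true_bin]
    squared_error = (expected_angle(logits) - true_angle) ** 2
    return cross_entropy + alpha * squared_error
```

With alpha set to 0 this reduces to plain bin classification; larger values put more weight on the fine-grained angle error.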

Please open an issue if you have a problem.

Some very cool implementations of this work on other platforms by some cool people:

Gluon

MXNet

TensorFlow with Keras

A really cool lightweight version of HopeNet:

Deep Head Pose Light

If you find Hopenet useful in your research please cite:

@InProceedings{Ruiz_2018_CVPR_Workshops,
author = {Ruiz, Nataniel and Chong, Eunji and Rehg, James M.},
title = {Fine-Grained Head Pose Estimation Without Keypoints},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}

Nataniel Ruiz, Eunji Chong, James M. Rehg

Georgia Institute of Technology

Comments
  • The loss cannot get decreased during the training

    Hi natanielruiz:

    I have been trying to reproduce your paper's results in recent days, but I cannot get the loss to decrease when I train your model on the 300W_LP dataset. I used the same parameters you provided in your paper:

    alpha = 1, lr = 1e-5 and default parameters for the Adam optimizer.

    I ran your network for 25 epochs and the yaw loss oscillates around 3000, which means the MSE loss is still too large for the yaw angle.

    Do you have any idea how to debug the network or solve this issue? Thank you very much for your help!

    opened by developer-mayuan 15
  • Having runtime error when train your Hopenet

    Hi natanielruiz:

    Firstly, I want to say thank you for your great work! I tested your pretrained model on my own dataset and it works great. The result is accurate and robust. I would now like to fine-tune your network with my own dataset, but I found that I cannot do it.

    I prepared the 300W_LP dataset and generated the file list based on the input your code expects. (By the way, maybe you could provide the file list generation code in your repository, which would make it self-contained.)

    Then, when I ran your train_hopenet.py code, I could sometimes get results for 1 or 2 epochs, but it always ended up giving me the following error message:

    Loading data.
    Ready to train network.
    Epoch [1/5], Iter [100/7653] Losses: Yaw 4.5354, Pitch 4.0671, Roll 4.2844
    /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THCUNN/ClassNLLCriterion.cu:57: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.
    THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THCUNN/generic/ClassNLLCriterion.cu line=87 error=59 : device-side assert triggered
    Traceback (most recent call last):
      File "/home/foo/Academy/deep-head-pose/code/train_hopenet.py", line 166, in <module>
        alpha = args.alpha
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/modules/loss.py", line 482, in forward
        self.ignore_index)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/functional.py", line 746, in cross_entropy
        return nll_loss(log_softmax(input), target, weight, size_average, ignore_index)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/functional.py", line 672, in nll_loss
        return _functions.thnn.NLLLoss.apply(input, target, weight, size_average, ignore_index)
      File "/home/foo/Ordnance/anaconda2/envs/Hopenet/lib/python2.7/site-packages/torch/nn/_functions/thnn/auto.py", line 47, in forward
        output, *ctx.additional_args)
    RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THCUNN/generic/ClassNLLCriterion.cu:87
    

    I did some searching, and the most promising answer is at the following link: https://discuss.pytorch.org/t/runtimeerror-cuda-runtime-error-59-device-side-assert-triggered-at-opt-conda-conda-bld-pytorch-1503970438496-work-torch-lib-thc-generic-thcstorage-c-32/9669/5

    It seems like in some cases your output is out of the bounds of the target. The following is my running environment:

    Python 2.7.14 (with Anaconda), using a conda virtual environment
    pytorch 0.2.0 py27hc03bea1_4cu80 [cuda80] soumith
    torchvision 0.1.9 py27hdb88a65_1 soumith

    I would like to know if you have met this kind of problem before and if you can give me some ideas about how to solve it. Thank you very much for your help!

    opened by developer-mayuan 14
  • torch.autograd.backward(loss_seq, grad_seq)      RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]

    Hi, when I run train_hopenet.py, I get this error: "RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]". Can you tell me how to solve it? Thanks very much!

    opened by gnnbest 12
  • Different results from the paper (AFW dataset)

    Hi, thank you for your great work. I checked the performance of the pretrained models (300W-LP, alpha 1 and 300W-LP, alpha 2), and the results on AFLW2000 are the same as in the paper. But when I test the models on the AFW dataset, the results are very different. I wrote my own code to compute the discrete predictions, which rounds to the nearest 15 degrees. The yaw accuracy is only 21.15%, and the mean absolute error is 53.3674. So even if I made some mistake calculating the discrete predictions, the MAE of yaw seems too large. I am wondering which step I am missing to reproduce the result in the paper. (I have made sure that the input format is the same as the one required in datasets.py.)

    opened by invigen 12
  • FaceDetection+CustomTraining+Validation +Performance validation

    @natanielruiz Hi, thanks for the awesome work and for sharing it. I just had a few queries:

    1. For face detection, should we always use the FRCNN, or is it possible to use some other detection technique?
    2. How can we validate the output parameters, such as the pitch, roll and yaw values obtained from the algorithm?
    3. Are the pitch, roll and yaw values generated based on the center of the face detection?
    4. Can we use your algorithm to train on a custom dataset, or can the shared model be used on a generic dataset?
    5. For custom training, can you provide some reference steps to achieve this?
    6. I ran your code on the CPU and it is very slow. Is there any plan to support running it on the CPU in the future?

    Thanks for the awesome work and sharing

    opened by abhigoku10 9
  • The performance of the pretrained model you provided is somewhat different

    Hi, I tested the pretrained model (300W-LP, alpha 2) you provided on AFLW2000 and found that the performance is somewhat different from what is shown in the paper. Is this model the best of your algorithm?

    image

    opened by tfygg 9
  • test on AFLW2000 error is so large

    Hi, firstly, many thanks for your great work! I have some questions about it:

    (1) I coded and trained a TensorFlow model according to this work. When I test on the training dataset (300W-LP), the error is yaw=2.1, pitch=1.9, roll=1.7, but when I test the model on the test dataset (AFLW2000), the error is very large: yaw=35.3, pitch=11.4, roll=12.5. I can't find where the problem is.
    (2) In the AFLW2000 dataset, some face points have a coordinate of -1. How do you crop the face ROI used to test head pose?
    (3) In 300W-LP, the large-pose faces have erroneous face points. How do you process this dataset?

    Many thanks!

    opened by flyduck 8
  • why combined loss is better?

    I have a question that I can't find the answer to in the article. In my opinion, the answer may be that the combined loss is better suited to training models. Many comments point out that a regression loss is hard to train with, while a classification loss is easier. What do you think?

    opened by xubaoquan33 8
  • I made an error when I was executing Python

    File "code/test_on_video_dlib.py", line 7, in <module>
      import torch
    File "/usr/local/lib/python2.7/dist-packages/torch/__init__.py", line 53, in <module>
      from torch._C import *
    ImportError: dlopen: cannot load any more object with static TLS

    opened by ytgcljj 6
  • Format, PEPify and improve Python 2/3 compatibility

    This patch does not introduce any functionality changes. Instead, it focuses on PEP compliance (mostly fixing line width violations and unused imports) and Python 2/3 compatibility (mostly wrapping print arguments in parentheses and using Python 3's range).

    opened by kirillbobyrev 5
  • SegmentationFault in GPU

    @natanielruiz Hi, I ran your code on the GPU but I get: Ready to test network. 1 Segmentation fault (core dumped)

    I am using the following command: "python code/test_on_video_dlib.py --snapshot "/home/teai/abhilashsk/hpe/snapshot/hopenet_alpha1.pkl" --face_model "/home/teai/abhilashsk/hpe/model/mmod_human_face_detector.dat" --video "/home/teai/abhilashsk/hpe/Data/1-FemaleNoGlasses-Normal.mp4" --output_string "output" --n_frames 300 --fps 20". It gives the error above, and the code is not fully using the GPU. I think some code changes are needed for it to work on the GPU. Can you please let me know what has to be done?

    opened by mayanks888 5
  • How to convert trained models  into inference-models ?

    Hi

    I trained the model and then saved it as shown in the figure (right side), and I would like to convert it as shown in the figure (left side).

    To see the figure here

    Thank you in advance!

    opened by Algabri 0
  • RuntimeError: Mismatch in shape

    I am trying to run train_hopenet.py

    python3 train_hopenet.py --dataset AFLW2000 --data_dir datasets/AFLW2000 --filename_list datasets/AFLW2000/files.txt --output_string er

    I got this error:

    Loading data.

    /home/redhwan/.local/lib/python3.8/site-packages/torch/optim/adam.py:90: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
      super(Adam, self).__init__(params, defaults)
    Ready to train network.
    Traceback (most recent call last):
      File "train_hopenet.py", line 193, in <module>
        torch.autograd.backward(loss_seq, grad_seq)
      File "/home/redhwan/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 166, in backward
        grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
      File "/home/redhwan/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 50, in _make_grads
        raise RuntimeError("Mismatch in shape: grad_output["
    RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([1]) and output[0] has a shape of torch.Size([]).
    
    
    

    How can I solve it?

    Note: torch.__version__ = 1.12.0+cu102

    opened by Algabri 1
  • Datasets Preprocessing and Data Loading from RAW Websites Downloaded Data

    Dear authors of the code and GitHub community, can you please let me know the preprocessing steps needed to make the data readable by the data loaders in this repository? If you have some sample code, I would highly appreciate it.

    The datasets that I want to preprocess include AFLW, BIWI, and Pointing04.

    opened by tanveer-hussain 0
  • i have created a colab file to test it out quickly

    Hi,

    Please consider adding this to the repo if you find it relevant. Thanks! It uses mtcnn for face detection.

    https://colab.research.google.com/drive/1vvntbLyVxxBHoVN0e6-pfs7gB3pp-VUS?usp=sharing

    Thanks!

    opened by maylad31 0
  • AFLW labels are broken?

    ERROR: image I am completely confused. It seems like my labels are either -1 or bigger than num_classes, but I don't know why. As far as I can tell the label files are correct, and it is only when the labels are calculated (image) that things break.

    I would be pleased to get some help. Thank you.

    opened by ilyii 1
  • softmax

    code/test_on_video_dlib.py:146: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      yaw_predicted = F.softmax(yaw)
    code/test_on_video_dlib.py:147: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      pitch_predicted = F.softmax(pitch)
    code/test_on_video_dlib.py:148: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      roll_predicted = F.softmax(roll)

    Should dim be 0, 1, or 2?

    opened by song6cy 0
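
Two of the reports above (the `t >= 0 && t < n_classes` assertion and the AFLW labels that come out as -1 or larger than num_classes) point at class labels falling outside the valid range. The following is a minimal sketch for spotting such samples before training. It assumes labels are obtained by binning angles in degrees into 3-degree bins with edges from -99 to 99, which is an assumption about the training code rather than something this page states; `bin_label` and `find_bad_labels` are hypothetical helper names.

```python
from bisect import bisect_right

# Assumed bin edges (not stated on this page): -99, -96, ..., 96, 99.
BIN_EDGES = list(range(-99, 102, 3))
NUM_CLASSES = len(BIN_EDGES) - 1  # 66 bins between 67 edges


def bin_label(angle_degrees):
    """Index of the bin whose half-open interval [edge, next_edge) holds the angle."""
    return bisect_right(BIN_EDGES, angle_degrees) - 1


def find_bad_labels(angles):
    """Return (index, angle, label) for every angle whose label is not a valid class."""
    bad = []
    for index, angle in enumerate(angles):
        label = bin_label(angle)
        if not 0 <= label < NUM_CLASSES:
            bad.append((index, angle, label))
    return bad


if __name__ == "__main__":
    yaw_degrees = [0.0, -120.5, 98.9, 99.0]
    for index, angle, label in find_bad_labels(yaw_degrees):
        print(f"sample {index}: angle {angle} -> label {label} (out of range)")
```

Samples flagged this way can be filtered out or inspected before they reach a classification loss, which rejects any label outside `0 .. NUM_CLASSES - 1`.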
Owner
Nataniel Ruiz
PhD candidate at Boston University doing Computer Vision and ML. M.S. from Georgia Tech, BA/M.S. from Ecole Polytechnique
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

Tadas Baltrusaitis 5.8k Dec 31, 2022
Official Pytorch implementation of 6DRepNet: 6D Rotation representation for unconstrained head pose estimation.

6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch) Paper Thorsten Hempel and Ahmed A. Abdelrahman and Ayoub Al-Hamadi, "6D Ro

Thorsten Hempel 284 Dec 23, 2022
Human head pose estimation using Keras over TensorFlow.

RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild.

Rafael Berral Soler 71 Jan 5, 2023
WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose

WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose Yijun Zhou and James Gregson - BMVC2020 Abstract: We present an end-to-end head-pos

null 368 Dec 26, 2022
Code for BMVC2021 "MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation"

MOS-Multi-Task-Face-Detect Introduction This repo is the official implementation of "MOS: A Low Latency and Lightweight Framework for Face Detection,

null 104 Dec 8, 2022
Python scripts for performing 3D human pose estimation using the Mobile Human Pose model in ONNX.

Python scripts for performing 3D human pose estimation using the Mobile Human Pose model in ONNX.

Ibai Gorordo 99 Dec 31, 2022
SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

SE3 Pose Interpolation Pose estimated from SLAM system are always discrete, and

Ran Cheng 4 Dec 15, 2022
Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation

SimplePose Code and pre-trained models for our paper, “Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation”, a

Jia Li 256 Dec 24, 2022
Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Code repository for the paper: PoseAug: A Differentiable Pose Augme

Pyjcsx 328 Dec 17, 2022
This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

SO-Pose This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation This paper is basically an

shangbuhuan 52 Nov 25, 2022
Re-implementation of the Noise Contrastive Estimation algorithm for pyTorch, following "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models." (Gutmann and Hyvarinen, AISTATS 2010)

Noise Contrastive Estimation for pyTorch Overview This repository contains a re-implementation of the Noise Contrastive Estimation algorithm, implemen

Denis Emelin 42 Nov 24, 2022
[ECCV 2020] Reimplementation of 3DDFAv2, including face mesh, head pose, landmarks, and more.

Stable Head Pose Estimation and Landmark Regression via 3D Dense Face Reconstruction Reimplementation of (ECCV 2020) Towards Fast, Accurate and Stable

Remilia Scarlet 221 Dec 30, 2022
The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019) News [2020/07/05] A very nice blog from Towards Data Science introd

Leo Xiao 3.9k Jan 5, 2023
Deep High-Resolution Representation Learning for Human Pose Estimation

Deep High-Resolution Representation Learning for Human Pose Estimation (accepted to CVPR2019) News If you are interested in internship or research pos

HRNet 167 Dec 27, 2022
Deep Dual Consecutive Network for Human Pose Estimation (CVPR2021)

Deep Dual Consecutive Network for Human Pose Estimation (CVPR2021) Introduction This is the official code of Deep Dual Consecutive Network for Human P

null 295 Dec 29, 2022
PyTorch Implementation of Realtime Multi-Person Pose Estimation project.

PyTorch Realtime Multi-Person Pose Estimation This is a pytorch version of Realtime_Multi-Person_Pose_Estimation, origin code is here Realtime_Multi-P

Dave Fang 157 Nov 12, 2022
PyTorch implementation for 3D human pose estimation

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach This repository is the PyTorch implementation for the network presented in:

Xingyi Zhou 579 Dec 22, 2022
A PyTorch toolkit for 2D Human Pose Estimation.

PyTorch-Pose PyTorch-Pose is a PyTorch implementation of the general pipeline for 2D single human pose estimation. The aim is to provide the interface

Wei Yang 1.1k Dec 30, 2022
This repository is the official Pytorch implementation of ContextPose: Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021).

Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021) Introduction This repository is the official Pytorch implementation of

null 37 Nov 21, 2022