A PyTorch toolkit for 2D Human Pose Estimation.

Overview

PyTorch-Pose

PyTorch-Pose is a PyTorch implementation of the general pipeline for 2D single human pose estimation. It aims to provide interfaces for training, inference, and evaluation, together with dataloaders offering various data augmentation options for the most popular human pose databases (e.g., MPII Human Pose, LSP, and FLIC).

Some of the data preparation and augmentation code is borrowed from the Stacked Hourglass Network. Thanks to the original author.

Update: this repository is now compatible with PyTorch 0.4.1/1.0!

Features

  • Multi-thread data loading
  • Multi-GPU training
  • Logger
  • Training/testing results visualization

Installation

  1. PyTorch (>= 0.4.1): Please follow the official installation instructions for PyTorch. Note that the code was developed with Python 2 and has not been tested with Python 3 yet.

  2. Clone the repository with submodule

    git clone --recursive https://github.com/bearpaw/pytorch-pose.git
    
  3. Create a symbolic link to the images directory of the MPII dataset:

    ln -s PATH_TO_MPII_IMAGES_DIR data/mpii/images
    

    For training/testing on COCO, please refer to the COCO README.

  4. Download the annotation file:

Usage

Please refer to TRAINING.md for detailed training recipes!

Testing

You may download our pretrained models (e.g., the 2-stack hourglass model) for a quick start.

Run the following command in a terminal to evaluate the model on the MPII validation split (the train/val split is from Tompson et al., CVPR 2015).

CUDA_VISIBLE_DEVICES=0 python example/main.py --dataset mpii -a hg --stacks 2 --blocks 1 --checkpoint checkpoint/mpii/hg_s2_b1 --resume checkpoint/mpii/hg_s2_b1/model_best.pth.tar -e -d
  • -a specifies a network architecture
  • --resume will load the weights from a specific checkpoint
  • -e stands for evaluation only
  • -d will visualize the network output. It can also be used during training

The result will be saved in the folder specified by --checkpoint as a .mat file (preds_valid.mat) containing a 2958x16x2 matrix (images x joints x xy).
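
For a quick sanity check, the saved file can be loaded in Python. A minimal sketch, assuming the predictions are stored under the key 'preds' inside the .mat file:

    # Load and inspect the saved predictions.
    # The key name 'preds' is an assumption; check mat.keys() if it differs.
    import scipy.io

    mat = scipy.io.loadmat('checkpoint/mpii/hg_s2_b1/preds_valid.mat')
    preds = mat['preds']  # expected shape: (2958, 16, 2) = (images, joints, xy)
    print(preds.shape)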

Evaluate the PCKh@0.5 score

Evaluate with MATLAB

You may use the MATLAB script evaluation/eval_PCKh.m to evaluate your predictions. The evaluation code is ported from Tompson et al., CVPR 2015.

The PCKh@0.5 scores of models trained with this code are reported in the following table.

Model             Head   Shoulder  Elbow  Wrist  Hip    Knee   Ankle  Mean
hg_s2_b1 (last)   95.80  94.57     88.12  83.31  86.24  80.88  77.44  86.76
hg_s2_b1 (best)   95.87  94.68     88.27  83.64  86.29  81.20  77.70  86.95
hg_s8_b1 (last)   96.79  95.19     90.08  85.32  87.48  84.26  80.73  88.64
hg_s8_b1 (best)   96.79  95.28     90.27  85.56  87.57  84.30  81.06  88.78

The training/validation curves are visualized below.

[training/validation curves]

Evaluate with Python

You may also evaluate the predictions by running python evaluation/eval_PCKh.py. It produces exactly the same results as the MATLAB script. Thanks to @sssruhan1 for the contribution.
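
For reference, PCKh@0.5 counts a predicted joint as correct if its distance to the ground truth is at most 0.5 times the head segment length. A minimal NumPy sketch of the metric (an illustration, not the repository's exact implementation, which also handles invisible joints):

    import numpy as np

    def pckh(preds, gts, head_sizes, thr=0.5):
        """preds, gts: (N, 16, 2) joint coordinates; head_sizes: (N,) head segment lengths."""
        dists = np.linalg.norm(preds - gts, axis=2)   # (N, 16) per-joint distances
        correct = dists <= thr * head_sizes[:, None]  # within threshold of ground truth?
        return correct.mean(axis=0)                   # per-joint PCKh scores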

Training

Run the following command in a terminal to train an 8-stack hourglass network on the MPII human pose dataset.

CUDA_VISIBLE_DEVICES=0 python example/main.py --dataset mpii -a hg --stacks 8 --blocks 1 --checkpoint checkpoint/mpii/hg8 -j 4

Here,

  • CUDA_VISIBLE_DEVICES=0 identifies the GPU devices you want to use. For example, use CUDA_VISIBLE_DEVICES=0,1 if you want to use two GPUs with IDs 0 and 1 (see the example below).
  • -j specifies how many workers you want to use for data loading.
  • --checkpoint specifies where to save the models, the log, and the predictions.
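
For example, to resume an interrupted run on two GPUs (assuming a checkpoint.pth.tar has been written to the checkpoint directory):

CUDA_VISIBLE_DEVICES=0,1 python example/main.py --dataset mpii -a hg --stacks 8 --blocks 1 --checkpoint checkpoint/mpii/hg8 --resume checkpoint/mpii/hg8/checkpoint.pth.tar -j 4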

Miscellaneous

Supported datasets

Supported models

Contribute

Please create a pull request if you want to contribute.

Comments
  • Bug in crop method

    Hi, I was visualizing the heatmap inputs for the model, and I'm not sure what I'm doing wrong, but the crop method doesn't seem to work. This is the original crop method from this repo:

    # From pose/utils/imutils.py, with scipy.misc.imresize replaced by cv2.resize.
    # im_to_numpy, im_to_torch and transform are helper functions from this repo.
    import cv2
    import numpy as np
    import scipy.misc
    import torch

    def crop(img, center, scale, res, rot=0):
        img = im_to_numpy(img)

        # Preprocessing for efficient cropping
        ht, wd = img.shape[0], img.shape[1]
        sf = scale * 200.0 / res[0]
        if sf < 2:
            sf = 1
        else:
            new_size = int(np.math.floor(max(ht, wd) / sf))
            new_ht = int(np.math.floor(ht / sf))
            new_wd = int(np.math.floor(wd / sf))
            if new_size < 2:
                return torch.zeros(res[0], res[1], img.shape[2]) \
                            if len(img.shape) > 2 else torch.zeros(res[0], res[1])
            else:
                img = cv2.resize(img, (new_ht, new_wd))
                # img = scipy.misc.imresize(img, [new_ht, new_wd])
                center = center * 1.0 / sf
                scale = scale / sf

        # Upper left point
        ul = np.array(transform([0, 0], center, scale, res, invert=1))
        # Bottom right point
        br = np.array(transform(res, center, scale, res, invert=1))

        # Padding so that when rotated proper amount of context is included
        pad = int(np.linalg.norm(br - ul) / 2 - float(br[1] - ul[1]) / 2)
        if not rot == 0:
            ul -= pad
            br += pad

        new_shape = [br[1] - ul[1], br[0] - ul[0]]
        if len(img.shape) > 2:
            new_shape += [img.shape[2]]
        new_img = np.zeros(new_shape)

        # Range to fill new array
        new_x = max(0, -ul[0]), min(br[0], img.shape[1]) - ul[0]
        new_y = max(0, -ul[1]), min(br[1], img.shape[0]) - ul[1]
        # Range to sample from original image
        old_x = max(0, ul[0]), min(img.shape[1], br[0])
        old_y = max(0, ul[1]), min(img.shape[0], br[1])
        new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]]

        if not rot == 0:
            # Rotate, then remove the padding
            new_img = scipy.misc.imrotate(new_img, rot)
            new_img = new_img[pad:-pad, pad:-pad]

        # new_img = im_to_torch(scipy.misc.imresize(new_img, res))
        new_img = im_to_torch(cv2.resize(new_img, tuple(res)))
        return new_img

    I've replaced scipy.misc with cv2. For comparison, this is the crop method from https://github.com/princeton-vl/pytorch_stacked_hourglass/blob/master/utils/img.py:

    def crop_newell(img, center, scale, res, rot=0):
        img = im_to_numpy(img)
        # Upper left point
        ul = np.array(transform([0, 0], center, scale, res, invert=1))
        # Bottom right point
        br = np.array(transform(res, center, scale, res, invert=1))
    
        new_shape = [br[1] - ul[1], br[0] - ul[0]]
        if len(img.shape) > 2:
            print(img.shape)
            new_shape += [img.shape[2]]
        new_img = np.zeros(new_shape)
    
        # Range to fill new array
        new_x = max(0, -ul[0]), min(br[0], len(img[0])) - ul[0]
        new_y = max(0, -ul[1]), min(br[1], len(img)) - ul[1]
        # Range to sample from original image
        old_x = max(0, ul[0]), min(len(img[0]), br[0])
        old_y = max(0, ul[1]), min(len(img), br[1])
        new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]]
    
        new_img = im_to_torch(cv2.resize(new_img, tuple(res)))
        return new_img
    

    These are the results I get (left: crop_newell, right: crop): [three comparison images]

    As you can see, the crop method sometimes works well, and sometimes doesn't. It's usually the latter. What could be the issue? Am I doing something wrong? @bearpaw
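
    Update: one possible culprit I noticed (not confirmed): cv2.resize takes its output size as (width, height), while scipy.misc.imresize took [height, width], so the direct replacement may need its arguments swapped:

    # cv2.resize expects dsize as (width, height):
    img = cv2.resize(img, (new_wd, new_ht))          # instead of (new_ht, new_wd)
    new_img = cv2.resize(new_img, (res[1], res[0]))  # instead of tuple(res) when res is (h, w)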

    opened by pranavbudhwant 11
  • About function "transform" in transforms.py

    Hi,

    Thanks for your code. I have one question about some code in the "transform" function.

    new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.]).T
    new_pt = np.dot(t, new_pt)
    return new_pt[:2].astype(int) + 1

    According to the above code, you first subtract 1 from the coordinates and then add 1 after the transformation. I don't see the reason for doing this. There are two places calling this "transform" function. The first is in datasets/mpii.py:

    tpts[i, 0:2] = to_torch(transform(tpts[i, 0:2] + 1, c, s, [self.out_res, self.out_res], rot=r))
    target[i] = draw_labelmap(target[i], tpts[i] - 1, self.sigma, type=self.label_type)

    Here you add 1 before and subtract 1 after calling the "transform" function, which just offsets what you do inside it. In this case, we could remove the plus 1 and minus 1 for clarity.

    Second, function "final_preds" calls function "transform_preds" which then calls "transform" as follows:

    coords[p, 0:2] = to_torch(transform(coords[p, 0:2], center, scale, res, 1, 0))

    For this case, I also read the original Torch code: https://github.com/anewell/pose-hg-demo/blob/master/util.lua. It seems they don't add 1 before and subtract 1 afterwards. I think adding 1 before is not equivalent to subtracting 1 after the transformation. Could you please explain your reasoning?
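
    To illustrate, here is a toy example (my own, using a pure 2x scaling transform) showing the two conventions are not equivalent:

    import numpy as np

    t = np.diag([2.0, 2.0, 1.0])  # toy transform: pure 2x scaling

    def transform_with_offset(pt):
        new_pt = np.dot(t, np.array([pt[0] - 1, pt[1] - 1, 1.]))
        return new_pt[:2].astype(int) + 1

    def transform_plain(pt):
        new_pt = np.dot(t, np.array([pt[0], pt[1], 1.]))
        return new_pt[:2].astype(int)

    print(transform_with_offset([5, 5]))  # [9 9]
    print(transform_plain([5, 5]))        # [10 10]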

    Thanks,

    opened by zhiqiangdon 11
  • Does this code reproduce the results of 8 stacked hg in the original paper?

    Hi,

    Thanks for sharing your code! Does this code reproduce the results of the 8-stack HG in the original paper? If not, what are your results for the 8-stack HG, and what are possible reasons for the gap?

    Best,

    opened by zhiqiangdon 11
  • Can anyone reproduce the same training accuracy performance as claimed with PyTorch 0.4?

    I trained with the original code and dataset on two different machines, one with a 1060 GPU and another with two 1080Tis, but I never got an accuracy over 70%, and it grew pretty slowly (some people report 20% after 2 epochs, but mine is still well below 10%). Someone in another issue mentioned that he couldn't get good performance on PyTorch 0.4.0 either, so I wonder if anyone has. I really don't want to downgrade my PyTorch version, since I have been modifying the code to implement parts of a paper that don't work on older PyTorch versions.

    opened by gdjmck 10
  • (1) raise ValueError(reduction + " is not a valid value for reduction") (2) dists[c, n] = torch.dist(preds[n,c,:], target[n,c,:])/normalize[n] RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'other'

    There are two bugs in the latest repo, updated on Jan 8th, 2019. The first one is as the title describes. I think it is caused by the PyTorch version difference (0.4.1 vs 1.0). I solved this problem by passing size_average=True to torch.nn.MSELoss().

    The second bug is: dists[c, n] = torch.dist(preds[n,c,:], target[n,c,:])/normalize[n] raising RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'other'. It is caused by a tensor type mismatch. I solved it by moving the 'target' variable from CUDA to the CPU.
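
    For reference, these are roughly the changes I made (the exact lines may differ):

    # fix 1: construct the loss without the invalid `reduction` value
    criterion = torch.nn.MSELoss(size_average=True).cuda()

    # fix 2: move `target` to the CPU before computing the distance
    dists[c, n] = torch.dist(preds[n, c, :], target.cpu()[n, c, :]) / normalize[n]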

    opened by unclejokerjoker 7
  • Validation acc varies with different test batch sizes

    @bearpaw thank you for your wonderful work. I have trained my network. However, when I validated my best model with different test batch sizes, the validation acc gave different results, while I think the acc should be independent of the test batch size.

    test batch size    Val Acc
    1                  0.8743
    6                  0.8660
    16                 0.8685

    And what test batch size did you use for the published results?

    opened by Bob130 7
  • got stuck in training

    Epoch: 54 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000265s | Batch: 0.286s | Total: 0:20:19 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.7968
    Processing |################################| (493/493) Data: 0.000184s | Batch: 0.127s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 0.0035 | Acc:  0.8025
    
    Epoch: 55 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000226s | Batch: 0.283s | Total: 0:20:22 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.7994
    Processing |################################| (493/493) Data: 0.000168s | Batch: 0.128s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 0.0036 | Acc:  0.7947
    
    Epoch: 56 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000174s | Batch: 0.249s | Total: 0:20:24 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.7997
    Processing |################################| (493/493) Data: 0.000158s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0038 | Acc:  0.8001
    
    Epoch: 57 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000286s | Batch: 0.309s | Total: 0:20:31 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.8026
    Processing |################################| (493/493) Data: 0.000217s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0035 | Acc:  0.7993
    
    Epoch: 58 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000279s | Batch: 0.296s | Total: 0:20:16 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.8038
    Processing |################################| (493/493) Data: 0.000223s | Batch: 0.122s | Total: 0:01:00 | ETA: 0:00:01 | Loss: 0.0036 | Acc:  0.7977
    
    Epoch: 59 | LR: 0.00005000
    Processing |######                          | (789/3708) Data: 0.000346s | Batch: 0.328s | Total: 0:04:18 | ETA: 0:15:57 | Loss: 0.0033 | Acc:  0.8042
    

    It just stays here and doesn't move any more.

    opened by dongzhuoyao 7
  • Data loader is slow

    The data loader seems to be extremely slow for a few batches. After every few batches (like after 10 or 20), it takes a few seconds (up to 15 s) to load the data. I have tried increasing the number of data loader workers (via -j 12) and increasing the train batch size, but the issue persists. Is this expected? Is it because of the data transforms? The issue becomes severe when I run the code on more than one GPU. Most of the time, the GPUs remain idle, which increases the overall time taken for one epoch (which for me is 1 hr 20 min).

    My machine configurations are: 4x1080Ti, Intel Xeon E5-2640, and I am loading the data from an SSD.

    opened by adityaarun1 7
  • evaluation.py

    Hi, I ran the training code and hit the following error:

    Traceback (most recent call last):
      File "example/mpii.py", line 318, in <module>
        main(parser.parse_args())
      File "example/mpii.py", line 104, in main
        valid_loss, valid_acc, predictions = validate(val_loader, model, criterion, args.debug, args.flip)
      File "example/mpii.py", line 233, in validate
        acc = accuracy(score_map.cuda(), target, idx)
      File "~/pytorch-pose/pose/utils/evaluation.py", line 61, in accuracy
        acc[i+1] = dist_acc(dists[:, idxs[i]-1, :])
    IndexError: index 10 is out of range for dimension 1 (of size 6)
    

    Is this a bug?

    opened by weigq 7
  • ImportError: No module named pose

    Thanks for your code! Could you give me some advice?

    sun@sunwin:~$ cd /home/sun/pytorch-pose
    sun@sunwin:~/pytorch-pose$ ln -s PATH_TO_MPII_IMAGES_DIR data/mpii/images
    sun@sunwin:~/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg4 --checkpoint checkpoint/mpii/hg4 --resume checkpoint/mpii/hg4/model_best.pth.tar -e -d
    Traceback (most recent call last):
      File "example/mpii.py", line 14, in <module>
        from pose import Bar
    ImportError: No module named pose
    sun@sunwin:~/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg4 --checkpoint checkpoint/mpii/hg4 --resume checkpoint/mpii/hg4/model_best.pth.tar -e -d
    Traceback (most recent call last):
      File "example/mpii.py", line 14, in <module>
        from pose import Bar
    ImportError: No module named pose
    sun@sunwin:~/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg1 --checkpoint checkpoint/mpii/hg1 -j 4
    Traceback (most recent call last):
      File "example/mpii.py", line 14, in <module>
        from pose import Bar
    ImportError: No module named pose

    I really need help! Thank you!

    opened by zilesazoyi2 7
  • train acc is low

    [accuracy curves] Hello, this is really great work! But I have a question about the training accuracy: why is my train acc always lower than the val acc? Hope you can help me @bearpaw

    opened by syusukee 6
  • Question about calculating PCK in Python

    parser = argparse.ArgumentParser(description='MPII PCKh Evaluation')
    parser.add_argument('-r', '--result', default='checkpoint/mpii/hg_s2_b1/preds.mat', type=str,
                        metavar='PATH', help='path to result (default: checkpoint/mpii/hg_s2_b1/preds.mat)')
    args = parser.parse_args()

    I can't find anything about preds.mat. Hope to get some answers, thanks.

    opened by ssssleep 0
  • I cannot train!

    ==> creating model 'hg', stacks=1, blocks=1
    => no checkpoint found at './checkpoint/mpii/hg-s1-b1/checkpoint.pth.tar'
    Total params: 3.59M
    Traceback (most recent call last):
      File "./example/main.py", line 431, in <module>
        main(parser.parse_args())
      File "./example/main.py", line 116, in main
        train_dataset = datasets.__dict__[args.dataset](is_train=True, **vars(args))
      File "/home/ubuntu/zq/Projects/hourglass0122/pytorch-pose/example/../pose/datasets/mpii.py", line 138, in mpii
        return Mpii(**kwargs)
      File "/home/ubuntu/zq/Projects/hourglass0122/pytorch-pose/example/../pose/datasets/mpii.py", line 30, in __init__
        with open(self.jsonfile) as anno_file:
    IOError: [Errno 2] No such file or directory: ''

    opened by ZhouQiang19980220 1
  • Minor bug fix in _make_fc function

    The function _make_fc takes the parameters inplanes and outplanes. The conv layer inside _make_fc is followed by a bn layer. Because the input channels of bn must match the output channels of the conv layer, the parameter of bn should be changed from inplanes to outplanes. It did not cause any error because _make_fc is always used with the same value for inplanes and outplanes; even so, I think it should be fixed.
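
    A sketch of the fix (assuming the current layout of _make_fc in pose/models/hourglass.py; the exact code may differ slightly):

    import torch.nn as nn

    def _make_fc(self, inplanes, outplanes):
        bn = nn.BatchNorm2d(outplanes)  # was nn.BatchNorm2d(inplanes); must match the conv's output channels
        conv = nn.Conv2d(inplanes, outplanes, kernel_size=1, bias=True)
        return nn.Sequential(conv, bn, self.relu)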

    opened by jscsmk 0