Keras implementation of PersonLab for Multi-Person Pose Estimation and Instance Segmentation.

Overview

PersonLab

This is a Keras implementation of PersonLab for Multi-Person Pose Estimation and Instance Segmentation. The model predicts keypoint heatmaps and several offset fields, from which joint locations, joint connections, and per-pixel instance IDs are computed. See the paper for more details.
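
As a rough illustration of how a heatmap and an offset field combine into a joint location, the sketch below takes the highest-scoring pixel of one keypoint heatmap and shifts it by the offset predicted at that pixel. The function name, the (dx, dy) channel order, and the single-peak decoding are assumptions made for illustration; the repository's own post-processing in post_proc.py is more involved.

```python
import numpy as np

def refine_keypoint(heatmap, short_offsets):
    """Return (x, y, score) for the strongest peak of one keypoint channel.

    heatmap:       (H, W) array of keypoint scores.
    short_offsets: (H, W, 2) array, assumed here to store (dx, dy) per pixel.
    """
    # Locate the highest-scoring pixel (row = y, column = x).
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Shift the integer pixel position by the offset predicted there.
    dx, dy = short_offsets[y, x]
    return x + dx, y + dy, heatmap[y, x]

# Toy example: one peak at column 3, row 2 with a sub-pixel offset.
hm = np.zeros((5, 5))
hm[2, 3] = 0.9
off = np.zeros((5, 5, 2))
off[2, 3] = (0.25, -0.5)
print(refine_keypoint(hm, off))
```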

Training a model

To use ResNet-101 as the base network, first download the ImageNet initialization weights from here and copy them to your ~/.keras/models/ directory. (Files over 100 MB cannot be hosted on GitHub.)

Construct the dataset in the required format by running the generate_hdf5.py script. Before running it, set the ANNO_FILE and IMG_DIR constants at the top of the script to the paths of the COCO person_keypoints annotation file and the image folder, respectively.
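
A minimal sketch of what the two constants might look like, with a quick existence check before starting the conversion. The paths are hypothetical placeholders and the missing_paths helper is not part of the repository; only the names ANNO_FILE and IMG_DIR come from the script.

```python
import os

# Hypothetical locations; substitute the paths on your machine.
ANNO_FILE = os.path.expanduser('~/coco/annotations/person_keypoints_train2017.json')
IMG_DIR = os.path.expanduser('~/coco/train2017')

def missing_paths(anno_file, img_dir):
    """Return the configured paths that do not exist on disk."""
    missing = []
    if not os.path.isfile(anno_file):
        missing.append(anno_file)
    if not os.path.isdir(img_dir):
        missing.append(img_dir)
    return missing

# Report problems before launching the (potentially long) conversion.
for path in missing_paths(ANNO_FILE, IMG_DIR):
    print('Not found: {}'.format(path))
```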

Edit config.py to set training options such as the input resolution, the number of GPUs, and whether to freeze the batchnorm weights. More advanced options require altering the train.py script. For example, the base network can be changed by passing an argument to the get_personlab() function; see the documentation there.

After everything is configured to your liking, run the train.py script.

Testing a model

See demo.ipynb for sample inference and visualizations.

Technical Debts

Several parts of this codebase are borrowed from others. These include:

  • The Resnet-101 in Keras

  • The augmentation code (which differs from the procedure in the PersonLab paper) and the data iterator code are heavily borrowed from this fork of the Keras implementation of CMU's "Realtime Multi-Person Pose Estimation". (The pose plotting function is also influenced by the one in that repo.)

  • The Polyak Averaging callback is just a lightly modified version of the EMA callback from here

Environment

This code was tested in the following environment and with the following software versions:

  • Ubuntu 16.04
  • CUDA 8.0 with cuDNN 6.0
  • Python 2.7
  • TensorFlow 1.7
  • Keras 2.1.3
  • OpenCV 2.4.9

Comments

  • bilinear sampler

    Grateful for your implementation! I have some questions about the bilinear sampler.

    https://github.com/octiapp/KerasPersonLab/blob/32d44dd1f33377128a87d6e074cf8214224f0174/model.py#L49 Why is it base = base + bilinear_sampler(offsets, base) instead of base = base + bilinear_sampler(base, offsets)? Does this mean that we interpolate the offsets according to base?

    https://github.com/octiapp/KerasPersonLab/blob/32d44dd1f33377128a87d6e074cf8214224f0174/bilinear.py#L43 If iy0 = vy0 + h, then iy0 represents the destination position of the offsets. Why are these positions used in iy0 = tf.where(mask, tf.zeros_like(iy0), iy0)?

    In that case,

    x00 = tf.gather_nd(x, i00)
    x01 = tf.gather_nd(x, i01)
    x10 = tf.gather_nd(x, i10)
    x11 = tf.gather_nd(x, i11)
    

    will gather values from the destination positions of the predicted offsets. Shouldn't we gather values from the start positions of the offsets?

    opened by yangsenius 2
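
For readers following the bilinear sampler question above, here is a small NumPy sketch of what a sampler of this kind typically computes. It assumes a signature bilinear_sample(x, v) in which x is the field being read and v holds per-pixel displacements stored as (dx, dy); both the name and the argument order are assumptions and may not match bilinear.py. Under that assumption, sampler(offsets, base) reads the offset field at the positions that base points to, which is why values are gathered at destination positions.

```python
import numpy as np

def bilinear_sample(x, v):
    """Sample x at positions displaced by v.

    x: (H, W) array of values to read from.
    v: (H, W, 2) array of displacements, assumed to be stored as (dx, dy).
    Returns out with out[i, j] ~= x[i + dy, j + dx], clamped to the border.
    """
    H, W = x.shape
    # jj holds column indices, ii holds row indices, both of shape (H, W).
    jj, ii = np.meshgrid(np.arange(W), np.arange(H))
    # Destination positions, clamped so they stay inside the image.
    px = np.clip(jj + v[..., 0], 0, W - 1)
    py = np.clip(ii + v[..., 1], 0, H - 1)
    x0 = np.floor(px).astype(int)
    y0 = np.floor(py).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    # Fractional parts act as the interpolation weights.
    wx = px - x0
    wy = py - y0
    return (x[y0, x0] * (1 - wx) * (1 - wy) + x[y0, x1] * wx * (1 - wy)
            + x[y1, x0] * (1 - wx) * wy + x[y1, x1] * wx * wy)

field = np.arange(12.0).reshape(3, 4)
shift = np.zeros((3, 4, 2))
shift[..., 0] = 0.5   # read half a pixel to the right of every position
print(bilinear_sample(field, shift))
```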
  • Error when training!

    Hi all, an error occurs when training from scratch.


    KerasPersonLab-updated_model_def/data_prep.py", line 16, in get_ground_truth

    assert(instance_masks.shape[-1] == len(all_keypoints)), '{} != {}'.format(instance_masks.shape[-1], len(all_keypoints))
    

    AssertionError: 7 != 8

    opened by Minotaur-CN 2
  • Assertion Error

    10993/16028 [===================>..........] - ETA: 59:31 - loss: 152848912.0000
    Traceback (most recent call last):
      File "train.py", line 77, in <module>
        epochs=config.NUM_EPOCHS, callbacks=callbacks)
      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/keras/engine/training.py", line 1037, in fit
        validation_steps=validation_steps)
      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 154, in fit_loop
        outs = f(ins)
      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2666, in __call__
        return self._call(inputs)
      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2636, in _call
        fetched = self._callable_fn(*array_vals)
      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1382, in __call__
        run_metadata_ptr)
      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
        c_api.TF_GetCode(self.status.status))
    tensorflow.python.framework.errors_impl.UnknownError: AssertionError: Traceback (most recent call last):

      File "/root/anaconda3/envs/deep_source/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 206, in __call__
        ret = func(*args)

      File "/root/networks/KerasPersonLab/tf_data_generator.py", line 71, in transform_data
        kp_maps, short_offsets, mid_offsets, long_offsets = get_ground_truth(instance_masks, kp)

      File "/root/networks/KerasPersonLab/data_prep.py", line 16, in get_ground_truth
        assert(instance_masks.shape[-1] == len(all_keypoints))

    AssertionError

    The error pretty much explains itself, but I am not sure why the shapes are mismatched.

    opened by noumanriazkhan 2
  • Crash on a call to cv2.warpAffine()

    Hi!

    First, thank you for your hard work!

    I am trying to run training and some time after the first epoch begins, I am running into an issue where the following call (in transformer.py) fails:

    masks = cv2.warpAffine(masks, M, cv_shape, flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_CONSTANT, borderValue=0)

    Specifically, there is an OpenCV assertion error, stating that "error: (-215) _src.channels() <= 4 || (interpolation != INTER_LANCZOS4 && interpolation != INTER_CUBIC) in function cv::warpAffine". Could you please point me at what might be wrong here?

    Thanks a lot!

    N.B. Looking at the code, at first glance it would seem that several Mats are passed into warpAffine(). Is this intentional?

    opened by PavelPr 2
  • About dilation convolution

    It seems that the original PersonLab paper does not use dilated convolution, judging by the classical ResNet. Did I miss or overlook something important in the paper?

    opened by MoonBunnyZZZ 1
  • If some keypoints of a single body are not detected, will multiple skeletons be generated for a single person?

    https://github.com/octiapp/KerasPersonLab/blob/32d44dd1f33377128a87d6e074cf8214224f0174/post_proc.py#L97

    From my understanding,

    this_skel[kp['id'], :2] = kp['xy']
    this_skel[kp['id'], 2] = kp['conf']
    path = iterative_bfs(skeleton_graph, kp['id'])[1:]
    for edge in path:
          if this_skel[edge[0],2] == 0:
               continue
    

    the order of the path is uniquely determined by the start keypoint because of the tree structure of the person skeleton. So, if some keypoints in the path are not detected, multiple skeletons will be generated for a single body because of this snippet:

    if this_skel[edge[0],2] == 0:
               continue
    

    For example, assume that the start keypoint (the one with the highest score) is Lwrist and that all keypoints except L_elbow are detected for a single person. Then the for edge in path: loop will always hit continue, and this_skel will contain only the Lwrist keypoint location.

    Am I understanding this correctly?

    opened by yangsenius 1
  • TypeError: 'KeysView' object does not support indexing

    When I use Python 3, I get the following error: TypeError: 'KeysView' object does not support indexing. So with Python 3, line 87 of the data_gen() function in tf_data_generator.py should be changed from key = root.keys() to key = list(root.keys()), after which key can be indexed. I think the README should state the environment details so that other people can avoid such mistakes.

    opened by miibotree 1
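
The Python 3 behavior described in the issue above can be reproduced without an HDF5 file. The sketch below uses a plain dict as a stand-in for the opened HDF5 group (the dict and its keys are made up for illustration) and shows that a keys view cannot be indexed until it is converted to a list.

```python
from collections.abc import KeysView

root = {'img_0': 1, 'img_1': 2}   # stand-in for the opened HDF5 group
keys = KeysView(root)             # a view object, like the result of root.keys()

try:
    keys[0]                       # views support iteration but not indexing
except TypeError as err:
    print('TypeError:', err)

key_list = list(keys)             # materialize the view so it can be indexed
print(key_list[0])
```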
  • Error During Training Fit

    When running the following code:

    parallel_model.fit(steps_per_epoch=64115//batch_size, 
                       epochs=config.NUM_EPOCHS, 
                       callbacks=callbacks)
    

    I get this error:

    WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use tf.cast instead.
    Epoch 1/125
    ---------------------------------------------------------------------------
    UnknownError                              Traceback (most recent call last)
    <ipython-input-11-a6811e3578ae> in <module>
          1 parallel_model.fit(steps_per_epoch=64115//batch_size, 
          2                    epochs=config.NUM_EPOCHS,
    ----> 3                    callbacks=callbacks)
    
    /usr/local/lib/python3.7/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
       1037                                         initial_epoch=initial_epoch,
       1038                                         steps_per_epoch=steps_per_epoch,
    -> 1039                                         validation_steps=validation_steps)
       1040 
       1041     def evaluate(self, x=None, y=None,
    
    /usr/local/lib/python3.7/site-packages/keras/engine/training_arrays.py in fit_loop(model, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps)
        152                 batch_logs['size'] = 1
        153                 callbacks.on_batch_begin(step_index, batch_logs)
    --> 154                 outs = f(ins)
        155 
        156                 outs = to_list(outs)
    
    /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
       2713                 return self._legacy_call(inputs)
       2714 
    -> 2715             return self._call(inputs)
       2716         else:
       2717             if py_any(is_tensor(x) for x in inputs):
    
    /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
       2673             fetched = self._callable_fn(*array_vals, run_metadata=self.run_metadata)
       2674         else:
    -> 2675             fetched = self._callable_fn(*array_vals)
       2676         return fetched[:len(self.outputs)]
       2677 
    
    /usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
       1437           ret = tf_session.TF_SessionRunCallable(
       1438               self._session._session, self._handle, args, status,
    -> 1439               run_metadata_ptr)
       1440         if run_metadata:
       1441           proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
    
    /usr/local/lib/python3.7/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
        526             None, None,
        527             compat.as_text(c_api.TF_Message(self.status.status)),
    --> 528             c_api.TF_GetCode(self.status.status))
        529     # Delete the underlying status object from memory otherwise it stays alive
        530     # as there is a reference to status from this from the traceback due to
    
    UnknownError: OSError: Unable to open file (unable to lock file, errno = 35, error message = 'Resource temporarily unavailable')
    Traceback (most recent call last):
    
      File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/script_ops.py", line 207, in __call__
        ret = func(*args)
    
      File "/usr/local/lib/python3.7/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 449, in generator_py_func
        values = next(generator_state.get_iterator(iterator_id))
    
      File "/Users/ghafran/git/KerasPersonLab/tf_data_generator.py", line 85, in data_gen
        h5 = h5py.File(config.H5_DATASET, 'r')
    
      File "/usr/local/lib/python3.7/site-packages/h5py/_hl/files.py", line 394, in __init__
        swmr=swmr)
    
      File "/usr/local/lib/python3.7/site-packages/h5py/_hl/files.py", line 170, in make_fid
        fid = h5f.open(name, flags, fapl=fapl)
    
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
    
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
    
      File "h5py/h5f.pyx", line 85, in h5py.h5f.open
    
    OSError: Unable to open file (unable to lock file, errno = 35, error message = 'Resource temporarily unavailable')
    
    
    	 [[{{node PyFunc}}]]
    	 [[{{node IteratorGetNext}}]]
    
    opened by ghafran 0
  • bilinear interpolation in accumulate_votes

    Hi, thanks a lot for the implementation. There's something that I couldn't figure out after reading the paper and reviewing the code. During the computation of the heatmaps, the accumulate_votes function is called and some sort of bilinear interpolation is performed. Could someone please clarify which function is being interpolated (in the code, it seems to me that the values of this function are stored in the variable ps), where the interpolation weights (dx, dy) come from, and what they mean? I'm referring to the following part of the code in post_proc.py:

    def accumulate_votes(votes, shape):
        xs = votes[:,0]
        ys = votes[:,1]
        ps = votes[:,2]
        tl = [np.floor(ys).astype('int32'), np.floor(xs).astype('int32')]
        tr = [np.floor(ys).astype('int32'), np.ceil(xs).astype('int32')]
        bl = [np.ceil(ys).astype('int32'), np.floor(xs).astype('int32')]
        br = [np.ceil(ys).astype('int32'), np.ceil(xs).astype('int32')]
        dx = xs - tl[1]
        dy = ys - tl[0]
        tl_vals = ps*(1.-dx)*(1.-dy)
        tr_vals = ps*dx*(1.-dy)
        bl_vals = ps*dy*(1.-dx)
        br_vals = ps*dy*dx

    opened by AlonMendelson 0
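
One way to read the accumulate_votes snippet quoted in the issue above: each vote lands at a fractional position (x, y) with weight p, and that weight is split among the four surrounding pixels. The fractional parts dx and dy determine the shares, so pixels closer to the vote receive more of it. The sketch below is a self-contained illustration of that reading. The accumulation with np.add.at and the bounds check are assumptions, since the quoted snippet stops before that step.

```python
import numpy as np

def accumulate_votes_sketch(votes, shape):
    """Splat weighted votes onto a grid using bilinear weights.

    votes: (N, 3) array with rows (x, y, p); shape: (H, W) of the output map.
    """
    xs, ys, ps = votes[:, 0], votes[:, 1], votes[:, 2]
    x0 = np.floor(xs).astype('int32')
    y0 = np.floor(ys).astype('int32')
    x1 = np.ceil(xs).astype('int32')
    y1 = np.ceil(ys).astype('int32')
    # Fractional distance of each vote from its top-left pixel.
    dx = xs - x0
    dy = ys - y0
    out = np.zeros(shape, dtype='float64')
    corners = [(y0, x0, ps * (1. - dx) * (1. - dy)),   # top-left
               (y0, x1, ps * dx * (1. - dy)),          # top-right
               (y1, x0, ps * dy * (1. - dx)),          # bottom-left
               (y1, x1, ps * dy * dx)]                 # bottom-right
    for yy, xx, vals in corners:
        # Drop contributions that fall outside the map (an assumption).
        ok = (yy >= 0) & (yy < shape[0]) & (xx >= 0) & (xx < shape[1])
        # np.add.at accumulates correctly when several votes hit one pixel.
        np.add.at(out, (yy[ok], xx[ok]), vals[ok])
    return out

demo = accumulate_votes_sketch(np.array([[1.25, 2.5, 1.0]]), (4, 4))
print(demo)
```

For a vote that falls inside the map, the four shares sum to p, so the total mass of the votes is preserved.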
  • Near constant loss

    Has anyone managed to get results with this repository?

    My loss is near constant, oscillating around 1.50. I should mention that I have removed the instance segmentation module, but that shouldn't impact the pose estimation module.

    opened by nscotto 5
  • About Inference Time of Large Image with Dozens Persons

    Thank you for reproducing PersonLab. I used it to train a model on COCO train2017 and got a reasonably good intermediate result. However, when I run the model on a 1920x1080 image with about 40 persons, inference is very slow (about 3 s for all detections and 13 s for group matching), even with GPUs. I found that this is mainly caused by the joint-grouping stage, especially the function compute_heatmaps() with inputs kp_maps and short_offsets, which refines the keypoint heatmaps. In theory, the inference time of a bottom-up method should not grow linearly with the number of persons. Is there a problem with the implementation of compute_heatmaps()?

    opened by hnuzhy 4
  • pretrained model

    Hi @jricheimer

    Thank you so much for your implementation. It would be great if we could get access to a pretrained model to compare its accuracy with the paper.

    Thanks,

    opened by nitba 1
  • build base and intermediate layer help

    I've encountered a couple of problems in my code that I was hoping you could help me fix:

    1. I used get_resnet50_base, and when I tried to load the weights this error popped up: ValueError: You are trying to load a weight file containing 111 layers into a model with 116 layers.
    2. If I use get_resnet50_base as my build base, which intermediate layer should I use?
    opened by nitzan1207 4
Owner
OCTI