Deep Learning for Human Part Discovery in Images - Chainer implementation

Overview

A Chainer implementation of human body part segmentation in images, following the network described in the paper Deep Learning for Human Part Discovery in Images.

NOTE: This is not an official implementation. The original paper is Deep Learning for Human Part Discovery in Images.

We are currently reproducing the experiments from the original paper. Contributions are welcome!

Requirements

  • Python 2.7.11+

    • Chainer 1.10+
    • numpy 1.9+
    • scipy 0.16+
    • six
    • matplotlib
    • tqdm
    • cv2 (opencv)

Preparation

Data

bash prepare.sh

This script downloads the PASCAL VOC 2010 dataset (http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar) and the authors' original dataset (http://www2.informatik.uni-freiburg.de/~oliveira/datasets/Sitting.tar.gz).

Model

You can download the pre-trained FCN model from here.

We will use the weights of this model and train a new model on the VOC dataset.
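
The downloaded .pkl file is a Python pickle of a dict that maps parameter names (e.g. 'conv1_1_W', 'conv1_1_b') to numpy arrays. Below is a minimal sketch of how such weights can be copied into a Chainer link; the link construction shown here is an illustration, not the loader actually used by train.py.

    import pickle

    import chainer.links as L

    # Load the pre-trained FCN weights: a dict of parameter name -> numpy array.
    # The file was pickled under Python 2; loading it under Python 3 may need
    # pickle.load(f, encoding='latin1').
    with open('fcn-8s-pascalcontext_W_and_b.pkl', 'rb') as f:
        params = pickle.load(f)

    # Copy one VGG layer's weights into a freshly created Chainer link.
    conv1_1 = L.Convolution2D(3, 64, ksize=3, stride=1, pad=1)
    conv1_1.W.data[...] = params['conv1_1_W']
    conv1_1.b.data[...] = params['conv1_1_b']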

Start training

python train.py -g 0 -b 3 -e 3000 -l on -s on

Possible options

python train.py --help

GPU memory requirement

Quoted from the original paper:

Each minibatch consists of just one image. The learning rate and momentum are fixed to 1e-10 and 0.99, respectively. We train the refinement layer by layer, which takes two days per refinement layer. Thus, the overall training starting from the pre-trained VGG network took 10 days on a single GPU.
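
In Chainer, these settings correspond to the MomentumSGD optimizer. A minimal sketch, assuming the HumanPartsNet chain from this repository's model.py; the actual defaults in train.py may differ:

    from chainer import optimizers

    from model import HumanPartsNet  # this repository's model.py

    model = HumanPartsNet(n_class=15)

    # Learning rate and momentum as quoted from the paper: 1e-10 and 0.99.
    optimizer = optimizers.MomentumSGD(lr=1e-10, momentum=0.99)
    optimizer.setup(model)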

The current maximum batch size is 3 on a GPU with 12 GB of memory.

It has also been confirmed that a MacBook Pro (Late 2016, 16 GiB memory) can run training with a batch size of 1.

Result

In preparation.

Visualize Prediction

python visualize.py -f PATH_TO_IMAGE_FILE

LICENSE

MIT License.

Author

shiba24, August 2016.

Contributors

Comments
  • Any tips to reduce training time

    • I am using a p2.xlarge machine on AWS. It has 12 GB of GPU memory.

    • For batch size > 1 I see high GPU utilization and GPU memory usage, so I end up with out-of-memory errors.

    • With batch size = 1, an epoch takes 4 hours, so 3000 epochs = 500 days :)

    • Is there a pre-trained model which I can use for training?

    • Is such high usage normal? I see that each image is 10 kB and the mask is 2.5 kB. I am using Chainer for the first time. Am I doing something wrong?

    opened by cricket1 7
  • fcn-8s-pascalcontext_W_and_b.pkl pre-trained model not accessible

    The URL for the pre-trained model fcn-8s-pascalcontext_W_and_b.pkl, 'https://googledrive.com/host/0BxSyYt1jT6LhUlhITjdicDFyNHM', is not accessible. I also checked https://github.com/shiba24/pretrained-model-collections.

    opened by cricket1 3
  • Comparison raises NotImplementedError in Chainer

    When using all the suggested parameters, but CPU instead of GPU, and the initial model in pkl format provided by you, I get this error:

    loading VGG model...
    ('training datasets: ', 9788, 'test datasets: ', 516)
    Epoch 1 : training...
    0it [00:00, ?it/s]Create Iterator settings

    Traceback (most recent call last):
      File "train.py", line 102, in <module>
        model, optimizer, train_mean_loss, train_ac = train(model, optimizer, MiniBatchLoader, train_mean_loss, train_ac)
      File "train.py", line 31, in train
        optimizer.update(model, x, t)
      File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 392, in update
        loss = lossfun(*args, **kwds)
      File "/home/tamme/Desktop/humanparts/model.py", line 86, in __call__
        self.accuracy = self.calculate_accuracy(h, t)
      File "/home/tamme/Desktop/humanparts/model.py", line 168, in calculate_accuracy
        mask = truths != -1
      File "/usr/local/lib/python2.7/dist-packages/chainer/variable.py", line 463, in __ne__
        raise NotImplementedError()
    NotImplementedError

    Is it a bug or have I done something incorrectly? I can't seem to find a related problem on Google either. (A hedged workaround sketch appears after this comments list.)

    opened by Tamme 3
  • On reproducing some FCN experiments

    I am now working to reproduce some experiments reported in the FCN paper: https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

    Compared to the layer-by-layer refinement training reported in the paper, the current model is trained end-to-end. It is interesting to see how much difference this makes. One thing I observed is that the bilinear-interpolation initialization of the deconvolution layers is essential; otherwise the model cannot be trained properly.

    I will also include some pretrained models if needed.

    opened by bobye 1
  • Important changes to the model

    I added some important changes to the initialization. The current model now trains successfully on a simplified setting (> 90% pixel accuracy at 20 epochs), namely binary segmentation of the human silhouette. I will try the four-part and fourteen-part segmentation later.

    opened by bobye 1
  • ValueError: axes don't match array

    • I am using a p2 instance on AWS. I ran python train.py -g 0 -b 5 -e 3000 -l on -s on

    And I am getting:

    loading VGG model...
    ('training datasets: ', 9788, 'test datasets: ', 516)
    Epoch 1 : training...
    0it [00:00, ?it/s]Create Iterator settings
    1it [00:04, 4.19s/it]
    Traceback (most recent call last):
      File "train.py", line 102, in <module>
        model, optimizer, train_mean_loss, train_ac = train(model, optimizer, MiniBatchLoader, train_mean_loss, train_ac)
      File "train.py", line 27, in train
        for X, y in tqdm(MiniBatchLoader):
      File "/usr/local/lib/python2.7/dist-packages/tqdm/_tqdm.py", line 816, in __iter__
        for obj in iterable:
      File "/home/ubuntu/deep-learning-for-human-part-discovery-in-images/data.py", line 83, in next
        minibatch_X, minibatch_y = self.process_batch(minibatch_X, minibatch_y)
      File "/home/ubuntu/deep-learning-for-human-part-discovery-in-images/data.py", line 145, in process_batch
        reshaped_X = np.transpose(self.standardize(processed_X), (0, 3, 1, 2))  # n_batch, n_channel, h, w
      File "/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 556, in transpose
        return transpose(axes)
    ValueError: axes don't match array

    • Update: I am not able to reproduce the issue. Everything seems to be working fine now.
    opened by cricket1 1
  • TypeError: __init__() got unexpected keyword argument(s) 'wscale' heeeeellllllppppppppppppppp

    loading VGG model...
    Traceback (most recent call last):
      File "train.py", line 81, in <module>
        model = HumanPartsNet(n_class=15)
      File "/home/leisther/Downloads/deep-learning-for-human-part-discovery-in-images-master/model.py", line 53, in __init__
        upsample_pool1=L.Convolution2D(64, self.n_class, ksize=1, stride=1, pad=0, wscale=0.01),
      File "/home/leisther/anaconda2/envs/venv/lib/python2.7/site-packages/chainer/links/connection/convolution_2d.py", line 120, in __init__
        deterministic="deterministic argument is not supported anymore. "
      File "/home/leisther/anaconda2/envs/venv/lib/python2.7/site-packages/chainer/utils/argument.py", line 19, in parse_kwargs
        raise TypeError(message)
    TypeError: __init__() got unexpected keyword argument(s) 'wscale'

    (See the note on the 'wscale' argument after this comments list.)

    opened by leisther 1
  • MiniBatchLoader

    Hello, I'm having an issue with the dataset preparation process. Here is the error:

    loading VGG model...
    training datasets: 9273
    test datasets: 1031
    scanning all images for human part labels ...
    100%|██████████| 10304/10304 [07:30<00:00, 22.87it/s]
    training datasets: 0
    test datasets: 0
    found 3785 images
    Epoch 1 : training...
    0it [00:00, ?it/s]
    Traceback (most recent call last):
      File "train.py", line 113, in <module>
        model, optimizer, train_mean_loss, train_ac, train_IoU = train(model, optimizer, MiniBatchLoader, train_mean_loss, train_ac, train_IoU)
      File "train.py", line 27, in train
        for X, y in tqdm(MiniBatchLoader):
      File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python36\lib\site-packages\tqdm\_tqdm.py", line 962, in __iter__
        for obj in iterable:
    TypeError: iter() returned non-iterator of type 'MiniBatchLoader'

    Could anyone help me please? It's URGENT. (A hedged note on this error appears after this comments list.)

    opened by IbrahimMuh96 0
  • Any hints about using the pre-trained pkl file?

    Hi, I am a newbie to ML and just followed the README to download the FCN model, without any idea how to use it. My goal is to use the pre-trained model and see how it works, because my GPU has around 6 GB of memory, which does not seem to be enough for training. I googled the pkl file and it seems to be something that works with Python's pickle module, so I tried:

    python
    ActivePython 2.7.10.12 (ActiveState Software Inc.) based on
    Python 2.7.10 (default, Aug 21 2015, 12:07:58) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pickle
    >>> loaded_model = pickle.load(open('fcn-8s-pascalcontext_W_and_b.pkl', 'rb'))

    and got the dictionary loaded_model, full of numbers, whose keys are:

    >>> loaded_model.keys()
    ['conv4_3_W', 'conv5_1_b', 'conv1_2_b', 'upsample_W', 'conv5_2_b', 'conv1_1_W', 'score-pool3_W', 'conv5_2_W', 'conv5_3_W', 'conv1_1_b', 'conv4_3_b', 'conv5_3_b', 'conv5_1_W', 'conv1_2_W', 'conv3_2_W', 'conv4_2_b', 'conv4_1_b', 'upscore2_W', 'conv3_3_W', 'conv2_1_b', 'conv3_1_b', 'conv2_2_W', 'fc6_b', 'score-pool4_W', 'fc7_b', 'score59_W', 'conv2_2_b', 'fc6_W', 'upsample-fused-16_W', 'score59_b', 'fc7_W', 'conv4_1_W', 'conv3_2_b', 'conv4_2_W', 'score-pool4_b', 'conv3_3_b', 'conv3_1_W', 'conv2_1_W', 'score-pool3_b']

    Then I got stuck, not knowing what to do next. Could anyone please tell me how to use the pre-trained model? Any hints would be appreciated.

    opened by adayoegi 6
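
On the "Comparison raises NotImplementedError" issue above: in the Chainer versions this repository targets, comparison operators on a chainer.Variable raise NotImplementedError, so an expression like truths != -1 only works on a plain array. A minimal workaround sketch, assuming truths may be either a Variable or a numpy array (labels_as_array is a hypothetical helper, not code from this repository):

    import numpy as np
    from chainer import Variable

    def labels_as_array(truths):
        # Comparisons such as `truths != -1` raise NotImplementedError on a
        # chainer.Variable, so compare on the underlying array instead.
        return truths.data if isinstance(truths, Variable) else truths

    truths = Variable(np.array([[0, 3, -1], [-1, 2, 1]], dtype=np.int32))
    mask = labels_as_array(truths) != -1   # boolean mask of valid pixels
    print(mask)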
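
On the 'wscale' TypeError above: the wscale argument exists in Chainer 1.x (which this README requires) but was removed in Chainer v2 and later. Either pin Chainer to a 1.x release or replace wscale with an explicit initializer. A sketch of the second option, using the upsample_pool1 link from the traceback; note that initializers.Normal(0.01) is a common substitute but is not numerically identical to the old wscale=0.01 scaling:

    import chainer.links as L
    from chainer import initializers

    n_class = 15  # as in HumanPartsNet(n_class=15) from the traceback above

    # Chainer 1.x style (raises TypeError on Chainer >= 2):
    #   L.Convolution2D(64, n_class, ksize=1, stride=1, pad=0, wscale=0.01)

    # Chainer >= 2 style: pass an initializer instead of wscale.
    upsample_pool1 = L.Convolution2D(64, n_class, ksize=1, stride=1, pad=0,
                                     initialW=initializers.Normal(0.01))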
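
On the MiniBatchLoader TypeError above: the traceback shows Python 3.6, while this repository targets Python 2.7. Python 2's iterator protocol uses a next() method, whereas Python 3 requires __next__(), so iter() rejects the loader under Python 3. Running under Python 2.7 avoids the error; alternatively, aliasing the method makes such a loader work on both versions. The class below is a stand-in for illustration, not the repository's actual MiniBatchLoader:

    class ToyBatchLoader(object):
        # Stand-in illustrating the Python 2 vs. 3 iterator protocol; the real
        # MiniBatchLoader in data.py yields (X, y) minibatches instead of ints.
        def __init__(self, n):
            self.n = n
            self.i = 0

        def __iter__(self):
            return self

        def next(self):          # Python 2 iterator protocol
            if self.i >= self.n:
                raise StopIteration
            self.i += 1
            return self.i

        __next__ = next          # alias so Python 3's for-loop also works

    for batch in ToyBatchLoader(3):
        print(batch)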