ROMP: Monocular, One-stage, Regression of Multiple 3D People, ICCV21

Yu Sun

Last update: Jan 4, 2023

Related tags

Deep Learning pytorch bottom-up pose-estimation smpl multi-person 3d-mesh-recovery multi-person-3d-mesh-recovery

Overview

Monocular, One-stage, Regression of Multiple 3D People

ROMP, accepted by ICCV 2021, is a concise one-stage network for multi-person 3D mesh recovery from a single image.

Simple. Concise one-stage framework for simultaneous person detection and 3D body mesh recovery.
Fast. ROMP can achieve real-time inference on a 1070Ti GPU.
Strong. ROMP achieves superior performance on multiple challenging multi-person/occlusion benchmarks.
Easy to use. We provide a user-friendly testing API and webcam demos.

Contact: [email protected]. Feel free to contact me for related questions or discussions! arXiv paper.

Features
News
Getting Started
- Try on Google Colab
- Installation
Inference
Train
Evaluation
Bugs report
Citation
Contributor
Acknowledgement

Features

Running the examples on Google Colab.
Real-time online multi-person webcam demo for driving textured SMPL model. We also provide a wardrobe for changing clothes.
Batch processing images/videos via command line / jupyter notebook / calling ROMP as a python lib.
Exporting the captured single-person motion to FBX file for Blender/Unity usage.
Training and evaluation for re-implementing our results presented in paper.
Convenient API for 2D / 3D visualization, parsed datasets.

News

2021/12/2: Add optional renderers (pyrender or pytorch3D). Fix some bugs reported in issues.
✨ ✨ 2021/10/10: V1.1 released, including multi-person webcam, extracting , webcam temporal optimization, live blender character animation, interactive visualization. Let's try!
2021/9/13: Low FPS / args parsing bugs are fixed. Support calling as a python lib.
2021/9/10: Training code release. API optimization.
Old logs

Getting started

Try on Google Colab

It allows you to run the project in the cloud, free of charge. Let's give the prepared Google Colab demo a try.

Installation

Please refer to install.md for installation.

Inference

Currently, we support processing images, videos, and real-time webcam input.
Please refer to config_guide.md for configurations.
ROMP can be called as a Python library from Python code or a Jupyter notebook, or run from the command line or scripts. Please refer to the Google Colab demo for examples.

Processing images

To reproduce the demo results, please run

cd ROMP
# change the `inputs` in configs/image.yml to /path/to/your/image folder, then run 
sh scripts/image.sh
# or run the command like
python -m romp.predict.image --inputs=demo/images --output_dir=demo/image_results

Please refer to config_guide.md for saving the estimated mesh/Center maps/parameters dict.

For interactive visualization, please run

python -m romp.predict.image --inputs=demo/images --output_dir=demo/image_results --show_mesh_stand_on_image  --interactive_vis

Caution: To use show_mesh_stand_on_image and interactive_vis, you must run ROMP on a computer with a visual desktop to support the rendering. Most remote servers without a visual desktop are not supported. Please use save_visualization_on_img instead.
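As a rough sketch of the caution above, a script could pick the visualization flags depending on whether a desktop session is present. This assumes a Linux-style setup where a desktop exposes the `DISPLAY` or `WAYLAND_DISPLAY` environment variable; the helper name `pick_vis_flags` is hypothetical and not part of ROMP.

```python
import os

def pick_vis_flags(env=None):
    """Return ROMP visualization flags suited to the current machine.

    Assumption: a desktop session sets DISPLAY or WAYLAND_DISPLAY.
    This helper is illustrative and not part of ROMP.
    """
    env = os.environ if env is None else env
    has_desktop = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    if has_desktop:
        # Interactive rendering needs a visual desktop.
        return ["--show_mesh_stand_on_image", "--interactive_vis"]
    # Headless fallback recommended in the caution above.
    return ["--save_visualization_on_img"]
```

The returned list can be appended to the `python -m romp.predict.image` command shown earlier.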

Here, we show an example of calling ROMP as a python lib to process images.


```python
# Set the absolute path to ROMP
path_to_romp = '/path/to/ROMP'
import os, sys
sys.path.append(path_to_romp)

# Set the detailed configurations
from romp.lib.config import ConfigContext, parse_args, args
# Note: a boolean option takes two list elements, ['--option', True/False]
ConfigContext.parsed_args = parse_args([
    '--configs_yml=configs/image.yml',
    '--inputs=/path/to/images_folder',
    '--output_dir=/path/to/save/image_results',
    '--save_centermap', False,
])

# Import the ROMP image processor and run it
from romp.predict.image import Image_processor
processor = Image_processor(args_set=args())
# args().inputs can be replaced with any other /path/to/images_folder
results_dict = processor.run(args().inputs)
```
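The rule noted in the example above, that a boolean option occupies two list elements, is easy to get wrong by hand. A minimal sketch of a helper that builds the list follows; `build_romp_args` is a hypothetical name, not a ROMP function, and it only assembles the list that would be handed to `parse_args`.

```python
def build_romp_args(configs_yml, **options):
    """Build the argument list for parse_args.

    Boolean options become two elements, ['--name', True/False];
    every other option becomes a single '--name=value' string.
    Hypothetical helper, not part of ROMP.
    """
    arg_list = [f"--configs_yml={configs_yml}"]
    for name, value in options.items():
        if isinstance(value, bool):
            arg_list.extend([f"--{name}", value])
        else:
            arg_list.append(f"--{name}={value}")
    return arg_list

# Reproduces the list used in the image example above.
arg_list = build_romp_args(
    "configs/image.yml",
    inputs="/path/to/images_folder",
    output_dir="/path/to/save/image_results",
    save_centermap=False,
)
```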

Processing videos

cd ROMP
python -m romp.predict.video --inputs=demo/videos/sample_video.mp4 --output_dir=demo/sample_video_results --save_visualization_on_img --save_dict_results

# or you can set all configurations in configs/video.yml, then run 
sh scripts/video.sh

We notice that some users only want to extract the motion of the foremost person.

To achieve this, please run

python -m romp.predict.video --inputs=demo/videos/demo_video_frames --output_dir=demo/demo_video_fp_results --show_largest_person_only --save_dict_results --show_mesh_stand_on_image

All of these options can be combined or used individually.
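For a folder of videos, the command above can be generated once per file. The sketch below only uses flags that appear in the commands in this section; the helper names, the list of file suffixes, and the `<name>_results` output layout are assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".avi", ".mov"}  # assumption: adjust to your data

def video_commands(video_dir, output_root, extra_flags=()):
    """Yield one `python -m romp.predict.video` command per video file."""
    output_root = Path(output_root).resolve()
    for video in sorted(Path(video_dir).resolve().iterdir()):
        if video.suffix.lower() not in VIDEO_SUFFIXES:
            continue
        yield [
            sys.executable, "-m", "romp.predict.video",
            f"--inputs={video}",
            f"--output_dir={output_root / (video.stem + '_results')}",
            "--save_dict_results",
            *extra_flags,
        ]

def run_all(video_dir, output_root, romp_root, extra_flags=()):
    """Run the commands one by one from inside the ROMP checkout."""
    for cmd in video_commands(video_dir, output_root, extra_flags):
        subprocess.run(cmd, cwd=romp_root, check=True)
```

For example, `run_all("my_videos", "my_results", "/path/to/ROMP", ["--show_largest_person_only"])` would process each video and keep only the largest person.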

Here, we show an example of calling ROMP as a python lib to process videos.


```python
# Set the absolute path to ROMP
path_to_romp = '/path/to/ROMP'
import os, sys
sys.path.append(path_to_romp)

# Set the detailed configurations
from romp.lib.config import ConfigContext, parse_args, args
# Note: a boolean option takes two list elements, ['--option', True/False]
ConfigContext.parsed_args = parse_args([
    '--configs_yml=configs/video.yml',
    '--inputs=/path/to/video',
    '--output_dir=/path/to/save/video_results',
    '--save_visualization_on_img', False,
])

# Import the ROMP video processor and run it
from romp.predict.video import Video_processor
processor = Video_processor(args_set=args())
# args().inputs can be replaced with any other /path/to/video
results_dict = processor.run(args().inputs)
```

Webcam

To do this you just need to run:

cd ROMP
sh scripts/webcam.sh

To drive a character in Blender, please refer to expert.md.

Export

Export to Blender FBX

Please refer to expert.md to export the results to FBX files for use in Blender. Currently, this function only supports single-person videos. Therefore, please test it with demo/videos/sample_video2_results/sample_video2.mp4, whose results will be saved to demo/videos/sample_video2_results.

Blender Addons

Chuanhang Yan : developing an addon for driving character in Blender.
VLT Media creates a QuickMocap-BlenderAddon to read the .npz file created by ROMP. Clean & smooth the resulting keyframes.
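Before feeding a ROMP .npz file to an addon, it can help to look at what it contains. The sketch below lists every entry with its dtype and shape using NumPy. It makes no assumption about which keys ROMP writes; the function name `summarize_npz` is hypothetical.

```python
import numpy as np

def summarize_npz(path):
    """Return {key: (dtype name, shape)} for every entry in an .npz archive."""
    summary = {}
    # allow_pickle=True is required when entries hold Python objects
    # (for example dicts); only enable it for files you trust.
    with np.load(path, allow_pickle=True) as data:
        for key in data.files:
            value = data[key]
            summary[key] = (value.dtype.name, value.shape)
    return summary
```

An entry reported with dtype `object` and shape `()` usually wraps a Python object, which can be unwrapped with `.item()`.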

Train

Please prepare the training datasets following dataset.md, and then refer to train.md for training.

Evaluation

Please refer to evaluation.md for evaluation on benchmarks.

Bugs report

Please refer to bug.md for solutions. You are welcome to submit issues for related bugs. I will solve them as soon as possible.

Citation

@InProceedings{ROMP,
author = {Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Black, Michael J. and Mei, Tao},
title = {Monocular, One-stage, Regression of Multiple 3D People},
booktitle = {ICCV},
month = {October},
year = {2021}
}

Contributor

This repository is currently maintained by Yu Sun.

ROMP has also benefited from many developers, including

Marco Musy : help in the textured SMPL visualization.
Gavin Gray : adding support for an elegant context manager to run code in a notebook.
VLT Media : adding support for running on Windows & batch_videos.py.
Chuanhang Yan : developing an addon for driving character in Blender.

Acknowledgement

We thank Peng Cheng for his constructive comments on Center map training.

Here are some great resources we benefit from:

The SMPL models and layer are borrowed from the MPII SMPL-X model.
Some functions are borrowed from HMR-pytorch and SPIN.
The evaluation code and GT annotations of the 3DPW dataset are taken from 3dpw-eval and VIBE.
3D mesh visualization is supported by vedo, EasyMocap, minimal-hand, Open3D, and Pyrender.

Please consider citing their papers.

Comments

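Discussion of data processing details

Hello, I saw that your paper mentions using the MoVi dataset. The original MoVi dataset only provides 3D data plus the SMPL-H mesh parameters that AMASS obtained by fitting. SMPL-H has 16 values for shape and 52 for pose. I would like to know whether you used the pose and beta provided by AMASS when working with MoVi. So far I have loaded the first 10 beta values and the first 22 pose values into our current SMPL model, and then, using the camera intrinsics and extrinsics provided by MoVi and following the MoVi tutorial, I still cannot get a correct visualization in this program. Any advice would be appreciated.

If you did not use them, is that because the beta and pose provided by AMASS do not fit our SMPL model? The AMASS paper seems to suggest that they should. MoVi-Toolbox: https://github.com/saeed1262/MoVi-Toolbox/blob/master/MoCap/utils.py

Following the MoVi tutorial, I pass its mesh coefficients as beta[0:10] and pose[0:22], plus two hand poses set to 0, into our current SMPL model: smpl_model_path = './centerHMR/models/smpl/SMPL_NEUTRAL.pkl' self.smplx = smpl_model.create(smpl_model_path, batch_size=self.batch_size,model_type=self.model_type, gender='neutral', use_face_contour=False, ext='npz', joint_mapper=joint_mapper,flat_hand_mean=True, use_pca=False).cuda()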
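The slicing described in this question can be written out as a short NumPy sketch. It assumes the SMPL-H pose is a flat axis-angle vector of 52 joints x 3 values and that the target SMPL pose has 24 joints x 3 values; the function name is hypothetical. It only shows the array manipulation and does not settle whether the sliced parameters are compatible with the SMPL neutral model, which is the open question here.

```python
import numpy as np

def smplh_to_smpl_params(betas_smplh, pose_smplh):
    """Slice SMPL-H parameters into SMPL-sized arrays.

    betas_smplh: (16,) shape coefficients
    pose_smplh:  (156,) axis-angle pose, 52 joints x 3
    Returns betas (10,) and pose (72,), i.e. 24 joints x 3.
    """
    betas_smplh = np.asarray(betas_smplh, dtype=np.float32)
    pose_smplh = np.asarray(pose_smplh, dtype=np.float32).reshape(52, 3)
    betas = betas_smplh[:10]                    # beta[0:10]
    body = pose_smplh[:22]                      # pose[0:22]: root + 21 body joints
    hands = np.zeros((2, 3), dtype=np.float32)  # two hand joints set to zero rotation
    pose = np.concatenate([body, hands], axis=0).reshape(-1)
    return betas, pose
```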

@Arthur151
question

opened by zhLawliet 34
Reproduce Result

Hi. Thanks for your work. Could you show me your training log? I can't reproduce the paper's results. These are my log file and yaml file. I only changed the batch size to 48 because of my memory limits. Everything else is default. hrnet_cm64_V1_hrnet.log hrnet_cm64_V1_hrnet_yml.log

opened by panshaohua 28
A simple question about camera and coordinate system.
Hi, I have a simple question about ROMP. I have been struggling to put people into their correct relative positions, but is that really possible using the root-aligned SMPL meshes without predicting their transl? (And if we have the camera parameters K, will it be possible?)

1. What is the coordinate system of the vertices that are used for rendering? I think we are predicting points in the camera coordinate system, but root-aligned, correct?

2. Following Q1, before rendering verts onto the image, there is a trans added to verts ('cam_trans' in projection.py). What is it, and what is estimate_translation actually doing? Is it estimating the root's position? https://github.com/Arthur151/ROMP/blob/e30b7d17f13089fa9fa114df494192e31b0f43ed/romp/lib/visualization/visualization.py#L61

3. I tried to replace the verts + trans in Q2 with the GT mesh, so verts=GT_verts, without any other changes to your code, but the results are not correct. I expected it to fully match the person in the image, but there are always shifts, and I also can't use the same FOV, otherwise the mesh on the image would be very small.

Sorry if I have misunderstood anything. I think rendering is the last part of your code that I don't understand. Looking forward to your answer!

Zhengdi
opened by ZhengdiYu 24

Can't run demo with provided instructions

Hi there,

Really awesome work. Unfortunately, it can't run with the provided instructions. Trying to sh run.sh results in multiple errors.

First:

----------------
Traceback (most recent call last):
  File "core/test.py", line 2, in <module>
    from base import *
  File "/home/jb/Documents/python/CenterHMR/src/core/base.py", line 20, in <module>
    from dataset.mixed_dataset import SingleDataset
  File "/home/jb/Documents/python/CenterHMR/src/dataset/mixed_dataset.py", line 5, in <module>
    from dataset.internet import Internet
  File "/home/jb/Documents/python/CenterHMR/src/dataset/internet.py", line 9, in <module>
    import smplx
ModuleNotFoundError: No module named 'smplx'

I installed smplx from here: https://github.com/vchoutas/smplx

----------------
In Ubuntu, using osmesa mode for rendering
Traceback (most recent call last):
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/osmesa.py", line 25, in GL
    mode=ctypes.RTLD_GLOBAL 
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/ctypesloader.py", line 45, in loadLibrary
    return dllType( name, mode )
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: ('OSMesa: cannot open shared object file: No such file or directory', 'OSMesa', None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "core/test.py", line 2, in <module>
    from base import *
  File "/home/jb/Documents/python/CenterHMR/src/core/base.py", line 21, in <module>
    from visualization.visualization import Visualizer
  File "/home/jb/Documents/python/CenterHMR/src/visualization/visualization.py", line 15, in <module>
    from .renderer import get_renderer
  File "/home/jb/Documents/python/CenterHMR/src/visualization/renderer.py", line 19, in <module>
    import pyrender
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/pyrender/__init__.py", line 3, in <module>
    from .light import Light, PointLight, DirectionalLight, SpotLight
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/pyrender/light.py", line 10, in <module>
    from OpenGL.GL import *
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/GL/__init__.py", line 3, in <module>
    from OpenGL import error as _error
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/error.py", line 12, in <module>
    from OpenGL import platform, _configflags
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/__init__.py", line 35, in <module>
    _load()
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/__init__.py", line 32, in _load
    plugin.install(globals())
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 92, in install
    namespace[ name ] = getattr(self,name,None)
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/osmesa.py", line 66, in GetCurrentContext
    function = self.OSMesa.OSMesaGetCurrentContext
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/osmesa.py", line 60, in OSMesa
    def OSMesa( self ): return self.GL
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/platform/osmesa.py", line 28, in GL
    raise ImportError("Unable to load OpenGL library", *err.args)
ImportError: ('Unable to load OpenGL library', 'OSMesa: cannot open shared object file: No such file or directory', 'OSMesa', None)

I had to install Mesa using the instructions from PyRender: https://pyrender.readthedocs.io/en/latest/install/

Traceback (most recent call last):
  File "core/test.py", line 50, in <module>
    main()
  File "core/test.py", line 46, in main
    demo.run(demo_image_folder)
  File "core/test.py", line 20, in run
    self.visualizer = Visualizer(model_type=self.model_type,resolution =vis_size, input_size=self.input_size, result_img_dir = test_save_dir,with_renderer=True)
  File "/home/jb/Documents/python/CenterHMR/src/visualization/visualization.py", line 23, in __init__
    self.renderer = get_renderer(model_type=model_type,resolution=self.resolution)
  File "/home/jb/Documents/python/CenterHMR/src/visualization/renderer.py", line 138, in get_renderer
    renderer = Renderer(faces,resolution=resolution[:2])
  File "/home/jb/Documents/python/CenterHMR/src/visualization/renderer.py", line 69, in __init__
    point_size=1.0)
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/pyrender/offscreen.py", line 31, in __init__
    self._create()
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/pyrender/offscreen.py", line 149, in _create
    self._platform.init_context()
  File "/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/pyrender/platforms/osmesa.py", line 19, in init_context
    from OpenGL.osmesa import (
ImportError: cannot import name 'OSMesaCreateContextAttribs' from 'OpenGL.osmesa' (/home/jb/anaconda3/envs/centerhmr/lib/python3.7/site-packages/OpenGL/osmesa/__init__.py)

This one stumped me, couldn't get it to work. Maybe my Mesa installation is still not correct?
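For headless rendering, PyOpenGL picks its backend from the `PYOPENGL_PLATFORM` environment variable (`egl` or `osmesa`), and the variable has to be set before `pyrender` or `OpenGL` is imported. The sketch below sets it only when nothing is configured and no display is present. The helper name and the `DISPLAY` check are assumptions for illustration, and it does not fix a broken Mesa installation such as the one in this report.

```python
import os

def configure_offscreen_backend(preferred="egl", env=None):
    """Set PYOPENGL_PLATFORM for headless rendering if nothing is set yet.

    Call this before `import pyrender` (or any `OpenGL` import),
    because PyOpenGL reads the variable at import time.
    """
    env = os.environ if env is None else env
    if preferred not in ("egl", "osmesa"):
        raise ValueError("preferred must be 'egl' or 'osmesa'")
    # Respect an explicit user choice and leave desktop sessions alone.
    if "PYOPENGL_PLATFORM" not in env and not env.get("DISPLAY"):
        env["PYOPENGL_PLATFORM"] = preferred
    return env.get("PYOPENGL_PLATFORM")
```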

bug

opened by jbohnslav 20

smpl_mesh_root_align

Hi, I notice that your ROMP_HRNet_32.pkl was trained with smpl_mesh_root_align=False. But in v1.yml, smpl_mesh_root_align is not set, so it takes its default value, True.

So my questions are:

1. (Solved✔) At first I found that my model had the same issue as resnet (mesh shift), and then I found the reason: image.yml was initially designed for ROMP_HRNet_32.pkl, which was trained with smpl_mesh_root_align=False. If we want to test on images using our own model, trained from the pre-trained model using hrnet and v1.yml, then smpl_mesh_root_align in image.yml should also be set to True, just like resnet #106. So this was solved.
2. When should smpl_mesh_root_align be True or False? Why did you set it to True for v1.yml and resnet, although it is False for ROMP_HRNet_32.pkl? I think that for the 3D joint loss it doesn't matter, as long as we do another alignment before calculating MPJPE/PAMPJPE. And for the 2D part, the weak camera parameters will automatically be learnt to project those 3D joints so that they align with GT_2d, as long as the setting is consistent all the time.
3. During fine-tuning from your model ROMP_HRNet_32.pkl using v1_hrnet_3dpw_ft.yml, smpl_mesh_root_align also takes its default value, True. However, ROMP_HRNet_32.pkl was trained with smpl_mesh_root_align=False.

As we know from question 1, if we use different settings of smpl_mesh_root_align, the visualization will be shifted. I think this could be a problem for training and fine-tuning.

I also tried to train with smpl_mesh_root_align from scratch, but it ended with the error below:

Traceback (most recent call last):
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home2/rctv12/projects/ROMP/multi-person/romp/train.py", line 148, in <module>
    main()
  File "/home2/rctv12/projects/ROMP/multi-person/romp/train.py", line 145, in main
    trainer.train()
  File "/home2/rctv12/projects/ROMP/multi-person/romp/train.py", line 33, in train
    self.train_epoch(epoch)
  File "/home2/rctv12/projects/ROMP/multi-person/romp/train.py", line 94, in train_epoch
    self.train_log_visualization(outputs, loss, run_time, data_time, losses, losses_dict, epoch, iter_index)
  File "/home2/rctv12/projects/ROMP/multi-person/romp/train.py", line 74, in train_log_visualization
    vis_cfg={'settings': ['save_img'], 'vids': vis_ids, 'save_dir':self.train_img_dir, 'save_name':save_name, 'verrors': [vis_errors], 'error_names':['E']})
  File "/home2/rctv12/projects/ROMP/multi-person/romp/lib/models/../utils/../visualization/visualization.py", line 102, in visulize_result
    rendered_imgs = self.visualize_renderer_verts_list(per_img_verts_list, images=org_imgs.copy(), trans=mesh_trans)
  File "/home2/rctv12/projects/ROMP/multi-person/romp/lib/models/../utils/../visualization/visualization.py", line 62, in visualize_renderer_verts_list
    rendered_img = self.renderer(verts, faces, colors=color, focal_length=args().focal_length, cam_params=cam_params)
  File "/home2/rctv12/projects/ROMP/multi-person/romp/lib/models/../utils/../visualization/renderer_pt3d.py", line 102, in __call__
    images = self.renderer(meshes)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/pytorch3d/renderer/mesh/renderer.py", line 59, in forward
    fragments = self.rasterizer(meshes_world, **kwargs)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/pytorch3d/renderer/mesh/rasterizer.py", line 168, in forward
    meshes_proj = self.transform(meshes_world, **kwargs)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/pytorch3d/renderer/mesh/rasterizer.py", line 147, in transform
    verts_world, eps=eps
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/pytorch3d/transforms/transform3d.py", line 336, in transform_points
    points_out = _broadcast_bmm(points_batch, composed_matrix)
  File "/home2/rctv12/miniconda3/envs/ROMP/lib/python3.7/site-packages/pytorch3d/transforms/transform3d.py", line 753, in _broadcast_bmm
    return a.bmm(b)
RuntimeError: expected scalar type Half but found Float

I'm still debugging anyway.

opened by ZhengdiYu 19

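Hello, a question about root_align=True and projection

Hello, I am new to this field and have a question I would like to ask. After setting root_align=True, the result is a root-relative mesh, right? Is the 3D pose obtained through the SMPL regression matrix then also a root-relative 3D pose? Weak-perspective projection should project the absolute 3D human pose into the image so that it aligns with the 2D pose, so shouldn't the root position be added? And how is the absolute root position in camera space obtained?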
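As background for this question, a generic weak-perspective projection can be sketched as follows. This is the common HMR-style form, with a scale `s` and an in-plane translation `(tx, ty)`, written under the assumption that output coordinates are normalized; it is not confirmed to be ROMP's exact formulation, and some codebases apply the translation after scaling instead. It shows that root-relative joints can be aligned with 2D keypoints without a depth value, because the in-plane offset is carried by `(tx, ty)` and the depth is folded into `s`.

```python
import numpy as np

def weak_perspective_project(kp3d, cam):
    """Project 3D joints to normalized 2D image coordinates.

    kp3d: (J, 3) root-relative joints
    cam:  (s, tx, ty) scale and in-plane translation
    The z coordinate is dropped; s stands in for focal_length / depth.
    """
    kp3d = np.asarray(kp3d, dtype=np.float64)
    s, tx, ty = cam
    return s * (kp3d[:, :2] + np.array([tx, ty]))

# The root joint (at the origin) lands at s * (tx, ty).
kp2d = weak_perspective_project([[0.0, 0.0, 0.0], [0.1, -0.2, 0.3]], (2.0, 0.5, -0.25))
```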

opened by Rookienovice 13
Problems with the lsp and mpiinf datasets

Hello, two of the seven datasets you provide fail during training, both with FileNotFoundError.

The error for lsp is as follows:

Traceback (most recent call last):
  File "/home/omnisky/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/omnisky/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data01/wyjh/ROMP/romp/train.py", line 149, in <module>
    main()
  File "/data01/wyjh/ROMP/romp/train.py", line 145, in main
    trainer = Trainer()
  File "/data01/wyjh/ROMP/romp/train.py", line 14, in __init__
    self.loader = self._create_data_loader(train_flag=True)
  File "/data01/wyjh/ROMP/romp/base.py", line 133, in _create_data_loader
    datasets = MixedDataset(train_flag=train_flag)
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/mixed_dataset.py", line 36, in __init__
    self.datasets = [dataset_dictds for ds in datasets_used]
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/mixed_dataset.py", line 36, in <listcomp>
    self.datasets = [dataset_dictds for ds in datasets_used]
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/lsp.py", line 11, in __init__
    self.load_data()
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/lsp.py", line 38, in load_data
    self.load_eft_annots(os.path.join(config.project_dir, 'data/eft_fit/LSPet_ver01.json'))
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/lsp.py", line 44, in load_eft_annots
    annots = json.load(open(annot_file_path,'r'))['data']
FileNotFoundError: [Errno 2] No such file or directory: '/data01/wyjh/ROMP/data/eft_fit/LSPet_ver01.json'

(I don't know why it looks for this file under data; I don't have a data directory.)

The error for mpiinf is as follows:

Traceback (most recent call last):
  File "/home/omnisky/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/omnisky/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data01/wyjh/ROMP/romp/train.py", line 149, in <module>
    main()
  File "/data01/wyjh/ROMP/romp/train.py", line 145, in main
    trainer = Trainer()
  File "/data01/wyjh/ROMP/romp/train.py", line 14, in __init__
    self.loader = self._create_data_loader(train_flag=True)
  File "/data01/wyjh/ROMP/romp/base.py", line 133, in _create_data_loader
    datasets = MixedDataset(train_flag=train_flag)
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/mixed_dataset.py", line 36, in __init__
    self.datasets = [dataset_dictds for ds in datasets_used]
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/mixed_dataset.py", line 36, in <listcomp>
    self.datasets = [dataset_dictds for ds in datasets_used]
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/mpi_inf_3dhp.py", line 17, in __init__
    self.pack_data(annots_file_path)
  File "/data01/wyjh/ROMP/romp/lib/models/../utils/../dataset/mpi_inf_3dhp.py", line 111, in pack_data
    annot2 = sio.loadmat(annot_file_path)['annot2']
  File "/home/omnisky/wyj/lib/python3.7/site-packages/scipy/io/matlab/mio.py", line 224, in loadmat
    with _open_file_context(file_name, appendmat) as f:
  File "/home/omnisky/anaconda3/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/home/omnisky/wyj/lib/python3.7/site-packages/scipy/io/matlab/mio.py", line 17, in _open_file_context
    f, opened = _open_file(file_like, appendmat, mode)
  File "/home/omnisky/wyj/lib/python3.7/site-packages/scipy/io/matlab/mio.py", line 45, in _open_file
    return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: '/data01/wyjh/ROMP/romp/lib/dataset/mpi_inf_3dhp/S1/Seq1/annot.mat'

(Likewise, there is no S1 folder in mpi_inf_3dhp.)

I looked at the datasets provided on the official websites and could not find the missing content. I hope you can point me in the right direction. Thank you!
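Errors like these surface only after the data loader has started. A small pre-flight check can list the missing files up front. The sketch below is generic; the helper name is hypothetical, and the example paths are taken from the two tracebacks in this report rather than from an authoritative list of what ROMP requires.

```python
from pathlib import Path

def missing_files(root, relative_paths):
    """Return the entries of relative_paths that do not exist under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).is_file()]

# Paths as they appear in the tracebacks of this report.
EXPECTED_IN_PROJECT = ["data/eft_fit/LSPet_ver01.json"]
EXPECTED_IN_MPI_INF = ["S1/Seq1/annot.mat"]
```

Calling `missing_files("/data01/wyjh/ROMP", EXPECTED_IN_PROJECT)` on the reporter's machine would return the annotation file named in the first error.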

opened by Wyethjjj 13
OpenGL.error.GLError

Thanks for your work. I am running your demo with CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/single_image.yml, but an error occurs. I am running on Ubuntu 16.04. The Python version is 3.6.9. How should I check what's happening?

(romp) jack@jack-System-Product-Name:~/Documents/ROMP/src$ CUDA_VISIBLE_DEVICES=0 python core/test.py --configs_yml=configs/single_image.yml
pygame 2.0.1 (SDL 2.0.14, Python 3.6.13)
Hello from the pygame community. https://www.pygame.org/contribute.html
INFO:root:{'tab': 'hrnet_cm64_single_image_test', 'configs_yml': 'configs/single_image.yml', 'demo_image_folder': '/path/to/image_folder', 'local_rank': 0, 'model_version': 1, 'multi_person': True, 'collision_aware_centermap': False, 'collision_factor': 0.2, 'kp3d_format': 'smpl24', 'eval': False, 'max_person': 64, 'input_size': 512, 'Rot_type': '6D', 'rot_dim': 6, 'centermap_conf_thresh': 0.25, 'centermap_size': 64, 'deconv_num': 0, 'model_precision': 'fp32', 'backbone': 'hrnet', 'gmodel_path': '../trained_models/ROMP_hrnet32.pkl', 'print_freq': 50, 'fine_tune': True, 'gpu': '0', 'batch_size': 64, 'val_batch_size': 1, 'nw': 4, 'calc_PVE_error': False, 'dataset_rootdir': '/home/jack/Documents/dataset/', 'high_resolution': True, 'save_best_folder': '/home/jack/Documents/checkpoints/', 'log_path': '/home/jack/Documents/log/', 'total_param_count': 85, 'smpl_mean_param_path': '/home/jack/Documents/ROMP/models/satistic_data/neutral_smpl_mean_params.h5', 'smpl_model': '/home/jack/Documents/ROMP/models/statistic_data/neutral_smpl_with_cocoplus_reg.txt', 'smplx_model': True, 'cam_dim': 3, 'beta_dim': 10, 'smpl_joint_num': 22, 'smpl_model_path': '/home/jack/Documents/ROMP/models', 'smpl_J_reg_h37m_path': '/home/jack/Documents/ROMP/models/smpl/J_regressor_h36m.npy', 'smpl_J_reg_extra_path': '/home/jack/Documents/ROMP/models/smpl/J_regressor_extra.npy', 'kernel_sizes': [5], 'GPUS': 0, 'use_coordmaps': True, 'webcam': False, 'video_or_frame': False, 'save_visualization_on_img': True, 'output_dir': '/path/to/outputdir', 'save_mesh': True, 'save_centermap': True, 'save_dict_results': True, 'multiprocess': False}
INFO:root:------------------------------------------------------------------
INFO:root:start building model.
Using ROMP v1
INFO:root:using fine_tune model: ../trained_models/ROMP_hrnet32.pkl
INFO:root:finished build model.
Traceback (most recent call last):
  File "core/test.py", line 225, in <module>
    main()
  File "core/test.py", line 205, in main
    demo = Demo()
  File "core/test.py", line 7, in __init__
    self.prepare_modules()
  File "core/test.py", line 14, in prepare_modules
    self.visualizer = Visualizer(resolution=self.vis_size, input_size=self.input_size,with_renderer=True)
  File "/home/jack/Documents/ROMP/src/core/../lib/models/../utils/../maps_utils/../dataset/../dataset/../dataset/../visualization/visualization.py", line 23, in __init__
    self.renderer = get_renderer(resolution=resolution)
  File "/home/jack/Documents/ROMP/src/core/../lib/models/../utils/../maps_utils/../dataset/../dataset/../dataset/../visualization/../visualization/renderer.py", line 142, in get_renderer
    renderer = Renderer(faces,resolution=resolution[:2])
  File "/home/jack/Documents/ROMP/src/core/../lib/models/../utils/../maps_utils/../dataset/../dataset/../dataset/../visualization/../visualization/renderer.py", line 72, in __init__
    point_size=1.0)
  File "/home/jack/anaconda3/envs/romp/lib/python3.6/site-packages/pyrender/offscreen.py", line 31, in __init__
    self._create()
  File "/home/jack/anaconda3/envs/romp/lib/python3.6/site-packages/pyrender/offscreen.py", line 149, in _create
    self._platform.init_context()
  File "/home/jack/anaconda3/envs/romp/lib/python3.6/site-packages/pyrender/platforms/egl.py", line 188, in init_context
    EGL_NO_CONTEXT, context_attributes
  File "/home/jack/anaconda3/envs/romp/lib/python3.6/site-packages/OpenGL/platform/baseplatform.py", line 402, in __call__
    return self( *args, **named )
  File "/home/jack/anaconda3/envs/romp/lib/python3.6/site-packages/OpenGL/error.py", line 232, in glCheckError
    baseOperation = baseOperation,
OpenGL.error.GLError: GLError( err = 12297, baseOperation = eglCreateContext, cArguments = ( <OpenGL._opaque.EGLDisplay_pointer object at 0x7ff367d4e268>, <OpenGL._opaque.EGLConfig_pointer object at 0x7ff367d4e1e0>, <OpenGL._opaque.EGLContext_pointer object at 0x7ff367e84d08>, <OpenGL.arrays.lists.c_int_Array_7 object at 0x7ff367e64d08>, ), result = <OpenGL._opaque.EGLContext_pointer object at 0x7ff367d311e0> )

opened by NoLookDefense 13
About bpy environment

Hey bro, I am back with another problem. I want to run the new feature that exports the mesh to an .fbx file, and it seems that the "bpy" package needs to be installed. But when I run pip install bpy, a CMake error always occurs. Which version of CMake were you using when you installed bpy?

opened by NoLookDefense 12
How to render results with weak perspective camera
Thanks a lot for this great and easy-to-use repo!

I'm trying to render the results using the weak perspective camera model. My question relates to these issues:

https://github.com/Arthur151/ROMP/issues/134

https://github.com/Arthur151/ROMP/issues/241

https://github.com/Arthur151/ROMP/issues/300

However none of these issues gave me the answer I was looking for. I am using the weak perspective camera parameters stored in cam and as suggested in this issue I multiply them with 2. I also pad the image to be square as mentioned here. I then convert the weak perspective camera model to a projection matrix the same way I used to do it for VIBE, which worked well there. However, for the ROMP output I'm still getting a slight misalignment, as you can see in the following screenshot. The light model is what I am rendering and the blue model in the background is the visualization output from ROMP. I think it's because I should somehow account for cam_trans but I don't know how exactly. Can you help me with this?
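One conversion commonly used in SPIN/VIBE-style code turns the weak-perspective parameters `(s, tx, ty)` into a camera-space translation for a pinhole camera. The sketch below shows that conversion together with a check that both projections agree for a point at zero relative depth. It assumes normalized [-1, 1] coordinates over a square image and a principal point at the image center. It is not confirmed to match how ROMP computes `cam_trans`, and it does not include the factor of 2 or the padding mentioned in this question.

```python
import numpy as np

def weak_to_perspective_trans(cam, focal_length, img_size):
    """Convert weak-perspective (s, tx, ty) to a camera-space translation.

    Derivation: matching s * (X + tx) with the pinhole projection
    (focal_length * (X + tx) / tz) / (img_size / 2)
    gives tz = 2 * focal_length / (img_size * s).
    """
    s, tx, ty = cam
    tz = 2.0 * focal_length / (img_size * s + 1e-9)
    return np.array([tx, ty, tz])

def perspective_project(points, trans, focal_length, img_size):
    """Pinhole projection to normalized [-1, 1] coordinates."""
    p = np.asarray(points, dtype=np.float64) + trans
    return (focal_length * p[:, :2] / p[:, 2:3]) / (img_size / 2.0)

cam = (0.8, 0.1, -0.05)
trans = weak_to_perspective_trans(cam, focal_length=5000.0, img_size=512)
proj = perspective_project([[0.2, 0.3, 0.0]], trans, focal_length=5000.0, img_size=512)
```

With a large focal length the two models stay close for points off the zero-depth plane as well, which is why the approximation is often acceptable for rendering.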
opened by kaufManu 11
How to convert model to pth?
Can I convert the .pkl file to .pth, for further conversion to ptl?

I tried to convert it using torch.jit.script, but I get this error:

Compiled functions can't take variable number of arguments or use keyword-only arguments with defaults
opened by nikkorejz 11
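Hello, I only want to output and display the 3D positions. How should I process the output?
How do I bind the results to a real-world coordinate system? I read this reply, https://github.com/Arthur151/ROMP/issues/372#issuecomment-1345247684, but I still don't quite understand how to implement it.

When there are only two people in the frame, a scaling effect appears (the people's proportions become inconsistent, as if the camera had zoomed in). How can I eliminate this? No matter how many people there are, I just want to keep the original scale unchanged.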
opened by zhanghongyong123456 6
No such file or directory:ROMP/model_data/parameters/J_regressor_extra.npy

Hello! An error occurred while running the following code. python -m romp.predict.image --inputs=demo/images --output_dir=demo/image_results --show_mesh_stand_on_image --interactive_vis Can't find ROMP/model_data/parameters/J_regressor_extra.npy.

opened by Hrforeverqqqqqq 3

Issues training with CMU_Panoptic

Hello,

I am trying to train the model, starting from the pretrained ResNet, on the cmu_panoptic dataset. However, I get the following error:

Traceback (most recent call last):
  File "HumanObj_videos_ResNet/train.py", line 277, in <module>
    main()
  File "HumanObj_videos_ResNet/train.py", line 273, in main
    trainer.train()
  File "HumanObj_videos_ResNet/train.py", line 77, in train
    self.train_epoch(epoch)
  File "HumanObj_videos_ResNet/train.py", line 192, in train_epoch
    for iter_index, meta_data in enumerate(self.loader):
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/z/home/mkhoshle/env/romp2/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/z/home/mkhoshle/Human_object_transform/HumanObj_videos_ResNet/lib/dataset/mixed_dataset.py", line 79, in __getitem__
    annots = self.datasets[dataset_id][index_sample]
  File "/z/home/mkhoshle/Human_object_transform/HumanObj_videos_ResNet/lib/dataset/image_base.py", line 375, in __getitem__
    return self.get_item_single_frame(index)
  File "/z/home/mkhoshle/Human_object_transform/HumanObj_videos_ResNet/lib/dataset/image_base.py", line 123, in get_item_single_frame
    kp3d, valid_masks[:,1] = self.process_kp3ds(info['kp3ds'], used_person_inds, \
  File "/z/home/mkhoshle/Human_object_transform/HumanObj_videos_ResNet/lib/dataset/image_base.py", line 284, in process_kp3ds
    kp3d_processed[inds] = kp3d
ValueError: could not broadcast input array from shape (17,3) into shape (54,3)

Do you know what I need to do to avoid this error?

Also, does the cmu_panoptic dataset have 2D pose annotations for all the people appearing in every image?

I would appreciate it if you could help me with this. Thanks,

opened by mkhoshle 5
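
The ValueError in the traceback above is NumPy refusing to assign a (17,3) array into a (54,3) slot. A minimal sketch of one way to reconcile the two layouts is to scatter the 17 source joints into a 54-joint array through an explicit index map. place_joints, joint_map, and the fill value of -2 for unannotated joints are all hypothetical; the correct mapping depends on the joint definitions of the dataset and of the training code, which are not given here.

```python
import numpy as np


def place_joints(kp3d, joint_map, num_joints=54, fill_value=-2.0):
    """Scatter a (J_src, 3) joint array into a (num_joints, 3) array.

    joint_map[i] is the target index of source joint i, or -1 to drop it.
    Target joints with no source keep fill_value (a placeholder marker).
    """
    kp3d = np.asarray(kp3d, dtype=np.float32)
    joint_map = np.asarray(joint_map)
    if kp3d.shape != (len(joint_map), 3):
        raise ValueError(f"expected shape ({len(joint_map)}, 3), got {kp3d.shape}")
    out = np.full((num_joints, 3), fill_value, dtype=np.float32)
    keep = joint_map >= 0
    out[joint_map[keep]] = kp3d[keep]
    return out


kp3d_17 = np.arange(51, dtype=np.float32).reshape(17, 3)

# Direct assignment reproduces the broadcast failure from the traceback.
try:
    np.zeros((54, 3), dtype=np.float32)[...] = kp3d_17
    direct_assignment_failed = False
except ValueError:
    direct_assignment_failed = True

# Identity map used purely for illustration: source joint i -> target joint i.
kp3d_54 = place_joints(kp3d_17, list(range(17)))
```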

Releases(V2.1)

V2.1(Jun 21, 2022)

Checkpoints, data used for training BEV.
Source code(tar.gz)
Source code(zip)
BEV_HRNet32_V6.pkl(177.67 MB)
cmu_panoptic_predictions.npz(14.94 MB)
model_data.zip(118.09 MB)
V2.0(Mar 14, 2022)

Model checkpoints of ROMP & BEV in simple-romp.
Source code(tar.gz)
Source code(zip)
BEV.pth(137.75 MB)
BEV_ft_agora.pth(137.75 MB)
ROMP.onnx(110.90 MB)
ROMP.pkl(111.61 MB)
smil_packed_info.pth(19.29 MB)
smpl.onnx(19.02 MB)
SMPLA_NEUTRAL.pth(20.16 MB)
smpla_packed_info.pth(20.16 MB)
SMPL_FEMALE.pth(19.29 MB)
SMPL_MALE.pth(19.29 MB)
SMPL_NEUTRAL.pth(19.29 MB)
smpl_packed_info.npz(36.56 MB)
smpl_packed_info.pth(19.29 MB)
v1.1(Sep 10, 2021)
Official 1.1 version of ROMP!

Training code and more evaluation.

Multi-person webcam demo.

Temporal tracking and optimization for webcam demo.

Live one character animation in blender.

Source code(tar.gz)
Source code(zip)
demo_videos.zip(7.54 MB)
model_data.zip(118.09 MB)
pose_higher_hrnet_w32_512.pth(109.83 MB)
pytorch3d-0.6.1-cp37-cp37m-linux_x86_64.whl(37.56 MB)
pytorch3d-0.6.1-cp38-cp38-linux_x86_64.whl(37.56 MB)
pytorch3d-0.6.1-cp39-cp39-linux_x86_64.whl(14.13 MB)
ROMP.zip(769.81 KB)
trained_models.zip(688.40 MB)
trained_models_try.zip(102.69 MB)
v1.0(Mar 31, 2021)

Formal ROMP V1.0 that integrates all functions, including ResNet-50 model and benchmark evaluation.
Source code(tar.gz)
Source code(zip)
ROMP_data.zip(497.89 MB)
ROMP_v1.0.zip(542.00 MB)
v0.1(Sep 12, 2020)

Add real-time webcam support.
Source code(tar.gz)
Source code(zip)
CenterHMR_v0.1.zip(178.62 MB)
v0.0(Sep 4, 2020)

Demo code for internet images. Let's try.
Source code(tar.gz)
Source code(zip)
CenterHMR.zip(174.97 MB)
CenterHMR_data.zip(157.42 MB)

Owner

Yu Sun

I am a Ph.D. student at HIT, an intern at JDAI-CV, working on monocular 3D human mesh recovery.

GitHub

Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition"

Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition", accepted at ACL 2021. For details of the model and experiments, please see our paper.

87 Dec 16, 2022

Virtual Dance Reality Stage: a feature that offers you to share a stage with another user virtually

Portrait Segmentation using Tensorflow This script removes the background from an input image. You can read more about segmentation here Setup The scr

291 Dec 24, 2022

People log into different sites every day to get information and browse through these sites one by one

HyperLink People log into different sites every day to get information and browse through these sites one by one. And they are exposed to advertisemen

0 Feb 17, 2022

Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

97 Dec 23, 2022

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. (CVPR 2021)

GDR-Net This repo provides the PyTorch implementation of the work: Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji. GDR-Net: Geometry-Guided

169 Jan 7, 2023

Quantile Regression DQN a Minimal Working Example, Distributional Reinforcement Learning with Quantile Regression

Quantile Regression DQN Quantile Regression DQN a Minimal Working Example, Distributional Reinforcement Learning with Quantile Regression (https://arx

80 Sep 17, 2022

Hitters Linear Regression - Hitters Linear Regression With Python

Hitters_Linear_Regression The dataset we will use is from Carnegie Mellon University'

2 Jan 26, 2022

Ranking Models in Unlabeled New Environments （iccv21）

Ranking Models in Unlabeled New Environments Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch 1.7.0 + torchvision 0.8.1

14 Dec 17, 2021

[ICCV21] Self-Calibrating Neural Radiance Fields

Self-Calibrating Neural Radiance Fields, ICCV, 2021 Project Page | Paper | Video Author Information Yoonwoo Jeong [Google Scholar] Seokjun Ahn [Google

381 Dec 30, 2022

[ICCV21] Code for RetrievalFuse: Neural 3D Scene Reconstruction with a Database

RetrievalFuse Paper | Project Page | Video RetrievalFuse: Neural 3D Scene Reconstruction with a Database Yawar Siddiqui, Justus Thies, Fangchang Ma, Q

75 Dec 22, 2022

Code for one-stage adaptive set-based HOI detector AS-Net.

AS-Net Code for one-stage adaptive set-based HOI detector AS-Net. Mingfei Chen*, Yue Liao*, Si Liu, Zhiyuan Chen, Fei Wang, Chen Qian. "Reformulating

45 Dec 9, 2022

(CVPR2021) DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation CVPR2021(oral) [arxiv] Requirements python3.7 pytorch==

85 Dec 7, 2022

[CVPR2021] Look before you leap: learning landmark features for one-stage visual grounding.

LBYL-Net This repo implements paper Look Before You Leap: Learning Landmark Features For One-Stage Visual Grounding CVPR 2021. Getting Started Prerequ

45 Dec 12, 2022

TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two tasks.

264 Jan 9, 2023
