Camera calibration & 3D pose estimation tools for AcinoSet

Overview

AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild

Daniel Joska, Liam Clark, Naoya Muramatsu, Ricardo Jericevich, Fred Nicolls, Alexander Mathis, Mackenzie W. Mathis, Amir Patel

AcinoSet is a dataset of free-running cheetahs in the wild that contains 119,490 frames of multi-view, synchronized, high-speed video footage, camera calibration files, and 7,588 human-annotated frames. We use markerless animal pose estimation with DeepLabCut to provide 2D keypoints for all 119K frames. We then apply three methods that serve as strong baselines for 3D pose estimation tool development: traditional sparse bundle adjustment, an Extended Kalman Filter, and a trajectory optimization-based method we call Full Trajectory Estimation. The resulting 3D trajectories, human-checked 3D ground truth, and an interactive tool to inspect the data are also provided. We believe this dataset will be useful for a diverse range of fields such as ecology, robotics, biomechanics, and computer vision.

AcinoSet code by the African Robotics Unit (University of Cape Town).

Prerequisites

  • Anaconda
  • The dependencies defined in conda_envs/*.yml

What we provide:

The following sections document how these were created with the code in this repo:

Pre-trained DeepLabCut Model:

  • You can use the full_cheetah model provided in the DLC Model Zoo to re-create the existing H5 files or to analyze new videos (a minimal sketch follows this list).
  • We also provide the videos and the H5 outputs of all frames here.
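
As a rough orientation, the Model Zoo is reachable from DeepLabCut's Python API; below is a minimal sketch (not this repo's code) of analyzing a new video with the full_cheetah model. The project name and video path are hypothetical, and the exact create_pretrained_project signature should be checked against your DLC version:

import deeplabcut

# Hypothetical path; 'full_cheetah' is the Model Zoo model named above.
video = "/path/to/new_cheetah_video.mp4"

# Wraps the pre-trained Model Zoo network in a ready-to-use DLC project
# and analyzes the video, producing H5 keypoint files alongside it.
deeplabcut.create_pretrained_project(
    "cheetah_demo",    # hypothetical project name
    "experimenter",    # hypothetical experimenter name
    [video],
    model="full_cheetah",
)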

Labelling Cheetah Body Positions:

If you want to label more cheetah data, you can do so within the DeepLabCut framework. We provide a conda file for an easy install, but please see the DeepLabCut repo for installation details and instructions for use.

$ conda env create -f conda_envs/DLC.yml -n DLC
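
Labelling in DeepLabCut follows a create-project / extract-frames / label-frames flow; here is a minimal sketch (paths and names are hypothetical; see the DLC docs for the authoritative workflow):

import deeplabcut

# Create a project around the videos you want to label (hypothetical paths).
config_path = deeplabcut.create_new_project(
    "cheetah_labelling", "experimenter", ["/path/to/video.mp4"]
)

# Pick frames to annotate, then open the labelling GUI.
deeplabcut.extract_frames(config_path)
deeplabcut.label_frames(config_path)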

AcinoSet Setup:

Navigate to the AcinoSet folder and build the environment:

$ conda env create -f conda_envs/acinoset.yml

Launch Jupyter Lab:

$ jupyter lab

Camera Calibration and 3D Reconstruction:

Intrinsic and Extrinsic Calibration:

Open calib_with_gui.ipynb and follow the instructions.

Alternatively, if the checkerboard points detected in calib_with_gui.ipynb are unsatisfactory, open saveMatlabPointsForAcinoSet.m in MATLAB and follow the instructions. Note that this requires MATLAB 2020b or later.
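
For intuition, the intrinsic step amounts to detecting checkerboard corners and solving for the camera matrix and distortion coefficients. A minimal OpenCV sketch follows; this is illustrative only, not the notebook's code, and the 9x6 board size and file names are assumptions:

import cv2
import numpy as np

pattern = (9, 6)  # inner-corner count of the checkerboard (assumption)

# 3D board coordinates of the corners (z = 0 plane, in board units).
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in ["calib_000.png", "calib_001.png"]:  # hypothetical frames
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Solve for the intrinsic matrix K and the distortion coefficients.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None
)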

Optional: manually defining the shared points for extrinsic calibration:

You can manually define points on each video in a scene with Argus Clicker. A quick tutorial is found here.

Build the environment:

$ conda env create -f conda_envs/argus.yml

Launch Argus Clicker:

$ python
>>> import argus_gui as ag; ag.ClickerGUI()

Keyboard Shortcuts (See documentation here for more):

  • G ... go to a specific frame
  • X ... switch the sync mode, setting all windows to the same frame
  • O ... bring up the options dialog
  • S ... bring up a save dialog

Then you must convert the output data from Argus to work with the rest of the pipeline (here is an example):

$ python argus_converter.py \
    --data_dir ../data/2019_03_07/extrinsic_calib/argus_folder

3D Reconstruction:

To reconstruct a cheetah in 3D, we offer three different pose estimation options on top of standard triangulation (TRI; a minimal sketch follows this list):

  • Sparse Bundle Adjustment (SBA)
  • Extended Kalman Filter (EKF)
  • Full Trajectory Estimation (FTE)
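
For reference, standard triangulation solves a small homogeneous linear system per 3D point (the direct linear transform). Here is a minimal numpy sketch, not the repo's implementation, assuming one 3x4 projection matrix P = K[R|t] per camera:

import numpy as np

def triangulate_point(uv_per_cam, P_per_cam):
    # uv_per_cam: list of (u, v) detections of one keypoint, one per camera
    # P_per_cam:  list of 3x4 projection matrices for the same cameras
    A = []
    for (u, v), P in zip(uv_per_cam, P_per_cam):
        A.append(u * P[2] - P[0])  # each view contributes two linear rows
        A.append(v * P[2] - P[1])
    # Homogeneous least-squares solution: the right singular vector with
    # the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize to a 3D point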

You can run each option separately. For example, simply open FTE.ipynb and follow the instructions! Otherwise, you can run all of the refinements in one go:

$ python all_optimizations.py --data_dir 2019_03_09/lily/run --start_frame 70 --end_frame 170 --dlc_thresh 0.5

NB: When running the FTE, we recommend that you use the MA86 linear solver. For details on how to set this up, see these instructions.
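
Assuming the trajectory optimization is solved with IPOPT through Pyomo, the solver choice is a one-line option; a minimal sketch on a toy model (MA86 requires an HSL-enabled IPOPT build):

from pyomo.environ import ConcreteModel, Objective, SolverFactory, Var

# Toy model standing in for the FTE problem.
model = ConcreteModel()
model.x = Var(initialize=1.0)
model.obj = Objective(expr=(model.x - 2.0) ** 2)

opt = SolverFactory("ipopt")
opt.options["linear_solver"] = "ma86"  # select the HSL MA86 linear solver
opt.solve(model, tee=True)             # tee=True streams the solver log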

Citation

If you use our code or data, please cite us (and note the paper is accepted to ICRA 2021, so please check back for an updated reference):

@misc{joska2021acinoset,
      title={AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild}, 
      author={Daniel Joska and Liam Clark and Naoya Muramatsu and Ricardo Jericevich and Fred Nicolls and Alexander Mathis and Mackenzie W. Mathis and Amir Patel},
      year={2021},
      eprint={2103.13282},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Comments
  • Hand-checked 3D GT annotations

    Hi everyone.

    The following paragraph is extracted from your paper: "We used this tool to estimate the amount of human corrections needed to be “perfect” (most needed no corrections, and minor corrections were under 10 pixels, checked on n=600 frames) and the largest adjustments were typically in the extremities, such as the tail and ankles. The proportion of large outliers where the adjustment was over 100 mm in magnitude was only 0.71%".

    Are these 600 frames and their corrected annotations available anywhere? Alternatively, are any hand-checked 3D GT frames/annotations available anywhere? Thanks in advance!

    opened by julianzille 4
  • 2D annotations in dlc_pw and fte_pw folders (Dropbox)

    Hi everyone.

    Thank you for your work, and for making your data and results freely available.

    I'm trying to understand the difference between the 2D annotations stored in the dlc_pw and fte_pw folders in your Dropbox repository. Are the 2D annotations in the .csv files in fte_pw obtained by re-projecting the 3D poses to 2D, and are the annotations in the .pickle files in dlc_pw generated by the DeepLabCut model and used as GT (to compare against the re-projected 2D annotations)? (A generic re-projection sketch is given after these comments.)

    Thanks in advance :)

    opened by julianzille 3
  • Deciding the range of frames used for the 3D reconstruction

    When args.end_frame is set to -1, start_frame and end_frame are defined automatically by the following process:

    max_idx = filtered_points_2d_df['frame'].max() + 1
    # scan forward to the first frame in which every target marker is detected
    for i in range(max_idx):
        if num_marker(i) == len(target_markers):
            start_frame = i
            break
    # scan backward to the last such frame
    for i in range(max_idx, 0, -1):
        if num_marker(i) == len(target_markers):
            end_frame = i
            break
    

    This is at line 812 of all_optimizations.py.

    opened by DenDen047 1
  • Add useful scripts

    • run.sh ... make and run the docker environment
    • count_frames.py ... display the number of frames
    • FTE.py ... run FTE with CLI

    You can run them like this:

    $ python FTE.py --data data/2019_03_03/menya/flick
    $ python count_frames.py --video /data/2019_03_03/menya/flick/fte/fte.avi
    
    opened by DenDen047 0
  • Update README.md

    readme updates

    nb_conda file missing from the yaml; also @DJoska @rickyjericevich, is deeplabcut really required? Also, why is there a "_windows" version of the yaml?

    opened by MMathisLab 0
  • DeepLabCut feature extractor

    Hi everyone.

    I was hoping someone might be able to help me with this: does the DeepLabCut feature extractor used (with ResNet152 backbone) estimate the cheetah's pose from an entire frame, or is the body first localised by calculating its bounding box, before performing feature extraction?

    Thanks in advance :)

    opened by julianzille 2
  • Coordinate System Problems

    Hello authors, thank you for your contributions. Here are some questions we need your help with.

    • First, is the 3D pose provided by AcinoSet in the world coordinate system (i.e., "fte.pickle" -> traj_data["positions"])? Is the measurement in meters?
    • Second, can the 3D pose in the world coordinate system be converted to the camera coordinate system through the "R, T" parameters in "n_cam_scene_sba.json"? (A generic pinhole sketch is given after these comments.)

    Thanks.

    opened by maicao2018 1
  • Keep all_optimizations.py and notebooks up to date with each other

    Initially, when I created all_optimizations.py, I made sure that the functions for each optimization therein held exactly the same code as their corresponding notebooks, e.g. the fte() function had exactly the same code as FTE.ipynb. This made it relatively easy to transfer any changes made in the notebooks to all_optimizations.py and vice versa.

    The person(s) who recently made changes to all_optimizations.py did not transfer their changes to the notebooks, so the code in all_optimizations.py has become substantially different from the code in the notebooks (and rather messy, imo). This makes it difficult and time-consuming to carry any new changes made in the notebooks over to all_optimizations.py and vice versa.

    As of right now, all_optimizations.py will probably produce different reconstruction results compared to the notebooks, and that's a problem. This must be fixed before we can merge the improvements from develop into main.

    I think the best (long-term) solution is to break up the code in each notebook into smaller, more manageable functions and place them in their own module. For example, there'll be one module for all the FTE code and one for EKF and so on. Perhaps the FTE module will have functions like plot_redescending_cost(), initialize_pyomo_model(), define_pyomo_constraints() or something similar.

    Thoughts?

    help wanted 
    opened by rickyjericevich 3
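
A note on the coordinate-system and re-projection questions above: below is a minimal pinhole-camera sketch, assuming the calibration stores a world-to-camera rotation R and translation t. Whether AcinoSet's "R, T" follow this convention is an assumption, not verified against the repo's file formats:

import cv2
import numpy as np

# Hypothetical single-camera calibration; in AcinoSet these would come
# from the scene calibration files (e.g. n_cam_scene_sba.json).
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])   # 3x3 intrinsics (assumed values)
dist = np.zeros(5)                # distortion coefficients (assumed zero)
R = np.eye(3)                     # world-to-camera rotation (assumed convention)
t = np.zeros(3)                   # world-to-camera translation (assumed convention)

X_world = np.array([[1.0, 2.0, 0.5]])  # one 3D point in world coordinates

# World -> camera coordinates under the assumed convention x_cam = R x_world + t.
X_cam = (R @ X_world.T).T + t

# Camera -> image: re-project through the intrinsics and distortion.
rvec, _ = cv2.Rodrigues(R)
uv, _ = cv2.projectPoints(X_world, rvec, t, K, dist)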