Official implementation of the Implicit Behavioral Cloning (IBC) algorithm

Implicit Behavioral Cloning

This codebase contains the official implementation of the Implicit Behavioral Cloning (IBC) algorithm from our paper:

Implicit Behavioral Cloning (website link) (arXiv link)
Pete Florence, Corey Lynch, Andy Zeng, Oscar Ramirez, Ayzaan Wahid, Laura Downs, Adrian Wong, Johnny Lee, Igor Mordatch, Jonathan Tompson
Conference on Robot Learning (CoRL) 2021

Abstract

We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit models. We present extensive experiments on this finding, and we provide both intuitive insight and theoretical arguments distinguishing the properties of implicit models compared to their explicit counterparts, particularly with respect to approximating complex, potentially discontinuous and multi-valued (set-valued) functions. On robotic policy learning tasks we show that implicit behavioral cloning policies with energy-based models (EBM) often outperform common explicit (Mean Square Error, or Mixture Density) behavioral cloning policies, including on tasks with high-dimensional action spaces and visual image inputs. We find these policies provide competitive results or outperform state-of-the-art offline reinforcement learning methods on the challenging human-expert tasks from the D4RL benchmark suite, despite using no reward information. In the real world, robots with implicit policies can learn complex and remarkably subtle behaviors on contact-rich tasks from human demonstrations, including tasks with high combinatorial complexity and tasks requiring 1mm precision.

Prerequisites

The code for this project uses Python 3.7+ and the following pip packages:

python3 -m pip install --upgrade pip
pip install \
  absl-py==0.12.0 \
  gin-config==0.4.0 \
  matplotlib==3.4.3 \
  mediapy==1.0.3 \
  opencv-python==4.5.3.56 \
  pybullet==3.1.6 \
  scipy==1.7.1 \
  tensorflow==2.6.0 \
  tensorflow-probability==0.13.0 \
  tf-agents-nightly==0.10.0.dev20210930 \
  tqdm==4.62.2
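
As a quick sanity check after installation, the sketch below compares installed versions against a few of the pins above. It assumes Python 3.8+ (for `importlib.metadata`; on 3.7 the `importlib_metadata` backport would be needed), and the helper names `installed_version` and `find_mismatches` are illustrative, not part of this codebase.

```python
from importlib import metadata

# A subset of the pins listed above; extend as needed.
PINNED = {
    "gin-config": "0.4.0",
    "tensorflow": "2.6.0",
    "tensorflow-probability": "0.13.0",
    "tf-agents-nightly": "0.10.0.dev20210930",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (wanted, found)."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package matches its pin. The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.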

(Optional): For MuJoCo support, see docs/mujoco_setup.md. We recommend skipping it unless you specifically want to run the Adroit and Kitchen environments.

Quickstart: from 0 to a trained IBC policy in 10 minutes.

Step 1: Install the Python packages listed above in Prerequisites.

Step 2: Run the unit tests (this should take less than a minute) from the directory just above the top-level ibc directory:

./ibc/run_tests.sh

Step 3: Check that TensorFlow has GPU access:

python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())"

If the above prints False, see the following requirements, notably CUDA 11.2 and cuDNN 8.1.0: https://www.tensorflow.org/install/gpu#software_requirements.

Step 4: Let's do an example Block Pushing task, so first let's download oracle data (or see Tasks for how to generate it):

cd ibc/data
wget https://storage.googleapis.com/brain-reach-public/ibc_data/block_push_states_location.zip
unzip block_push_states_location.zip && rm block_push_states_location.zip
cd ../..

Step 5: Set PYTHONPATH to include the directory just above the top-level ibc directory. If you've been following the commands above, that is:

export PYTHONPATH=$PYTHONPATH:${PWD}
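
To confirm the path is set correctly, one option is to ask Python whether the `ibc` package resolves. This is a minimal sketch; `is_importable` is an illustrative helper name. Running it from a directory other than the one containing `ibc/` avoids a false positive from the current working directory.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate the named top-level package on sys.path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Expected to print True once PYTHONPATH includes the directory just above ibc/.
    print(is_importable("ibc"))
```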

Step 6: On that example Block Pushing task, we'll next do a training + evaluation with Implicit BC:

./ibc/ibc/configs/pushing_states/run_mlp_ebm.sh

Some notes:

  • On an example single-GPU machine (RTX 2080 Ti), the above trains at about 18 steps/sec, and should get to high success rates in 5,000 or 10,000 steps (roughly 5-10 minutes of training).
  • The mlp_ebm.gin is just one config, which is meant to be reasonably fast to train, with only 20 evals at each interval, and is not suitable for all tasks. See Tasks for more configs.
  • Due to the --video flag above, you can watch a video of the learned policy in action under /tmp/ibc_logs/mlp_ebm/ibc_dfo/... Navigate to the videos/ttl=7d subfolder; by default, one example .mp4 video is saved at every evaluation interval.
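
To locate the most recent evaluation video without browsing by hand, something like the following could work. It assumes only what the note above states (an .mp4 somewhere beneath /tmp/ibc_logs); the elided part of the path is not guessed, so the search is recursive. `latest_video` is an illustrative helper name.

```python
from pathlib import Path


def latest_video(log_root):
    """Return the most recently modified .mp4 under log_root, or None if there is none."""
    videos = list(Path(log_root).rglob("*.mp4"))
    if not videos:
        return None
    return max(videos, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    print(latest_video("/tmp/ibc_logs"))
```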

(Optional) Step 7: For the pybullet-based tasks, we also have real-time interactive visualization set up through a visualization server, so in one terminal:

cd <path_to>/ibc/..
export PYTHONPATH=$PYTHONPATH:${PWD}
python3 -m pybullet_utils.runServer

And in a different terminal run the oracle a few times with the --shared_memory flag:

cd <path_to>/ibc/..
export PYTHONPATH=$PYTHONPATH:${PWD}
python3 ibc/data/policy_eval.py -- \
  --alsologtostderr \
  --shared_memory \
  --num_episodes=3 \
  --policy=oracle_push \
  --task=PUSH

You're done with Quickstart! See below for more Tasks, and also see docs/codebase_overview.md and docs/workflow.md for additional info.

Tasks

Task: Particle

In this task, the goal is for the agent (black dot) to first go to the green dot, then the blue dot.

Example IBC policy Example MSE policy

Get Data

We can either generate data from scratch, for example for 2D (takes 15 seconds):

./ibc/ibc/configs/particle/collect_data.sh

Or just download all the data for all different dimensions:

cd ibc/data/
wget https://storage.googleapis.com/brain-reach-public/ibc_data/particle.zip
unzip particle.zip && rm particle.zip
cd ../..

Train and Evaluate

Let's start with some small networks, on just the 2D version since it's easiest to visualize, and compare MSE and IBC. Here's a small-network (256x2) IBC-with-Langevin config, where 2 is the argument for the environment dimensionality.

./ibc/ibc/configs/particle/run_mlp_ebm_langevin.sh 2

And here's an identically sized network (256x2) but with an MSE config:

./ibc/ibc/configs/particle/run_mlp_mse.sh 2

For the above configurations, we suggest comparing the rollout videos, which you can find at /tmp/ibc_logs/...corresponding_directory../videos/. The comparison shown at the top of this section was taken at 10,000 training steps for the two configs above.

And here are the best configs, respectively, for IBC (with Langevin) and MSE, in this case run on the 16-dimensional environment:

./ibc/ibc/configs/particle/run_mlp_ebm_langevin_best.sh 16
./ibc/ibc/configs/particle/run_mlp_mse_best.sh 16

Note: the _best config is kind of slow for Langevin to train, but even just ./ibc/ibc/configs/particle/run_mlp_ebm_langevin.sh 16 (smaller network) seems to solve the 16-D environment pretty well, and is much faster to train.

Task: Block Pushing (from state observations)

Get Data

We can either generate data from scratch (~2 minutes for 2,000 episodes: 200 each across 10 replicas):

./ibc/ibc/configs/pushing_states/collect_data.sh

Or we can download data from the web:

cd ibc/data/
wget https://storage.googleapis.com/brain-reach-public/ibc_data/block_push_states_location.zip
unzip 'block_push_states_location.zip' && rm block_push_states_location.zip
cd ../..

Train and Evaluate

Here's a reasonably fast-to-train config for IBC with DFO:

./ibc/ibc/configs/pushing_states/run_mlp_ebm.sh

Or here's a config for IBC with Langevin:

./ibc/ibc/configs/pushing_states/run_mlp_ebm_langevin.sh

Or here's a comparable, reasonably fast-to-train config for MSE:

./ibc/ibc/configs/pushing_states/run_mlp_mse.sh

Or to run the best configs, respectively, for IBC, MSE, and MDN (some of these might be slower to train than the above):

./ibc/ibc/configs/pushing_states/run_mlp_ebm_best.sh
./ibc/ibc/configs/pushing_states/run_mlp_mse_best.sh
./ibc/ibc/configs/pushing_states/run_mlp_mdn_best.sh

Task: Block Pushing (from image observations)

Get Data

Download data from the web:

cd ibc/data/
wget https://storage.googleapis.com/brain-reach-public/ibc_data/block_push_visual_location.zip
unzip 'block_push_visual_location.zip' && rm block_push_visual_location.zip
cd ../..

Train and Evaluate

Here is an IBC with Langevin configuration which should actually converge faster than the IBC-with-DFO that we reported in the paper:

./ibc/ibc/configs/pushing_pixels/run_pixel_ebm_langevin.sh

And here are the best configs, respectively, for IBC (with DFO), MSE, and MDN:

./ibc/ibc/configs/pushing_pixels/run_pixel_ebm_best.sh
./ibc/ibc/configs/pushing_pixels/run_pixel_mse_best.sh
./ibc/ibc/configs/pushing_pixels/run_pixel_mdn_best.sh

Task: D4RL Adroit and Kitchen

Get Data

The D4RL human demonstration training data used for the paper submission can be downloaded using the commands below. This data has been processed into a .tfrecord format from the original D4RL data format:

cd ibc/data && mkdir -p d4rl_trajectories && cd d4rl_trajectories
wget https://storage.googleapis.com/brain-reach-public/ibc_data/door-human-v0.zip \
     https://storage.googleapis.com/brain-reach-public/ibc_data/hammer-human-v0.zip \
     https://storage.googleapis.com/brain-reach-public/ibc_data/kitchen-complete-v0.zip \
     https://storage.googleapis.com/brain-reach-public/ibc_data/kitchen-mixed-v0.zip \
     https://storage.googleapis.com/brain-reach-public/ibc_data/kitchen-partial-v0.zip \
     https://storage.googleapis.com/brain-reach-public/ibc_data/pen-human-v0.zip \
     https://storage.googleapis.com/brain-reach-public/ibc_data/relocate-human-v0.zip
unzip '*.zip' && rm *.zip
cd ../../..

Train and Evaluate

Here are the best configs, respectively, for IBC (with Langevin) and MSE. In a test on a 2080 Ti GPU, this IBC config trains at only 1.7 steps/sec, but it is about 10x faster on TPUv3.

./ibc/ibc/configs/d4rl/run_mlp_ebm_langevin_best.sh pen-human-v0
./ibc/ibc/configs/d4rl/run_mlp_mse_best.sh pen-human-v0

The above commands will run on the pen-human-v0 environment, but you can swap this arg for any of the provided Adroit/Kitchen environments.
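
To sweep over several environments, the environment argument can be generated rather than typed by hand. The sketch below builds one command per environment; the environment names are taken from the download URLs above, and it is an assumption that the script accepts each of them as its argument. `build_commands` is an illustrative helper, and nothing is executed here.

```python
# Environment names taken from the data download URLs above.
D4RL_ENVS = [
    "door-human-v0",
    "hammer-human-v0",
    "kitchen-complete-v0",
    "kitchen-mixed-v0",
    "kitchen-partial-v0",
    "pen-human-v0",
    "relocate-human-v0",
]

SCRIPT = "./ibc/ibc/configs/d4rl/run_mlp_ebm_langevin_best.sh"


def build_commands(script, envs):
    """Return one argument list per environment, suitable for subprocess.run."""
    return [[script, env] for env in envs]


if __name__ == "__main__":
    for command in build_commands(SCRIPT, D4RL_ENVS):
        print(" ".join(command))
```

Each returned list could be passed to `subprocess.run(command, check=True)` from the directory just above the top-level ibc directory.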

Here is also an MDN config you can try. The network is tiny; if you increase its size substantially, training seems to produce NaNs. In general, MDNs can be finicky, though a solution should be possible.

./ibc/ibc/configs/d4rl/run_mlp_mdn.sh pen-human-v0

Summary for Reproducing Results

For the tasks that we've been able to open-source, results from the paper should be reproducible by using the linked data and command-line args below.

| Task | Figure/Table in paper | Data | Train + Eval commands |
| --- | --- | --- | --- |
| Coordinate regression | Figure 4 | See colab | See colab |
| D4RL Adroit + Kitchen | Table 2 | Link | Link |
| N-D particle | Figure 6 | Link | Link |
| Simulated pushing, single target, states | Table 3 | Link | Link |
| Simulated pushing, single target, pixels | Table 3 | Link | Link |

Citation

If you found our paper/code useful in your research, please consider citing:

@article{florence2021implicit,
    title={Implicit Behavioral Cloning},
    author={Florence, Pete and Lynch, Corey and Zeng, Andy and Ramirez, Oscar and Wahid, Ayzaan and Downs, Laura and Wong, Adrian and Lee, Johnny and Mordatch, Igor and Tompson, Jonathan},
    journal={Conference on Robot Learning (CoRL)},
    month = {November},
    year={2021}
}
Comments
  • Cannot register 2 metrics with the same name: /tensorflow/api/keras/optimizers

    Just tried running tests on Ubuntu 20.04, CPython 3.8.10, but get the following error:

    $ cd ibc/..
    $ ./ibc/run_tests.sh
    ...
    PYTHONPATH=:{parent}/ibc/.. python3 {parent}/ibc/ibc/agents/mcmc_test.py --alsologtostderr
    ...
    2021-11-08 16:03:33.113843: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
    2021-11-08 16:03:33.113875: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    2021-11-08 16:03:34.161325: E tensorflow/core/lib/monitoring/collection_registry.cc:77] Cannot register 2 metrics with the same name: /tensorflow/api/keras/optimizers
    ...
    tensorflow.python.framework.errors_impl.AlreadyExistsError: Another metric with the same name already exists.
    ERROR: 'PYTHONPATH=:{parent}/ibc/.. python3 {parent}/ibc/ibc/agents/mcmc_test.py --alsologtostderr' failed!
    

    Full stack trace: https://gist.github.com/EricCousineau-TRI/ac6e9943606e6b9f7e335882f7caa350

    ~Not sure if it's b/c of CUDA error.~ ~I have stock Ubuntu CUDA 10.1 on my machine, so will try out NVidia-installed CUDA 11.0.~ See below.

    Also happens when trying to run training script, ./ibc/ibc/configs/pushing_states/run_mlp_ebm.sh

    opened by EricCousineau-TRI 10
  • step after collecting data

    In collect_oracle.py, time_step = env.step(action) is called before the observation is recorded with episode_data.time_step.append(time_step). I apologize if I have misunderstood the implementation, but as far as I can see, env.step(action) returns the next TimeStep. As a result, the action at the current time and the observation at the next time are stored as a pair under the same index.

    Wouldn't it be correct to record the state of the system at the time the action is decided?
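
The ordering question raised here can be illustrated with a toy example. The sketch below does not reproduce collect_oracle.py; `ToyEnv` and both recording functions are hypothetical, and the environment's observation is just a step counter so the index shift is easy to see.

```python
class ToyEnv:
    """Hypothetical stand-in environment: the observation is a step counter."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += 1
        return self.state  # observation *after* the action is applied


def record_after_step(env, actions):
    """Pair each action with the observation returned by step() (the next state)."""
    env.reset()
    return [(env.step(action), action) for action in actions]


def record_before_step(env, actions):
    """Pair each action with the observation available when it was chosen."""
    observation = env.reset()
    pairs = []
    for action in actions:
        pairs.append((observation, action))
        observation = env.step(action)
    return pairs
```

With actions `["a", "b"]`, `record_after_step` yields `[(1, "a"), (2, "b")]`, while `record_before_step` yields `[(0, "a"), (1, "b")]`: the first variant shifts every observation one step into the future relative to its action.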

    opened by syundo0730 3
  • Goal Tolerance Are Different for Different Methods

    Hi, I found that

    train_eval.goal_tolerance = 0.02
    

    is set in EBM's config but not in MSE's config.

    This difference makes the evaluation stricter for MSE-based BC, since the default is goal_tolerance=0.01 (code).

    Setting train_eval.goal_tolerance = 0.01 for the EBM agent decreases its success rate from 1.0 to [0.85, 0.95] after training for 10k steps.
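
A small sketch of why the tolerance matters when comparing methods. It assumes success means the final distance to the goal is at most goal_tolerance, which may differ from the exact criterion in this codebase, and the distances are made-up values chosen only to show the effect, not measured results.

```python
def success_rate(final_distances, goal_tolerance):
    """Fraction of episodes whose final distance to the goal is within tolerance."""
    successes = sum(1 for d in final_distances if d <= goal_tolerance)
    return successes / len(final_distances)


# Made-up final distances (not measured results), for illustration only.
distances = [0.004, 0.008, 0.012, 0.015, 0.019, 0.025, 0.009, 0.011]

loose = success_rate(distances, goal_tolerance=0.02)   # 7 of 8 episodes succeed
strict = success_rate(distances, goal_tolerance=0.01)  # 3 of 8 episodes succeed
```

The same set of rollouts scores 0.875 under the looser tolerance and 0.375 under the stricter one, so two methods are only comparable when evaluated with the same value.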

    opened by yenchenlin 3
  • README: Update pypi package versions for keras and tf-agents

    • Avoids collision between keras and tensorflow versions
    • Avoids collision between stable and nightly for tensorflow-probability

    Resolves #1 (I think)

    After using this, I see the following pip freeze output: https://gist.github.com/EricCousineau-TRI/ac6e9943606e6b9f7e335882f7caa350#file-new-pip-freeze-txt

    @peteflorence This allowed me to run ./run_tests.sh with all of 'em passing!

    opened by EricCousineau-TRI 0
  • Whether image input is provided in this codebase?

    Hi @peteflorence ,

    I'm trying to reproduce the IBC project. Thanks for open-sourcing this work! I was wondering if this codebase provides an interface for image input that can be used in the real world. My sincerest thanks in advance!

    Sincerely, Vinson

    opened by Vinson-Tang 2
  • Support for Categorical Action Space

    I am currently working on implementing implicit BC for a task whose action space includes both keyboard and mouse inputs. Is there a straightforward way to make this action space suitable for the implicit regression task?

    opened by rokosbasilisk 0
  • Maybe the version of gym needs to be written to the readme

    Hi @peteflorence, in newer versions of Gym, 'done' has been removed from the values returned by step, and 'terminated' and 'truncated' have been added. As a result, the unit tests fail when run with a newer Gym. I think the Gym version used for this project should be stated in the README. Thanks, Vinson
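
For readers who cannot pin an older Gym, one possible workaround is a small adapter that accepts either return convention. This is a sketch only: `normalize_step_result` is a hypothetical helper, not part of this codebase, Gym, or TF-Agents, and it would still need to be wired in wherever step() results are consumed.

```python
def normalize_step_result(result):
    """Convert either Gym step() return convention to (obs, reward, done, info).

    Older Gym returns (obs, reward, done, info); newer Gym returns
    (obs, reward, terminated, truncated, info).
    """
    if len(result) == 5:
        obs, reward, terminated, truncated, info = result
        # An episode is over if it terminated or was truncated (e.g. by a time limit).
        return obs, reward, terminated or truncated, info
    if len(result) == 4:
        return tuple(result)
    raise ValueError(f"Unexpected step() result of length {len(result)}")
```

Collapsing `terminated` and `truncated` into a single `done` flag discards the distinction between the two, which is acceptable only for code written against the older four-value convention.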

    opened by Vinson-Tang 1
  • Unit tests fail

    Hi, it seems that the unit tests do not work out of the box. I'm working in a clean conda environment with Python 3.7.13. All of the prerequisites are installed with the versions described in the readme, as well as CUDA and cuDNN (Tensorflow has GPU access).

    Here's the complete output from the test script:

    Test script output

    ibc-test ❯ ./ibc/run_tests.sh
    bash: /home/arc/miniconda3/envs/ibc-test/lib/libtinfo.so.6: no version information available (required by bash)
    Running run_tests.sh in directory /home/arc/noah
    Running tests:
    /home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py
    /home/arc/noah/ibc/environments/block_pushing/block_pushing_test.py
    /home/arc/noah/ibc/environments/utils/utils_pybullet_test.py
    /home/arc/noah/ibc/environments/utils/xarm_sim_robot_test.py
    /home/arc/noah/ibc/environments/particle/particle_test.py
    /home/arc/noah/ibc/ibc/agents/mcmc_test.py
    /home/arc/noah/ibc/ibc/train/stats_test.py
    /home/arc/noah/ibc/data/dataset_test.py
    ***********************************************************************
    Running test /home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py
    ***********************************************************************
    PYTHONPATH=:/home/arc/noah:/home/arc/noah/ibc/.. python3 /home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py --alsologtostderr
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py:22: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
      import imp
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/keras_preprocessing/image/utils.py:23: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
      'nearest': pil_image.NEAREST,
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/keras_preprocessing/image/utils.py:24: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
      'bilinear': pil_image.BILINEAR,
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/keras_preprocessing/image/utils.py:25: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
      'bicubic': pil_image.BICUBIC,
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/keras_preprocessing/image/utils.py:28: DeprecationWarning: HAMMING is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.HAMMING instead.
      if hasattr(pil_image, 'HAMMING'):
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/keras_preprocessing/image/utils.py:30: DeprecationWarning: BOX is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BOX instead.
      if hasattr(pil_image, 'BOX'):
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/keras_preprocessing/image/utils.py:33: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
      if hasattr(pil_image, 'LANCZOS'):
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/__init__.py:56: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
      if (distutils.version.LooseVersion(tf_version) <
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tensorflow_probability/python/__init__.py:61: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
      if (distutils.version.LooseVersion(tf.__version__) <
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/utils/common.py:87: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
      and (distutils.version.LooseVersion(tf.__version__) <=
    pybullet build time: Jun 28 2022 14:19:23
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/envs/registration.py:416: UserWarning: WARN: The `registry.env_specs` property along with `EnvSpecTree` is deprecated. Please use `registry` directly as a dictionary instead.
      "The `registry.env_specs` property along with `EnvSpecTree` is deprecated. Please use `registry` directly as a dictionary instead."
    2022-06-29 13:44:14.864729: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2022-06-29 13:44:14.868489: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2022-06-29 13:44:14.868817: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    Running tests under Python 3.7.13: /home/arc/miniconda3/envs/ibc-test/bin/python3
    [ RUN ] Blocks2DTest.test_load_push_env
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/spaces/box.py:112: UserWarning: WARN: Box bound precision lowered by casting to float32
      logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
    argv[0]=
    I0629 13:44:14.880214 140609892135296 utils_pybullet.py:85] Loading URDF plane.urdf
    I0629 13:44:14.885888 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/workspace.urdf
    I0629 13:44:14.886190 140609892135296 utils_pybullet.py:85] Loading URDF xarm/xarm6_robot.urdf
    I0629 13:44:14.908028 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/suction/suction-head-long.urdf
    I0629 13:44:14.911714 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/zone.urdf
    I0629 13:44:14.912028 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/zone2.urdf
    I0629 13:44:14.912325 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/block.urdf
    I0629 13:44:14.912554 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/block2.urdf
    INFO:tensorflow:time(__main__.Blocks2DTest.test_load_push_env): 0.09s
    I0629 13:44:14.959522 140609892135296 test_util.py:2189] time(__main__.Blocks2DTest.test_load_push_env): 0.09s
    [ OK ] Blocks2DTest.test_load_push_env
    [ RUN ] Blocks2DTest.test_serialize_state_push
    argv[0]=
    I0629 13:44:14.963971 140609892135296 utils_pybullet.py:85] Loading URDF plane.urdf
    I0629 13:44:14.969090 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/workspace.urdf
    I0629 13:44:14.969393 140609892135296 utils_pybullet.py:85] Loading URDF xarm/xarm6_robot.urdf
    I0629 13:44:14.987162 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/suction/suction-head-long.urdf
    I0629 13:44:14.991161 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/zone.urdf
    I0629 13:44:14.991568 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/zone2.urdf
    I0629 13:44:14.991911 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/block.urdf
    I0629 13:44:14.992150 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/block2.urdf
    INFO:tensorflow:time(__main__.Blocks2DTest.test_serialize_state_push): 0.13s
    I0629 13:44:15.085130 140609892135296 test_util.py:2189] time(__main__.Blocks2DTest.test_serialize_state_push): 0.13s
    [ OK ] Blocks2DTest.test_serialize_state_push
    [ RUN ] Blocks2DTest.test_session
    [ SKIPPED ] Blocks2DTest.test_session
    [ RUN ] Blocks2DTest.test_validate_environment
    argv[0]=
    I0629 13:44:15.090482 140609892135296 utils_pybullet.py:85] Loading URDF plane.urdf
    I0629 13:44:15.095655 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/workspace.urdf
    I0629 13:44:15.095959 140609892135296 utils_pybullet.py:85] Loading URDF xarm/xarm6_robot.urdf
    I0629 13:44:15.113764 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/suction/suction-head-long.urdf
    I0629 13:44:15.117837 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/zone.urdf
    I0629 13:44:15.118308 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/zone2.urdf
    I0629 13:44:15.118738 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/block.urdf
    I0629 13:44:15.119005 140609892135296 utils_pybullet.py:85] Loading URDF ibc/environments/assets/block2.urdf
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py:98: UserWarning: WARN: We recommend you to use a symmetric and normalized Box action space (range=[-1, 1]) https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html
      "We recommend you to use a symmetric and normalized Box action space (range=[-1, 1]) "
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py:217: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
      "Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator. "
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py:229: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `return_info` to return information from the environment resetting.
      "Future gym versions will require that `Env.reset` can be passed `return_info` to return information from the environment resetting."
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py:234: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
      "Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information."
    /home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/spaces/box.py:197: UserWarning: WARN: Casting input x to numpy array.
      logger.warn("Casting input x to numpy array.")
    INFO:tensorflow:time(__main__.Blocks2DTest.test_validate_environment): 0.09s
    I0629 13:44:15.179256 140609892135296 test_util.py:2189] time(__main__.Blocks2DTest.test_validate_environment): 0.09s
    [ FAILED ] Blocks2DTest.test_validate_environment
    ======================================================================
    FAIL: test_validate_environment (__main__.Blocks2DTest)
    Blocks2DTest.test_validate_environment
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py", line 34, in test_validate_environment
        utils.validate_py_environment(env)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/utils.py", line 75, in validate_py_environment
        time_step = environment.reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 196, in reset
        self._current_time_step = self._reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/wrappers.py", line 111, in _reset
        return self._env.reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 196, in reset
        self._current_time_step = self._reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/gym_wrapper.py", line 193, in _reset
        observation = self._gym_env.reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/time_limit.py", line 66, in reset
        return self.env.reset(**kwargs)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 42, in reset
        return self.env.reset(**kwargs)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/env_checker.py", line 47, in reset
        return passive_env_reset_check(self.env, **kwargs)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py", line 247, in passive_env_reset_check
        _check_obs(obs, env.observation_space, "reset")
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py", line 115, in _check_obs
        ), f"{pre} is not contained with the observation space ({observation_space})"
    AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(block_translation: Box(-5.0, 5.0, (2,), float32), block_orientation: Box(-6.2831855, 6.2831855, (1,), float32), block2_translation: Box(-5.0, 5.0, (2,), float32), block2_orientation: Box(-6.2831855, 6.2831855, (1,), float32), effector_translation: Box([ 0.05 -0.6 ], [0.8 0.6], (2,), float32), effector_target_translation: Box([ 0.05 -0.6 ], [0.8 0.6], (2,), float32), target_translation: Box(-5.0, 5.0, (2,), float32), target_orientation: Box(-6.2831855, 6.2831855, (1,), float32), target2_translation: Box(-5.0, 5.0, (2,), float32), target2_orientation: Box(-6.2831855, 6.2831855, (1,), float32)))

    ----------------------------------------------------------------------
    Ran 4 tests in 0.310s

    FAILED (failures=1, skipped=1)
    ERROR: 'PYTHONPATH=:/home/arc/noah:/home/arc/noah/ibc/.. python3 /home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py --alsologtostderr' failed!

    Here's the last part of that formatted a bit more nicely:

    INFO:tensorflow:time(__main__.Blocks2DTest.test_validate_environment): 0.09s
    I0629 13:44:15.179256 140609892135296 test_util.py:2189] time(__main__.Blocks2DTest.test_validate_environment): 0.09s
    [  FAILED  ] Blocks2DTest.test_validate_environment
    ======================================================================
    FAIL: test_validate_environment (__main__.Blocks2DTest)
    Blocks2DTest.test_validate_environment
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py", line 34, in test_validate_environment
        utils.validate_py_environment(env)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/utils.py", line 75, in validate_py_environment
        time_step = environment.reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 196, in reset
        self._current_time_step = self._reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/wrappers.py", line 111, in _reset
        return self._env.reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 196, in reset
        self._current_time_step = self._reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/gym_wrapper.py", line 193, in _reset
        observation = self._gym_env.reset()
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/time_limit.py", line 66, in reset
        return self.env.reset(**kwargs)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 42, in reset
        return self.env.reset(**kwargs)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/env_checker.py", line 47, in reset
        return passive_env_reset_check(self.env, **kwargs)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py", line 247, in passive_env_reset_check
        _check_obs(obs, env.observation_space, "reset")
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py", line 115, in _check_obs
        ), f"{pre} is not contained with the observation space ({observation_space})"
    AssertionError: The observation returned by the `reset()` method is not contained with the observation space (Dict(block_translation: Box(-5.0, 5.0, (2,), float32), block_orientation: Box(-6.2831855, 6.2831855, (1,), float32), block2_translation: Box(-5.0, 5.0, (2,), float32), block2_orientation: Box(-6.2831855, 6.2831855, (1,), float32), effector_translation: Box([ 0.05 -0.6 ], [0.8 0.6], (2,), float32), effector_target_translation: Box([ 0.05 -0.6 ], [0.8 0.6], (2,), float32), target_translation: Box(-5.0, 5.0, (2,), float32), target_orientation: Box(-6.2831855, 6.2831855, (1,), float32), target2_translation: Box(-5.0, 5.0, (2,), float32), target2_orientation: Box(-6.2831855, 6.2831855, (1,), float32)))
    
    ----------------------------------------------------------------------
    

    The issue seems to be that several fields of the observation returned by BlockPushMultimodal._compute_state() need to be converted to np arrays with dtype np.float32. After doing that and running the test again, I get the following error instead:

    INFO:tensorflow:time(__main__.Blocks2DTest.test_validate_environment): 0.1s
    I0629 14:04:36.733099 140095945187712 test_util.py:2189] time(__main__.Blocks2DTest.test_validate_environment): 0.1s
    [  FAILED  ] Blocks2DTest.test_validate_environment
    ======================================================================
    ERROR: test_validate_environment (__main__.Blocks2DTest)
    Blocks2DTest.test_validate_environment
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/arc/noah/ibc/environments/block_pushing/block_pushing_multimodal_test.py", line 34, in test_validate_environment
        utils.validate_py_environment(env)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/utils.py", line 84, in validate_py_environment
        time_step = environment.step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 233, in step
        self._current_time_step = self._step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/wrappers.py", line 117, in _step
        time_step = self._env.step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 233, in step
        self._current_time_step = self._step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/tf_agents/environments/gym_wrapper.py", line 215, in _step
        observation, reward, self._done, self._info = self._gym_env.step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/time_limit.py", line 49, in step
        observation, reward, done, info = self.env.step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 37, in step
        return self.env.step(action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/wrappers/env_checker.py", line 39, in step
        return passive_env_step_check(self.env, action)
      File "/home/arc/miniconda3/envs/ibc-test/lib/python3.7/site-packages/gym/utils/passive_env_checker.py", line 273, in passive_env_step_check
        if np.any(np.isnan(obs)):
    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
    
    ----------------------------------------------------------------------
    

    This is the same error as in #14 so I'd guess these issues are related.

    Any thoughts?
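
The dtype conversion described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual fix: `cast_observation_to_float32` is a hypothetical helper, the sample observation values are made up, and wrapping scalars with `np.atleast_1d` assumes the orientation fields are meant to match the `(1,)`-shaped `Box` spaces shown in the assertion message.

```python
import numpy as np


def cast_observation_to_float32(obs):
    """Return a copy of `obs` with every field as a float32 NumPy array.

    Hypothetical helper for illustration; it is not part of the IBC codebase.
    """
    return {
        # np.atleast_1d turns scalar fields into shape-(1,) arrays so they
        # can match Box spaces declared with shape (1,).
        key: np.atleast_1d(np.asarray(value, dtype=np.float32))
        for key, value in obs.items()
    }


# Made-up observation mixing Python lists, floats, and float64 arrays.
raw_obs = {
    "block_translation": [0.1, -0.2],
    "block_orientation": 0.5,
    "effector_translation": np.array([0.3, 0.0]),  # float64 by default
}
obs = cast_observation_to_float32(raw_obs)
```

A helper like this would be applied to the dict just before it is returned from the environment, so that both `reset()` and `step()` hand back values whose dtype and shape agree with the declared observation space.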

    opened by noahcgreen 2
  • Error running particle experiments

    Hi, thanks for open sourcing this work! I tried running:

    ./ibc/ibc/configs/particle/run_mlp_ebm_langevin_best.sh 2
    

    and got this error:

      File "ibc/ibc/train_eval.py", line 397, in main
        strategy=strategy)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/gin/config.py", line 1069, in gin_wrapper
        utils.augment_exception_message_and_reraise(e, err_str)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
        raise proxy.with_traceback(exception.__traceback__) from None
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/gin/config.py", line 1046, in gin_wrapper
        return fn(*new_args, **new_kwargs)
      File "ibc/ibc/train_eval.py", line 279, in train_eval
        name_scope_suffix=f'_{env_name}')
      File "ibc/ibc/train_eval.py", line 353, in evaluation_step
        eval_actor.run()
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/tf_agents/train/actor.py", line 149, in run
        self._time_step, self._policy_state)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/tf_agents/drivers/py_driver.py", line 112, in run
        next_time_step = self.env.step(action_step.action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 233, in step
        self._current_time_step = self._step(action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/tf_agents/environments/wrappers.py", line 1015, in _step
        time_step = self._env.step(action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/tf_agents/environments/py_environment.py", line 233, in step
        self._current_time_step = self._step(action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/tf_agents/environments/gym_wrapper.py", line 215, in _step
        observation, reward, self._done, self._info = self._gym_env.step(action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 37, in step
        return self.env.step(action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/gym/wrappers/env_checker.py", line 39, in step
        return passive_env_step_check(self.env, action)
      File "/iris/u/ayz/anaconda3/envs/ibc/lib/python3.7/site-packages/gym/utils/passive_env_checker.py", line 273, in passive_env_step_check
        if np.any(np.isnan(obs)):
    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
    

    I'm not super familiar with tf-agents, but from some debugging it looks like obs is a dictionary and np.isnan cannot handle it. Any thoughts on how one could fix this?

    Thanks, Allan
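
The failure mode described above can be reproduced in isolation, which may help when debugging. The sketch below only illustrates why `np.isnan` rejects a dict and how a per-field check could look; `contains_nan` and the field names are hypothetical, and this is not a patch to gym's checker or to the IBC environments.

```python
import numpy as np


def contains_nan(obs):
    """Recursively check a (possibly nested) dict observation for NaNs.

    Hypothetical helper for illustration only.
    """
    if isinstance(obs, dict):
        return any(contains_nan(value) for value in obs.values())
    return bool(np.any(np.isnan(np.asarray(obs, dtype=np.float64))))


# Made-up dict observation; the field names are placeholders.
obs = {
    "pos_agent": np.zeros(2, dtype=np.float32),
    "vel_agent": np.array([0.0, np.nan]),
}

try:
    # NumPy coerces the dict to an object array, which isnan does not support.
    np.isnan(obs)
    raised = False
except TypeError:
    raised = True

has_nan = contains_nan(obs)
```

Calling `np.isnan` on each array field separately avoids the `TypeError`, because every individual field is a numeric array rather than a Python object.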

    opened by AllanYangZhou 2
  • other tasks in D4RL

    Hi, thanks for providing the implementation of your work! I want to validate IBC on the locomotion tasks in D4RL, such as hopper and halfcheetah, but it seems the relevant datasets have not been provided. Are there any scripts for converting the D4RL datasets to tfrecords, or links for downloading the datasets directly, as with the Adroit tasks? :) Thanks

    opened by pcheng2 0
Owner
Google Research
Proximal Backpropagation - a neural network training algorithm that takes implicit instead of explicit gradient steps

Proximal Backpropagation Proximal Backpropagation (ProxProp) is a neural network training algorithm that takes implicit instead of explicit gradient s

Thomas Frerix 40 Dec 17, 2022
Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yu

UT-Austin Robot Perception and Learning Lab 63 Jan 3, 2023
Digan - Official PyTorch implementation of Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks

DIGAN (ICLR 2022) Official PyTorch implementation of "Generating Videos with Dyn

Sihyun Yu 147 Dec 31, 2022
Official PyTorch code of Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR 2021)

Implicit3DUnderstanding (Im3D) [Project Page] Holistic 3D Scene Understanding from a Single Image with Implicit Representation Cheng Zhang, Zhaopeng C

Cheng Zhang 149 Jan 8, 2023
Official code release for ICCV 2021 paper SNARF: Differentiable Forward Skinning for Animating Non-rigid Neural Implicit Shapes.

Official code release for ICCV 2021 paper SNARF: Differentiable Forward Skinning for Animating Non-rigid Neural Implicit Shapes.

null 235 Dec 26, 2022
Implementation of "Deep Implicit Templates for 3D Shape Representation"

Deep Implicit Templates for 3D Shape Representation Zerong Zheng, Tao Yu, Qionghai Dai, Yebin Liu. arXiv 2020. This repository is an implementation fo

Zerong Zheng 144 Dec 7, 2022
Pytorch implementation of COIN, a framework for compression with implicit neural representations 🌸

COIN 🌸 This repo contains a Pytorch implementation of COIN: COmpression with Implicit Neural representations, including code to reproduce all experim

Emilien Dupont 104 Dec 14, 2022
Pytorch implementation for "Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter".

Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter This is a pytorch-based implementation for paper Implicit Feature Alignme

wangtianwei 61 Nov 12, 2022
Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation

Implicit Internal Video Inpainting Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation paper | project

null 202 Dec 30, 2022
A PyTorch implementation of Implicit Q-Learning

IQL-PyTorch This repository houses a minimal PyTorch implementation of Implicit Q-Learning (IQL), an offline reinforcement learning algorithm, along w

Garrett Thomas 30 Dec 12, 2022
Unofficial Tensorflow 2 implementation of the paper Implicit Neural Representations with Periodic Activation Functions

Siren: Implicit Neural Representations with Periodic Activation Functions The unofficial Tensorflow 2 implementation of the paper Implicit Neural Repr

Seyma Yucer 2 Jun 27, 2022
RL algorithm PPO and IRL algorithm AIRL written with Tensorflow.

RL algorithm PPO and IRL algorithm AIRL written with Tensorflow. They have a parallel sampling feature in order to increase computation speed (especially in high-performance computing (HPC)).

Fangjian Li 3 Dec 28, 2021
Official PyTorch implementation for FastDPM, a fast sampling algorithm for diffusion probabilistic models

Official PyTorch implementation for "On Fast Sampling of Diffusion Probabilistic Models". FastDPM generation on CIFAR-10, CelebA, and LSUN datasets. S

Zhifeng Kong 68 Dec 26, 2022
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.

Softlearning Softlearning is a deep reinforcement learning toolbox for training maximum entropy policies in continuous domains. The implementation is

Robotic AI & Learning Lab Berkeley 997 Dec 30, 2022
The official implementation of the Hybrid Self-Attention NEAT algorithm

PUREPLES - Pure Python Library for ES-HyperNEAT About This is a library of evolutionary algorithms with a focus on neuroevolution, implemented in pure

Adrian Westh 91 Dec 12, 2022
DCA - Official Python implementation of Delaunay Component Analysis algorithm

Delaunay Component Analysis (DCA) Official Python implementation of the Delaunay

Petra Poklukar 9 Sep 6, 2022
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)

Deep Daze mist over green hills shattered plates on the grass cosmic love and attention a time traveler in the crowd life during the plague meditative

Phil Wang 4.4k Jan 3, 2023
Learning Continuous Image Representation with Local Implicit Image Function

LIIF This repository contains the official implementation for LIIF introduced in the following paper: Learning Continuous Image Representation with Lo

Yinbo Chen 1k Dec 25, 2022
Implicit Graph Neural Networks

Implicit Graph Neural Networks This repository is the official PyTorch implementation of "Implicit Graph Neural Networks". Fangda Gu*, Heng Chang*, We

Heng Chang 48 Nov 29, 2022