Vision-and-Language Navigation in Continuous Environments using Habitat

Jacob Krantz

Last update: Jan 2, 2023


Overview

Vision-and-Language Navigation in Continuous Environments (VLN-CE)

Project Website — VLN-CE Challenge — RxR-Habitat Challenge

Official implementations:

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments (paper)
Waypoint Models for Instruction-guided Navigation in Continuous Environments (paper, README)

Vision and Language Navigation in Continuous Environments (VLN-CE) is an instruction-guided navigation task with crowdsourced instructions, realistic environments, and unconstrained agent navigation. This repo is a launching point for interacting with the VLN-CE task and provides both baseline agents and training methods. Both the Room-to-Room (R2R) and the Room-Across-Room (RxR) datasets are supported. VLN-CE is implemented using the Habitat platform.

Setup

This project is developed with Python 3.6. If you are using miniconda or anaconda, you can create an environment:

conda create -n vlnce python=3.6
conda activate vlnce

VLN-CE uses Habitat-Sim 0.1.7 which can be built from source or installed from conda:

conda install -c aihabitat -c conda-forge habitat-sim=0.1.7 headless

Then install Habitat-Lab:

git clone --branch v0.1.7 git@github.com:facebookresearch/habitat-lab.git
cd habitat-lab
# installs both habitat and habitat_baselines
python -m pip install -r requirements.txt
python -m pip install -r habitat_baselines/rl/requirements.txt
python -m pip install -r habitat_baselines/rl/ddppo/requirements.txt
python setup.py develop --all

Now you can install VLN-CE:

git clone git@github.com:jacobkrantz/VLN-CE.git
cd VLN-CE
python -m pip install -r requirements.txt

Data

Scenes: Matterport3D

Matterport3D (MP3D) scene reconstructions are used. The official Matterport3D download script (download_mp.py) can be accessed by following the instructions on their project webpage. The scene data can then be downloaded:

# requires running with python 2.7
python download_mp.py --task habitat -o data/scene_datasets/mp3d/

Extract such that it has the form data/scene_datasets/mp3d/{scene}/{scene}.glb. There should be 90 scenes.
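
A quick check of the extracted layout can catch a misplaced archive before any training run. The sketch below is not part of the repository: the helper name find_mp3d_scenes is hypothetical, and it relies only on the {scene}/{scene}.glb layout and the 90-scene count stated above.

```python
from pathlib import Path


def find_mp3d_scenes(root):
    """Return sorted scene names that follow the {scene}/{scene}.glb layout."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        d.name
        for d in root.iterdir()
        if d.is_dir() and (d / f"{d.name}.glb").is_file()
    )


# Hypothetical usage: compare against the scene count stated in the text.
scenes = find_mp3d_scenes("data/scene_datasets/mp3d")
print(f"found {len(scenes)} scenes (expected 90)")
```

Directories that lack a matching .glb file are skipped, so a count below 90 points at an incomplete or misplaced extraction.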

Episodes: Room-to-Room (R2R)

The R2R_VLNCE dataset is a port of the Room-to-Room (R2R) dataset created by Anderson et al for use with the Matterport3DSimulator (MP3D-Sim). For details on the porting process from MP3D-Sim to the continuous reconstructions used in Habitat, please see our paper. We provide two versions of the dataset, R2R_VLNCE_v1-2 and R2R_VLNCE_v1-2_preprocessed. R2R_VLNCE_v1-2 contains the train, val_seen, val_unseen, and test splits. R2R_VLNCE_v1-2_preprocessed runs with our models out of the box. It additionally includes instruction tokens mapped to GloVe embeddings, ground truth trajectories, and a data augmentation split (envdrop) that is ported from R2R-EnvDrop. The test split does not contain episode goals or ground truth paths. For more details on the dataset contents and format, see our project page.

Dataset	Extract path	Size
R2R_VLNCE_v1-2.zip	`data/datasets/R2R_VLNCE_v1-2`	3 MB
R2R_VLNCE_v1-2_preprocessed.zip	`data/datasets/R2R_VLNCE_v1-2_preprocessed`	345 MB

Downloading the dataset:

# R2R_VLNCE_v1-2
gdown https://drive.google.com/uc?id=1YDNWsauKel0ht7cx15_d9QnM6rS4dKUV
# R2R_VLNCE_v1-2_preprocessed
gdown https://drive.google.com/uc?id=18sS9c2aRu2EAL4c7FyG29LDAm2pHzeqQ

Encoder Weights

Baseline models encode depth observations using a ResNet pre-trained on PointGoal navigation. Those weights can be downloaded from here (672M). Extract the contents to data/ddppo-models/{model}.pth.

Episodes: Room-Across-Room (RxR)

Download: RxR_VLNCE_v0.zip

The Room-Across-Room dataset was ported to continuous environments for the RxR-Habitat Challenge hosted at the CVPR 2021 Embodied AI Workshop. The dataset has train, val_seen, val_unseen, and test_challenge splits with both Guide and Follower trajectories ported. The starter code expects files in this structure:

data/datasets
├─ RxR_VLNCE_v0
|   ├─ train
|   |    ├─ train_guide.json.gz
|   |    ├─ train_guide_gt.json.gz
|   |    ├─ train_follower.json.gz
|   |    ├─ train_follower_gt.json.gz
|   ├─ val_seen
|   |    ├─ val_seen_guide.json.gz
|   |    ├─ val_seen_guide_gt.json.gz
|   |    ├─ val_seen_follower.json.gz
|   |    ├─ val_seen_follower_gt.json.gz
|   ├─ val_unseen
|   |    ├─ val_unseen_guide.json.gz
|   |    ├─ val_unseen_guide_gt.json.gz
|   |    ├─ val_unseen_follower.json.gz
|   |    ├─ val_unseen_follower_gt.json.gz
|   ├─ test_challenge
|   |    ├─ test_challenge_guide.json.gz
|   ├─ text_features
|   |    ├─ ...

The baseline models for RxR-Habitat use precomputed BERT instruction features which can be downloaded from here and saved to data/datasets/RxR_VLNCE_v0/text_features/rxr_{split}/{instruction_id}_{language}_text_features.npz.
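
The expected layout can be written down as a short script, which makes it easy to spot a missing file. This is a minimal sketch and not repository code: the function names are hypothetical, the file list simply mirrors the directory tree above, and the example language value passed to text_feature_path is an assumption about how language codes are spelled.

```python
from pathlib import Path

SPLITS_WITH_GT = ("train", "val_seen", "val_unseen")
ROLES = ("guide", "follower")


def expected_rxr_files(root="data/datasets/RxR_VLNCE_v0"):
    """List the episode files shown in the directory tree above."""
    root = Path(root)
    files = []
    for split in SPLITS_WITH_GT:
        for role in ROLES:
            files.append(root / split / f"{split}_{role}.json.gz")
            files.append(root / split / f"{split}_{role}_gt.json.gz")
    # test_challenge lists guide episodes only, with no ground-truth file.
    files.append(root / "test_challenge" / "test_challenge_guide.json.gz")
    return files


def text_feature_path(split, instruction_id, language,
                      root="data/datasets/RxR_VLNCE_v0"):
    """Fill in the text-feature path template quoted above."""
    name = f"{instruction_id}_{language}_text_features.npz"
    return Path(root) / "text_features" / f"rxr_{split}" / name


def missing_files(paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not Path(p).is_file()]


# Hypothetical usage: report anything absent from the default location.
for path in missing_files(expected_rxr_files()):
    print("missing:", path)
```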

RxR-Habitat Challenge (RxR Data)

The RxR-Habitat Challenge uses the new Room-Across-Room (RxR) dataset which:

contains multilingual instructions (English, Hindi, Telugu),
is an order of magnitude larger than existing datasets, and
uses varied paths to break a shortest-path-to-goal assumption.

The challenge was hosted at the CVPR 2021 Embodied AI Workshop. While the official challenge is over, the leaderboard remains open and we encourage submissions on this difficult task! For guidelines and access, please visit: ai.google.com/research/rxr/habitat.

Generating Submissions

Submissions are made by running an agent locally and submitting a jsonlines file (.jsonl) containing the agent's trajectories. Starter code for generating this file is provided in the function BaseVLNCETrainer.inference(). Here is an example of generating predictions for English using the Cross-Modal Attention baseline:

python run.py \
  --exp-config vlnce_baselines/config/rxr_baselines/rxr_cma_en.yaml \
  --run-type inference

If you use different models for different languages, you can merge their predictions with scripts/merge_inference_predictions.py. Only submissions that contain all episodes from all three languages in the test-challenge split are accepted. Starter code for this challenge was originally hosted in the rxr-habitat-challenge branch but is now under continual development in master.
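
To illustrate what merging per-language jsonlines files involves, here is a minimal sketch. It is not the repository's scripts/merge_inference_predictions.py; the function name and the assumption that each line is a JSON object identified by an instruction_id field are hypothetical. Consult the challenge guidelines for the required record format.

```python
import json


def merge_jsonl(input_paths, output_path, key="instruction_id"):
    """Concatenate jsonlines files, rejecting duplicate `key` values.

    Returns the number of records written.
    """
    seen = set()
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            with open(path, encoding="utf-8") as f:
                for line in f:
                    line = line.strip()
                    if not line:
                        continue  # tolerate blank lines
                    record = json.loads(line)
                    if record[key] in seen:
                        raise ValueError(f"duplicate {key}: {record[key]}")
                    seen.add(record[key])
                    out.write(json.dumps(record) + "\n")
                    count += 1
    return count
```

Rejecting duplicates guards against accidentally merging the same language twice, which would otherwise inflate the episode count without raising an error.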

VLN-CE Challenge (R2R Data)

The VLN-CE Challenge is live and taking submissions for public test set evaluation. This challenge uses the R2R data ported in the original VLN-CE paper.

To submit to the leaderboard, you must run your agent locally and submit a JSON file containing the generated agent trajectories. Starter code for generating this JSON file is provided in the function BaseVLNCETrainer.inference(). Here is an example of generating this file using the pretrained Cross-Modal Attention baseline:

python run.py \
  --exp-config vlnce_baselines/config/r2r_baselines/test_set_inference.yaml \
  --run-type inference

Predictions must be in a specific format. Please visit the challenge webpage for guidelines.

Baseline Performance

The baseline model for the VLN-CE task is the cross-modal attention model trained with progress monitoring, DAgger, and augmented data (CMA_PM_DA_Aug). As evaluated on the leaderboard, this model achieves:

Split	TL	NE	OS	SR	SPL
Test	8.85	7.91	0.36	0.28	0.25
Val Unseen	8.27	7.60	0.36	0.29	0.27
Val Seen	9.06	7.21	0.44	0.34	0.32

This model was originally presented with a val_unseen performance of 0.30 SPL; however, the leaderboard evaluates the same model at 0.27 SPL. The model was trained and evaluated on a combination of hardware and Habitat build that gave slightly different results, as is the case for the other paper experiments. Going forward, the leaderboard contains the performance metrics that should be used for official comparison. In our tests, the installation procedure for this repo gives evaluation results nearly identical to the leaderboard, but we recognize that compute hardware, along with the version and build of Habitat, affects reproducibility.
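
The column abbreviations are not expanded in the text. Assuming the usual VLN conventions, they stand for trajectory length (TL), navigation error (NE), oracle success (OS), success rate (SR), and success weighted by path length (SPL). The sketch below computes SPL following its standard definition, where each successful episode is weighted by the ratio of the shortest-path length to the longer of the agent's path and the shortest path. The function name is hypothetical, and the leaderboard's own implementation may differ in details.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length, averaged over episodes.

    successes:        1 (or True) if the episode succeeded, else 0
    shortest_lengths: shortest-path distance from start to goal
    path_lengths:     length of the path the agent actually took
    """
    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        if denom > 0:  # skip degenerate zero-length episodes
            total += success * shortest / denom
    return total / len(successes)


# Three episodes: an optimal success, a failure, and a success
# that took twice the shortest path.
print(spl([1, 0, 1], [10.0, 10.0, 10.0], [10.0, 5.0, 20.0]))  # 0.5
```

Because a detour can only lower an episode's weight, SPL never exceeds SR, which matches every row of the table.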

For push-button replication of all VLN-CE experiments, see here.

Starter Code

The run.py script controls training and evaluation for all models and datasets:

python run.py \
  --exp-config path/to/experiment_config.yaml \
  --run-type {train | eval | inference}

For example, a random agent can be evaluated on 10 val-seen episodes of R2R using this command:

python run.py --exp-config vlnce_baselines/config/r2r_baselines/nonlearning.yaml --run-type eval

For lists of modifiable configuration options, see the default task config and experiment config files.

Training Agents

The DaggerTrainer class is the standard trainer and supports teacher forcing or dataset aggregation (DAgger). This trainer saves trajectories consisting of RGB, depth, ground-truth actions, and instructions to disk to avoid time spent in simulation.

The RecollectTrainer class performs teacher forcing using the ground truth trajectories provided in the dataset rather than a shortest path expert. Also, this trainer does not save episodes to disk, instead opting to recollect them in simulation.

Both trainers inherit from BaseVLNCETrainer.
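
For readers unfamiliar with DAgger, the following toy sketch shows the core data-collection idea: roll out a mixture of the expert and the learner, but always label visited states with the expert's action. It is a generic illustration on a hypothetical 1-D world, not the DaggerTrainer implementation. All names are made up for the example, and unlike DaggerTrainer it keeps the collected data in memory rather than saving it to disk.

```python
import random


def expert(state, goal):
    """Toy oracle on a 1-D line: step toward the goal."""
    return 1 if goal > state else -1


def collect_dagger_episode(policy, expert_fn, start, goal, beta, rng, max_steps=20):
    """Roll out a beta-mixture of expert and learner.

    Every visited state is labelled with the expert's action, regardless
    of which action was actually executed.
    """
    state, data = start, []
    for _ in range(max_steps):
        if state == goal:
            break
        expert_action = expert_fn(state, goal)
        data.append((state, expert_action))
        # With probability beta follow the expert, otherwise the learner.
        action = expert_action if rng.random() < beta else policy(state)
        state += action
    return data


# beta = 1.0 always follows the expert, which amounts to teacher forcing.
rng = random.Random(0)
print(collect_dagger_episode(lambda s: -1, expert, 0, 3, 1.0, rng))
# [(0, 1), (1, 1), (2, 1)]
```

Lowering beta lets the learner visit states the expert would never reach while still collecting expert labels for them, which is what distinguishes dataset aggregation from pure teacher forcing.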

Evaluating Agents

Evaluation on validation splits can be done by running python run.py --exp-config path/to/experiment_config.yaml --run-type eval. If EVAL.EPISODE_COUNT == -1, all episodes will be evaluated. If EVAL_CKPT_PATH_DIR is a directory, each checkpoint will be evaluated one at a time.
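
The two rules above can be sketched as follows. This is an illustration only, with hypothetical function names. Evaluating checkpoints in sorted filename order, and capping the episode count at the split size, are assumptions rather than documented behavior.

```python
import os


def checkpoints_to_evaluate(ckpt_path):
    """A directory yields every file inside it; a file yields just itself."""
    if os.path.isdir(ckpt_path):
        return [
            os.path.join(ckpt_path, name)
            for name in sorted(os.listdir(ckpt_path))
            if os.path.isfile(os.path.join(ckpt_path, name))
        ]
    return [ckpt_path]


def episodes_to_evaluate(episode_count, num_available):
    """-1 means all episodes; otherwise cap at what the split provides."""
    if episode_count == -1:
        return num_available
    return min(episode_count, num_available)
```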

CUDA

CUDA will be used by default if it is available. We find that using one GPU for the model and several GPUs for simulation works well.

SIMULATOR_GPU_IDS: [0]  # list of GPU IDs to run simulations
TORCH_GPU_ID: 0  # GPU for pytorch-related code (the model)
NUM_ENVIRONMENTS: 1  # Each GPU runs NUM_ENVIRONMENTS environments

The simulator and torch code do not need to run on the same device. For faster training and evaluation, we recommend running with as many NUM_ENVIRONMENTS as will fit on your GPU while assuming 1 CPU core per env.
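
As a rough aid for choosing these values, the sketch below applies the rule of thumb of one CPU core per environment. The helper names are hypothetical and not part of the repository. GPU memory, which is the other limit mentioned above, is not modeled.

```python
import os


def total_environments(simulator_gpu_ids, num_environments):
    """Each simulator GPU runs NUM_ENVIRONMENTS environments."""
    return len(simulator_gpu_ids) * num_environments


def fits_cpu_budget(simulator_gpu_ids, num_environments, cpu_cores=None):
    """Check the one-CPU-core-per-environment rule of thumb."""
    if cpu_cores is None:
        cpu_cores = os.cpu_count() or 1
    return total_environments(simulator_gpu_ids, num_environments) <= cpu_cores


# Four simulator GPUs with four environments each need about 16 CPU cores.
print(total_environments([0, 1, 2, 3], 4))  # 16
```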

License

The VLN-CE codebase is MIT licensed. Trained models and task datasets are considered data derived from the mp3d scene dataset. Matterport3D based task datasets and trained models are distributed with Matterport3D Terms of Use and under CC BY-NC-SA 3.0 US license.

Citing

If you use VLN-CE in your research, please cite the following paper:

@inproceedings{krantz_vlnce_2020,
  title={Beyond the Nav-Graph: Vision and Language Navigation in Continuous Environments},
  author={Jacob Krantz and Erik Wijmans and Arjun Majumdar and Dhruv Batra and Stefan Lee},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2020}
}

If you use the RxR-Habitat data, please additionally cite the following paper:

@inproceedings{ku2020room,
  title={Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding},
  author={Ku, Alexander and Anderson, Peter and Patel, Roma and Ie, Eugene and Baldridge, Jason},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={4392--4412},
  year={2020}
}

Comments

ValueError: Type mismatch (<class 'habitat.config.default.Config'> vs. <class 'yacs.config.CfgNode'>) with values (DATASET:

Hello, I'm running python run.py --exp-config vlnce_baselines/config/nonlearning.yaml --run-type eval, and the following error occurred:

ValueError: Type mismatch (<class 'habitat.config.default.Config'> vs. <class 'yacs.config.CfgNode'>) with values (DATASET:

I don't understand why this error is raised. I look forward to your reply.

opened by W-xf 5
[RxR-Habitat] What kind of GPU is needed to train the cma policy with the original config?

Hi there,

While running the starter code for the RxR challenge, I found that the VRAM of a single NVIDIA 2080 Ti GPU (11 GiB) could only fit batch_size 1 with the cma policy and max_traj_len 250. We could set effective_batch_size, but in the original config it is set to -1 while batch_size is set to 3. So I'm wondering what kind of GPU is needed to train the cma policy with the original config?

Also, I found that with batch_size 1, max_traj_len 250, preload_size 30, and 9 simulated environments, RAM usage exceeds 40 GiB. Is this normal?

Thanks!

opened by wz0919 3
Bring back v0.1.4 ShortestPathFollower
In response to #7.

The dataset path generation and pruning was performed using the Habitat-Lab v0.1.4 ShortestPathFollower. The VLN-CE paper also used this follower as an oracle. Habitat v0.1.5 updated the ShortestPathFollower to slightly different behavior. For compatibility with the oracle used for dataset generation and in the VLN-CE paper, we are bringing back the v0.1.4 ShortestPathFollower (ShortestPathFollowerCompat) as default. To instead use the Habitat path follower in the VLNOracleActionSensor, update the task config to include:

TASK:
  VLN_ORACLE_ACTION_SENSOR:
    USE_ORIGINAL_FOLLOWER: False
opened by jacobkrantz 3
[RxR-VLNCE Challenge] How to get the error log from the RxR leaderboard

I have tried several times to submit our results, but the status of every attempt is "error". (We used the standard RxR task config.) Is there any way to check the error log file?

Thanks for your attention to this matter! Best regards,

opened by MarSaKi 2
How to use multiple GPUs to train dagger models?

Thanks for the great work.

When I run training using dagger_trainer.py, I found that a large part of the training time is taken by 1) collecting data and 2) training the model on the collected data. The first process can be sped up by assigning more simulator GPUs (SIMULATOR_GPU_IDS). However, the second process can only use one GPU (TORCH_GPU_ID) by default.

Is there an easy way to use multiple GPUs to speed up the second process? Or should I use torch.distributed and reimplement the training code myself?

Many thanks!

opened by PeihaoChen 2
[RxR-Habitat] Eval baseline reproduction
Hello, I tried to reproduce the baseline performance using the same yaml file listed in the README:

python run.py \
  --exp-config vlnce_baselines/config/rxr_configs/rxr_cma_en.yaml \
  --run-type train

My experimental setup used 4 TITAN X GPUs with 4 environments each. Following issue #17, I set batch_size: 1 and effective_batch_size: 3 to train successfully. No other changes have been made to the codebase.

After evaluating my saved checkpoint, I found the following metrics (all averaged across episodes):

steps_taken: 350.443718
path_length: 6.737881
distance_to_goal: 11.082229
success: 0.066503
oracle_success: 0.180703
spl: 0.055868
ndtw: 0.358719

Comparing these to the Table 2 entry for Seq2Seq w/ RGBD, Instructions, and History, I found my performance to be significantly lower on the matching metrics.

Is this the right config to be used to match the relevant baseline?
opened by nikwalia 2
How to get panoramas?

Hi, I find that when I use env.observation I only get one image. However, the CMA model should use panoramas, right? Does anyone know how to get the panoramas?

opened by Mingxiao-Li 2
Request for more details about embeddings.json.gz

Hello, I am very interested in your research. I would like more details about embeddings.json.gz, specifically the correspondence between words, word embeddings, and instruction_tokens. I would be very grateful for a reply.

opened by Dominique-github 2
habitat-sim problem: Platform::WindowlessEglApplication::tryCreateContext(): no EGL devices found
To try this code I first need to install habitat-sim, but I ran into a serious problem. I followed habitat-sim issue #288, but it did not solve my problem. Does anyone have suggestions?

I ran the following commands:

conda install -c aihabitat -c conda-forge habitat-sim=0.1.7 headless
git clone --branch v0.1.7 git@github.com:facebookresearch/habitat-lab.git
python -m pip install -r requirements.txt
python -m pip install -r habitat_baselines/rl/requirements.txt
python -m pip install -r habitat_baselines/rl/ddppo/requirements.txt
python setup.py develop --all
wget http://dl.fbaipublicfiles.com/habitat/habitat-test-scenes.zip
unzip habitat-test-scenes.zip
python examples/example.py

Then get the following errors:

I1010 18:31:04.590993 11476 SceneGraph.h:93] Created DrawableGroup:
Platform::WindowlessEglApplication::tryCreateContext(): unable to find EGL device for CUDA device 0
WindowlessContext: Unable to create windowless context

OR

I1207 08:31:49.998020 1190 SceneGraph.h:93] Created DrawableGroup:
Platform::WindowlessEglApplication::tryCreateContext(): no EGL devices found, likely a driver issue; enable --magnum-gpu-validation to see additional info
WindowlessContext: Unable to create windowless context

I can confirm that my NVIDIA driver and CUDA are working, since other CNN-related code runs fine.

In fact, Habitat runs correctly on my PC, but I always hit this error inside Docker on my server.

I have tried multiple versions of the NVIDIA driver and CUDA, as well as versions of possibly related libraries such as libgl on my server, following habitat-sim issue #288.

The following are all the differences I can find between my PC and the server's Docker container:

Output of ldconfig -N -v | grep libEGL in the server's Docker container:

/sbin/ldconfig.real: Can't stat /usr/local/cuda/compat/lib: No such file or directory
/sbin/ldconfig.real: Path `/usr/local/cuda/lib64' given more than once
/sbin/ldconfig.real: Can't stat /usr/local/nvidia/lib: No such file or directory
/sbin/ldconfig.real: Can't stat /usr/local/nvidia/lib64: No such file or directory
/sbin/ldconfig.real: Can't stat /usr/local/lib/x86_64-linux-gnu: No such file or directory
/sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.27.so is the dynamic linker, ignoring
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvcuvid.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.440.100 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libvdpau_nvidia.so.440.100 is empty, not checked.
libEGL_mesa.so.0 -> libEGL_mesa.so.0.0.0
libEGL.so.1 -> libEGL.so.1.0.0
libEGL_nvidia.so.0 -> libEGL_nvidia.so.470.141.03

But on the PC:

/sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7 is not a symbolic link
/sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.23.so is the dynamic linker, ignoring
libEGL.so.1 -> libEGL.so.1.1.0
libEGL_nvidia.so.0 -> libEGL_nvidia.so.440.44

Could this difference be the cause of the problem? If anyone has a solution, please help. Thanks.
opened by FutureGoingOn 1
The ddppo download link is broken.

"Baseline models encode depth observations using a ResNet pre-trained on PointGoal navigation. Those weights can be downloaded from here (672M). Extract the contents to data/ddppo-models/{model}.pth." The ddppo download link is broken.

opened by sunqiang85 1
Bump tensorflow from 1.13.1 to 2.7.2
Bumps tensorflow from 1.13.1 to 2.7.2.

Release notes

Sourced from tensorflow's releases.

TensorFlow 2.7.2

Release 2.7.2

This release introduces several vulnerability fixes:

Fixes a code injection in saved_model_cli (CVE-2022-29216)

Fixes a missing validation which causes TensorSummaryV2 to crash (CVE-2022-29193)

Fixes a missing validation which crashes QuantizeAndDequantizeV4Grad (CVE-2022-29192)

Fixes a missing validation which causes denial of service via DeleteSessionTensor (CVE-2022-29194)

Fixes a missing validation which causes denial of service via GetSessionTensor (CVE-2022-29191)

Fixes a missing validation which causes denial of service via StagePeek (CVE-2022-29195)

Fixes a missing validation which causes denial of service via UnsortedSegmentJoin (CVE-2022-29197)

Fixes a missing validation which causes denial of service via LoadAndRemapMatrix (CVE-2022-29199)

Fixes a missing validation which causes denial of service via SparseTensorToCSRSparseMatrix (CVE-2022-29198)

Fixes a missing validation which causes denial of service via LSTMBlockCell (CVE-2022-29200)

Fixes a missing validation which causes denial of service via Conv3DBackpropFilterV2 (CVE-2022-29196)

Fixes a CHECK failure in depthwise ops via overflows (CVE-2021-41197)

Fixes issues arising from undefined behavior stemming from users supplying invalid resource handles (CVE-2022-29207)

Fixes a segfault due to missing support for quantized types (CVE-2022-29205)

Fixes a missing validation which results in undefined behavior in SparseTensorDenseAdd (CVE-2022-29206)

Fixes a missing validation which results in undefined behavior in QuantizedConv2D (CVE-2022-29201)

Fixes an integer overflow in SpaceToBatchND (CVE-2022-29203)

Fixes a segfault and OOB write due to incomplete validation in EditDistance (CVE-2022-29208)

Fixes a missing validation which causes denial of service via Conv3DBackpropFilterV2 (CVE-2022-29204)

Fixes a denial of service in tf.ragged.constant due to lack of validation (CVE-2022-29202)

Fixes a segfault when tf.histogram_fixed_width is called with NaN values (CVE-2022-29211)

Fixes a core dump when loading TFLite models with quantization (CVE-2022-29212)

Fixes crashes stemming from incomplete validation in signal ops (CVE-2022-29213)

Fixes a type confusion leading to CHECK-failure based denial of service (CVE-2022-29209)

Updates curl to 7.83.1 to handle CVE-2022-22576, CVE-2022-27774, CVE-2022-27775, CVE-2022-27776, CVE-2022-27778, CVE-2022-27779, CVE-2022-27780, CVE-2022-27781, CVE-2022-27782 and CVE-2022-30115

Updates zlib to 1.2.12 after 1.2.11 was pulled due to security issue

TensorFlow 2.7.1

Release 2.7.1

This release introduces several vulnerability fixes:

Fixes a floating point division by 0 when executing convolution operators (CVE-2022-21725)

Fixes a heap OOB read in shape inference for ReverseSequence (CVE-2022-21728)

Fixes a heap OOB access in Dequantize (CVE-2022-21726)

Fixes an integer overflow in shape inference for Dequantize (CVE-2022-21727)

Fixes a heap OOB access in FractionalAvgPoolGrad (CVE-2022-21730)

Fixes an overflow and divide by zero in UnravelIndex (CVE-2022-21729)

Fixes a type confusion in shape inference for ConcatV2 (CVE-2022-21731)

Fixes an OOM in ThreadPoolHandle (CVE-2022-21732)

Fixes an OOM due to integer overflow in StringNGrams (CVE-2022-21733)

Fixes more issues caused by incomplete validation in boosted trees code (CVE-2021-41208)

Fixes integer overflows in most sparse component-wise ops (CVE-2022-23567)

Fixes integer overflows in AddManySparseToTensorsMap (CVE-2022-23568)

... (truncated)


Commits

dd7b8a3 Merge pull request #56034 from tensorflow-jenkins/relnotes-2.7.2-15779

1e7d6ea Update RELEASE.md

5085135 Merge pull request #56069 from tensorflow/mm-cp-52488e5072f6fe44411d70c6af09e...

adafb45 Merge pull request #56060 from yongtang:curl-7.83.1

01cb1b8 Merge pull request #56038 from tensorflow-jenkins/version-numbers-2.7.2-4733

8c90c2f Update version numbers to 2.7.2

43f3cdc Update RELEASE.md

98b0a48 Insert release notes place-fill

dfa5cf3 Merge pull request #56028 from tensorflow/disable-tests-on-r2.7

501a65c Disable timing out tests

Additional commits viewable in compare view


dependencies
opened by dependabot[bot] 1
R2RBaseline
Hi! Thank you for publishing this great work! I followed the example in https://github.com/jacobkrantz/VLN-CE#vln-ce-challenge-r2r-data and tried to reproduce the predictions.json. However, I ran into two main issues in the current codebase when running:

python run.py --exp-config vlnce_baselines/config/r2r_baselines/test_set_inference.yaml --run-type inference

AttributeError: 'InstructionData' object has no attribute 'instruction_id'

File "~/VLN-CE/vlnce_baselines/common/base_il_trainer.py", line 511, in inference
    k = current_episodes[i].instruction.instruction_id

This can be resolved by adding FORMAT: r2r in vlnce_baselines/config/r2r_baselines/test_set_inference.yaml

RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor

My environment: Ubuntu 20.04, torch=1.7.0+cu110

File "~/anaconda3/envs/vlnce/lib/python3.6/site-packages/torch/nn/utils/rnn.py", line 244, in pack_padded_sequence
    _VF._pack_padded_sequence(input, lengths, batch_first)
RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor

As stated in https://pytorch.org/docs/stable/generated/torch.nn.utils.rnn.pack_padded_sequence.html:

torch.nn.utils.rnn.pack_padded_sequence(input, lengths, batch_first=False, enforce_sorted=True)
lengths (Tensor or list(int)) – list of sequence lengths of each batch element (must be on the CPU if provided as a tensor).

This can be solved by changing the code in vlnce_baselines/models/encoders/instruction_encoder.py L78:

- lengths = (lengths != 0.0).long().sum(dim=1)
+ lengths = (lengths != 0.0).long().sum(dim=1).cpu()
opened by MuMuJun97 1
The number of ground-truth actions does not match the number of steps in the R2R_VLNCE_v1-3 gt dataset

I have downloaded the preprocessed R2R datasets from the official website. In {split}_gt.json.gz, the field 'actions' contains the ground-truth actions, which should produce the coordinates stored in the field 'locations'. However, the numbers of elements in these two fields are not equal.

Could anyone give me a hint on how to relate these 2 fields? Thanks
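
One way to quantify the mismatch is to compare the two field lengths per episode. The sketch below does not explain why the counts differ; it only reports where they do. It assumes the file is a gzip-compressed JSON object mapping an episode id to a record holding 'actions' and 'locations' lists. That layout, the helper names, and the sample values are assumptions for illustration, written to a temporary file rather than read from the real dataset.

```python
import gzip
import json
import os
import tempfile


def load_gt(path):
    """Read a gzip-compressed JSON file into a Python object."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def count_mismatches(gt):
    """Map episode id -> (num_actions, num_locations) where the two differ."""
    return {
        ep_id: (len(entry["actions"]), len(entry["locations"]))
        for ep_id, entry in gt.items()
        if len(entry["actions"]) != len(entry["locations"])
    }


# Synthetic stand-in for a {split}_gt.json.gz file (made-up values).
sample = {
    "1": {
        "actions": [1, 2, 1, 0],
        "locations": [[0.0, 0.0, 0.0], [0.25, 0.0, 0.0],
                      [0.25, 0.0, 0.0], [0.5, 0.0, 0.0]],
    },
    "2": {
        "actions": [1, 1, 0],
        "locations": [[0.0, 0.0, 0.0], [0.25, 0.0, 0.0]],
    },
}

path = os.path.join(tempfile.mkdtemp(), "sample_gt.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as f:
    json.dump(sample, f)

mismatches = count_mismatches(load_gt(path))
print(mismatches)
```

Pointing load_gt at a real split file would show whether the difference is constant across episodes or varies with the actions taken.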

opened by ZJULiHongxin 0
Bump tensorflow from 1.13.1 to 2.9.3
Bumps tensorflow from 1.13.1 to 2.9.3.

Release notes

Sourced from tensorflow's releases.

TensorFlow 2.9.3

Release 2.9.3

This release introduces several vulnerability fixes:

Fixes an overflow in tf.keras.losses.poisson (CVE-2022-41887)

Fixes a heap OOB failure in ThreadUnsafeUnigramCandidateSampler caused by missing validation (CVE-2022-41880)

Fixes a segfault in ndarray_tensor_bridge (CVE-2022-41884)

Fixes an overflow in FusedResizeAndPadConv2D (CVE-2022-41885)

Fixes an overflow in ImageProjectiveTransformV2 (CVE-2022-41886)

Fixes an FPE in tf.image.generate_bounding_box_proposals on GPU (CVE-2022-41888)

Fixes a segfault in pywrap_tfe_src caused by invalid attributes (CVE-2022-41889)

Fixes a CHECK fail in BCast (CVE-2022-41890)

Fixes a segfault in TensorListConcat (CVE-2022-41891)

Fixes a CHECK_EQ fail in TensorListResize (CVE-2022-41893)

Fixes an overflow in CONV_3D_TRANSPOSE on TFLite (CVE-2022-41894)

Fixes a heap OOB in MirrorPadGrad (CVE-2022-41895)

Fixes a crash in Mfcc (CVE-2022-41896)

Fixes a heap OOB in FractionalMaxPoolGrad (CVE-2022-41897)

Fixes a CHECK fail in SparseFillEmptyRowsGrad (CVE-2022-41898)

Fixes a CHECK fail in SdcaOptimizer (CVE-2022-41899)

Fixes a heap OOB in FractionalAvgPool and FractionalMaxPool (CVE-2022-41900)

Fixes a CHECK_EQ in SparseMatrixNNZ (CVE-2022-41901)

Fixes an OOB write in grappler (CVE-2022-41902)

Fixes an overflow in ResizeNearestNeighborGrad (CVE-2022-41907)

Fixes a CHECK fail in PyFunc (CVE-2022-41908)

Fixes a segfault in CompositeTensorVariantToComponents (CVE-2022-41909)

Fixes an invalid char to bool conversion in printing a tensor (CVE-2022-41911)

Fixes a heap overflow in QuantizeAndDequantizeV2 (CVE-2022-41910)

Fixes a CHECK failure in SobolSample via missing validation (CVE-2022-35935)

Fixes a CHECK fail in TensorListScatter and TensorListScatterV2 in eager mode (CVE-2022-35935)

TensorFlow 2.9.2

Release 2.9.2

This release introduces several vulnerability fixes:

Fixes a CHECK failure in tf.reshape caused by overflows (CVE-2022-35934)

Fixes a CHECK failure in SobolSample caused by missing validation (CVE-2022-35935)

Fixes an OOB read in Gather_nd op in TF Lite (CVE-2022-35937)

Fixes a CHECK failure in TensorListReserve caused by missing validation (CVE-2022-35960)

Fixes an OOB write in Scatter_nd op in TF Lite (CVE-2022-35939)

Fixes an integer overflow in RaggedRangeOp (CVE-2022-35940)

Fixes a CHECK failure in AvgPoolOp (CVE-2022-35941)

Fixes a CHECK failure in UnbatchGradOp (CVE-2022-35952)

Fixes a segfault in the TFLite converter on per-channel quantized transposed convolutions (CVE-2022-36027)

Fixes a CHECK failure in AvgPool3DGrad (CVE-2022-35959)

Fixes a CHECK failure in FractionalAvgPoolGrad (CVE-2022-35963)

Fixes a segfault in BlockLSTMGradV2 (CVE-2022-35964)

Fixes a segfault in LowerBound and UpperBound (CVE-2022-35965)

... (truncated)

Changelog

Sourced from tensorflow's changelog.

Release 2.9.3

This release introduces the same vulnerability fixes listed under TensorFlow 2.9.3 above.

Release 2.8.4

This release introduces several vulnerability fixes:

Fixes a heap OOB failure in ThreadUnsafeUnigramCandidateSampler caused by missing validation (CVE-2022-41880)

Fixes a segfault in ndarray_tensor_bridge (CVE-2022-41884)

Fixes an overflow in FusedResizeAndPadConv2D (CVE-2022-41885)

Fixes an overflow in ImageProjectiveTransformV2 (CVE-2022-41886)

Fixes an FPE in tf.image.generate_bounding_box_proposals on GPU (CVE-2022-41888)

Fixes a segfault in pywrap_tfe_src caused by invalid attributes (CVE-2022-41889)

Fixes a CHECK fail in BCast (CVE-2022-41890)

Fixes a segfault in TensorListConcat (CVE-2022-41891)

Fixes a CHECK_EQ fail in TensorListResize (CVE-2022-41893)

Fixes an overflow in CONV_3D_TRANSPOSE on TFLite (CVE-2022-41894)

Fixes a heap OOB in MirrorPadGrad (CVE-2022-41895)

Fixes a crash in Mfcc (CVE-2022-41896)

Fixes a heap OOB in FractionalMaxPoolGrad (CVE-2022-41897)

Fixes a CHECK fail in SparseFillEmptyRowsGrad (CVE-2022-41898)

Fixes a CHECK fail in SdcaOptimizer (CVE-2022-41899)

... (truncated)

Commits

a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2

258f9a1 Update py_func.cc

cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474

3e75385 Update version numbers to 2.9.3

bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695

3506c90 Update RELEASE.md

8dcb48e Update RELEASE.md

4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...

6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple

5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

dependencies
opened by dependabot[bot] 0

DefaultCPUAllocator: can't allocate memory: you tried to allocate 1589575680 bytes.

Hi Jacob,

Following the setup instructions in https://github.com/jacobkrantz/VLN-CE to run the model, with

python run.py --exp-config=vlnce_baselines/config/rxr_baselines/rxr_cma_en.yml --run-type=train

I got the following error:

  File "/home/wangsu/anaconda3/envs/vlnce_py3.6_h0.1.7/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wangsu/KrantzVLNCE/habitat-lab/VLN-CE/vlnce_baselines/models/encoders/resnet_encoders.py", line 199, in forward
    resnet_output = self.cnn(normalize(rgb_observations))
  File "/home/wangsu/anaconda3/envs/vlnce_py3.6_h0.1.7/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wangsu/anaconda3/envs/vlnce_py3.6_h0.1.7/lib/python3.6/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/wangsu/anaconda3/envs/vlnce_py3.6_h0.1.7/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wangsu/anaconda3/envs/vlnce_py3.6_h0.1.7/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/wangsu/anaconda3/envs/vlnce_py3.6_h0.1.7/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: [enforce fail at CPUAllocator.cpp:68] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 1589575680 bytes. Error code 12 (Cannot allocate memory)

My machine however does have enough memory:

              total        used        free      shared  buff/cache   available
Mem:    27390640128   574816256 25532850176     8773632  1282973696 26407305216

Could you help look into this please? Thanks!
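
To put the numbers in the error and in the memory listing on the same scale, the sketch below converts them to GiB. It also checks one arithmetic observation: the requested byte count factors exactly as 495 x 64 x 112 x 112 x 4. That would be consistent with a float32 tensor shaped like the first-convolution output of a ResNet on 495 frames of 224x224 input, but this is an inference from the arithmetic and not something the issue states. The variable names are illustrative only.

```python
GIB = 1024 ** 3

# Values copied from the error message and the memory listing above.
requested_bytes = 1589575680
free_bytes = 25532850176
available_bytes = 26407305216


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


print(f"requested: {to_gib(requested_bytes):.2f} GiB")
print(f"free:      {to_gib(free_bytes):.2f} GiB")
print(f"available: {to_gib(available_bytes):.2f} GiB")

# Assumed shape of one float32 activation map: 64 channels at 112x112
# (the usual ResNet stem output for 224x224 input), 4 bytes per value.
bytes_per_frame = 64 * 112 * 112 * 4
frames, remainder = divmod(requested_bytes, bytes_per_frame)
print(f"frames: {frames}, remainder: {remainder}")
```

The single failed request is small compared with the free memory shown. The listing may therefore not reflect the state at the moment of failure, or a per-process limit may apply. Neither is confirmed by the issue.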

opened by wangsu-google-language 0

Instruction encoder

Could you please provide details on your instruction encoder? I would like to test the agent on new data, but you only supply pre-computed text weights. Thanks

opened by idansc 0
VLNCE questions

Thanks for the incredible effort on putting this dataset together! I was wondering how I can find the continuous trajectory actions/camera poses for each episode. If I look into "RxR_VLNCE_v0/train/train_guide.json.gz", each episode has a trajectory_id field. Does this correspond to the keys in "RxR_VLNCE_v0/train/train_guide_gt.json.gz"? Or is it episode_id that corresponds?

In addition, where can I find the camera poses (location and rotation) for each trajectory? There's an "actions" field in "RxR_VLNCE_v0/train/train_guide_gt.json.gz"; how do the action integers map to actions (1 forward, 2 turn left, 3 turn right)? What does the "locations" field mean there? I would really appreciate it if you could help me understand the field structure a bit better.

Thanks!
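
For reading the "actions" field, a small decoder can help. The sketch below assumes the integers follow Habitat's default discrete action ordering (STOP = 0, MOVE_FORWARD = 1, TURN_LEFT = 2, TURN_RIGHT = 3), which agrees with the mapping guessed in the question. Whether the RxR ground-truth files use exactly this ordering is not confirmed here, and the Action enum and decode_actions helper are illustrative names, not part of the repository.

```python
from enum import IntEnum


class Action(IntEnum):
    """Assumed action ids, following Habitat's default discrete ordering."""
    STOP = 0
    MOVE_FORWARD = 1
    TURN_LEFT = 2
    TURN_RIGHT = 3


def decode_actions(action_ids):
    """Translate a list of integer action ids into readable names."""
    return [Action(a).name for a in action_ids]


# Made-up action sequence for illustration.
example = [1, 1, 2, 1, 3, 0]
decoded = decode_actions(example)
print(decoded)
```

An id outside 0 to 3 raises ValueError, which is a quick way to notice if a file uses a larger action space than assumed here.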

opened by mbautistamartin 0