Recurrent VLN-BERT
Code for the Recurrent-VLN-BERT paper: A Recurrent Vision-and-Language BERT for Navigation
Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould
Prerequisites
Installation
Install the Matterport3D Simulator. The versions of the packages in our environment are listed here; a quick import check for the simulator bindings is sketched at the end of this section.
Install Pytorch-Transformers. In particular, we use this version (the same as OSCAR) in our experiments.
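After building the simulator, it can help to confirm the Python bindings are importable before moving on. A minimal sketch, assuming the default build directory (the path below is a guess; adjust it to wherever `MatterSim.so` actually lives):

```python
# Sanity check that the Matterport3D Simulator Python bindings built correctly.
# The build path below is a guess at the simulator's default CMake output
# directory; adjust it to your actual build location.
import sys

sys.path.append("Matterport3DSimulator/build")  # hypothetical build location

import MatterSim

print("MatterSim bindings loaded from:", MatterSim.__file__)
```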
Data Preparation
Please follow the instructions below to prepare the data in the listed directories (a quick layout check is sketched after this list):
- MP3D navigability graphs: `connectivity`
  - Download the connectivity maps [23.8MB].
- R2R data: `data`
  - Download the R2R data [5.8MB].
- Augmented data: `data/prevalent`
  - Download the collected triplets in PREVALENT [1.5GB] (pre-processed for easy use).
- MP3D image features: `img_features`
  - Download the Scene features [4.2GB] (ResNet-152-Places365).
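Once everything is downloaded, the repository root should contain the four directories above. A minimal sketch to verify the layout (only the directory names from the list are checked, not the files inside them):

```python
# Verify that the data directories described above exist; contents are not checked.
import os

expected = {
    "connectivity": "MP3D navigability graphs",
    "data": "R2R data",
    "data/prevalent": "PREVALENT augmented triplets",
    "img_features": "ResNet-152-Places365 scene features",
}

for path, description in expected.items():
    status = "ok" if os.path.isdir(path) else "MISSING"
    print(f"{path:<16} {status:<8} ({description})")
```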
Initial OSCAR and PREVALENT weights
Please refer to `vlnbert_init.py` to set up the directories (a quick checkpoint inspection is sketched after this list).
- Pre-trained OSCAR weights
  - Download the `base-no-labels` weights following this guide.
- Pre-trained PREVALENT weights
  - Download the `pytorch_model.bin` from here.
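Before wiring paths into `vlnbert_init.py`, it can help to confirm that the downloaded PREVALENT checkpoint loads cleanly. A minimal sketch, assuming `pytorch_model.bin` sits in the current directory:

```python
# Inspect the downloaded PREVALENT checkpoint before pointing vlnbert_init.py
# at it; the file's location in the current directory is an assumption.
import torch

state_dict = torch.load("pytorch_model.bin", map_location="cpu")
print(f"{len(state_dict)} tensors in checkpoint")
for name, tensor in list(state_dict.items())[:5]:
    print(f"  {name}: {tuple(tensor.shape)}")
```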
Trained Network Weights
- Recurrent-VLN-BERT: `snap`
  - Download the trained network weights [2.5GB] for our OSCAR-based and PREVALENT-based models.
R2R Navigation
Please refer to Peter Anderson's VLN paper for details of the R2R navigation task.
Reproduce Testing Results
To replicate the performance reported in our paper, load the trained network weights and run validation:

`bash run/test_agent.bash`

You can simply switch between the OSCAR-based and the PREVALENT-based VLN models by changing the arguments `vlnbert` (`oscar` or `prevalent`) and `load` (path to the trained model).
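A hypothetical illustration of those two arguments (the real argument parsing lives in the repository's training code; these names only mirror the flags passed by `run/test_agent.bash`):

```python
# Hypothetical sketch of the two arguments described above, not the repo's
# actual parser: --vlnbert selects the model variant, --load the checkpoint.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--vlnbert", choices=["oscar", "prevalent"],
                    help="which pre-trained V&L BERT the model is based on")
parser.add_argument("--load",
                    help="path to a trained checkpoint under snap/")
args = parser.parse_args()
print(f"Evaluating the {args.vlnbert}-based model from {args.load}")
```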
Training
Navigator
To train the network from scratch, simply run:
`bash run/train_agent.bash`

The trained Navigator will be saved under `snap/`.
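Checkpoints accumulate under `snap/` as training progresses. A small sketch to list them, oldest first (the layout inside `snap/` depends on the experiment name configured in `run/train_agent.bash`, so this simply walks the whole tree):

```python
# List everything saved under snap/, ordered by modification time; the layout
# inside snap/ depends on the experiment name set in run/train_agent.bash.
from pathlib import Path

files = [p for p in Path("snap").rglob("*") if p.is_file()]
for ckpt in sorted(files, key=lambda p: p.stat().st_mtime):
    print(f"{ckpt}  ({ckpt.stat().st_size / 1e6:.1f} MB)")
```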
Citation
If you use or discuss our Recurrent VLN-BERT, please cite our paper:
@article{hong2020recurrent,
  title={A Recurrent Vision-and-Language BERT for Navigation},
  author={Hong, Yicong and Wu, Qi and Qi, Yuankai and Rodriguez-Opazo, Cristian and Gould, Stephen},
  journal={arXiv preprint arXiv:2011.13922},
  year={2020}
}