HAIS_2GNN: 3D Visual Grounding with Graph and Attention
This repository is for the HAIS_2GNN research project.
Tao Gu, Yue Chen
Introduction
The motivation of this project is to improve the accuracy of 3D visual grounding. In this report, we propose a new model, HAIS_2GNN, based on InstanceRefer, to tackle the problem of insufficient connections between instance proposals. Our model incorporates HAIS, a powerful instance segmentation model, and strengthens the instance features with graph and attention structures, so that the text and point cloud can be matched more reliably. Experiments confirm that our method outperforms InstanceRefer on the ScanRefer validation set. Link to the technical report
Setup
The code is tested on Ubuntu 20.04.3 LTS with Python 3.9.7, PyTorch 1.10.1, and CUDA 11.3.1 installed.
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch
Install the necessary packages listed in requirements.txt:
pip install -r requirements.txt
After all packages are properly installed, run the following commands to compile torchsparse v1.4.0:
sudo apt-get install libsparsehash-dev
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git@v1.4.0
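After the installation steps above, a quick version check can catch a mismatched environment early. The following is a minimal sketch rather than part of the repository: it assumes the packages are registered under the distribution names torch, torchvision and torchsparse, and it reads their metadata only, so nothing is imported.

```python
from importlib import metadata

# Versions stated in this README. The distribution names are assumptions.
EXPECTED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "torchsparse": "1.4.0",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, ok)} without importing anything."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Local build tags such as "1.10.1+cu113" still count as a match.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={want} [{status}]")
```

A package reported as installed=None may still be present if it was installed in a way that does not register metadata, so treat the output as a hint rather than proof.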
Before moving on to the next step, set CONF.PATH.BASE in lib/config.py to the project root path.
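To illustrate why a single root path matters, here is a hedged sketch of how a CONF-style object can derive every other location from CONF.PATH.BASE. The real lib/config.py may be structured differently: SimpleNamespace is only a stand-in, and the DATA and SCANNET entries are hypothetical names chosen to mirror the folder layout described in this README.

```python
import os
from types import SimpleNamespace


def make_conf(base):
    """Build a tiny CONF-like object whose paths all hang off one project root."""
    conf = SimpleNamespace(PATH=SimpleNamespace())
    conf.PATH.BASE = os.path.abspath(base)
    # Hypothetical derived entries, for illustration only.
    conf.PATH.DATA = os.path.join(conf.PATH.BASE, "data")
    conf.PATH.SCANNET = os.path.join(conf.PATH.DATA, "scannet")
    return conf


CONF = make_conf(".")
print(CONF.PATH.BASE)
print(CONF.PATH.SCANNET)
```

Using an absolute root keeps the derived paths valid regardless of the directory the scripts are launched from.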
Data preparation
- Download the ScanRefer dataset and unzip it under data/.
- Download the preprocessed GloVe embeddings (~990MB) and put them under data/.
- Download the ScanNetV2 dataset and put (or link) scans/ under (or to) data/scannet/scans/ (please follow the ScanNet instructions for downloading the ScanNet dataset). After this step, there should be folders containing the ScanNet scene data under data/scannet/scans/ with names like scene0000_00.
- Use the official pre-trained HAIS model to generate panoptic segmentation results in PointGroupInst/. We will provide the pre-trained data soon.
- Pre-process the instance labels; the new data should be generated in data/scannet/pointgroup_data/:
cd data/scannet/
python prepare_data.py --split train --pointgroupinst_path [YOUR_PATH]
python prepare_data.py --split val --pointgroupinst_path [YOUR_PATH]
python prepare_data.py --split test --pointgroupinst_path [YOUR_PATH]
Finally, the dataset folder should be organized as follows.
InstanceRefer
├── data
│ ├── glove.p
│ ├── ScanRefer_filtered.json
│ ├── ...
│ ├── scannet
│ │ ├── meta_data
│ │ ├── pointgroup_data
│ │ │ ├── scene0000_00_aligned_bbox.npy
│ │ │ ├── scene0000_00_aligned_vert.npy
│ │ │ ├── ...
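The layout above can be checked with a short script before training. This is a sketch under assumptions, not a tool shipped with the repository: check_layout is a hypothetical helper, it only looks at the entries named in the tree, and it assumes every scene with an _aligned_bbox.npy file should also have an _aligned_vert.npy file.

```python
from pathlib import Path


def check_layout(root):
    """Return a list of human-readable problems found under <root>/data."""
    data = Path(root) / "data"
    problems = []
    required = (
        "glove.p",
        "ScanRefer_filtered.json",
        "scannet/meta_data",
        "scannet/pointgroup_data",
    )
    for rel in required:
        if not (data / rel).exists():
            problems.append(f"missing: data/{rel}")
    pg = data / "scannet" / "pointgroup_data"
    if pg.is_dir():
        # Assumption: each *_aligned_bbox.npy has a matching *_aligned_vert.npy.
        for bbox in sorted(pg.glob("*_aligned_bbox.npy")):
            vert_name = bbox.name.replace("_aligned_bbox.npy", "_aligned_vert.npy")
            if not bbox.with_name(vert_name).exists():
                problems.append(f"missing: {vert_name}")
    return problems


if __name__ == "__main__":
    for problem in check_layout("."):
        print(problem)
```

An empty result means the listed files and folders are present; it does not validate their contents.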
Training
Train the InstanceRefer model. You can change hyper-parameters in config/InstanceRefer.yaml:
python scripts/train.py --log_dir HAIS_2GNN
Evaluation
Set use_checkpoint in config/InstanceRefer.yaml to the folder that contains model.pth, then run:
python scripts/eval.py
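A missing checkpoint is easier to diagnose before the evaluation script starts. The sketch below uses a hypothetical helper, find_checkpoint, and assumes you pass it the same folder path you set as use_checkpoint; how the repository itself resolves that value is not shown here.

```python
from pathlib import Path


def find_checkpoint(folder, filename="model.pth"):
    """Return the path to <folder>/<filename>, or raise if it is not there."""
    path = Path(folder) / filename
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; check use_checkpoint")
    return path
```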
Pre-trained Models
Input | [email protected] Unique | [email protected] | Checkpoints |
---|---|---|---|
xyz+rgb | 39.24 | 33.66 | will be released soon |
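For readers unfamiliar with the metric, ScanRefer-style evaluation reports accuracy at an IoU threshold: a prediction counts as correct when the 3D IoU between the predicted and ground-truth boxes reaches the threshold. The sketch below is an illustration, not the repository's evaluation code. It assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax) and an inclusive comparison; box_iou_3d and acc_at_iou are hypothetical names.

```python
def box_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def acc_at_iou(preds, gts, threshold):
    """Percentage of predictions whose IoU with the ground truth reaches the threshold."""
    if not preds:
        return 0.0
    hits = sum(box_iou_3d(p, g) >= threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(preds)


if __name__ == "__main__":
    gt = (0, 0, 0, 2, 2, 2)
    shifted = (1, 0, 0, 3, 2, 2)  # overlaps half of gt along x
    print(box_iou_3d(gt, shifted))
    print(acc_at_iou([gt, shifted], [gt, gt], 0.5))
```

In the usage example, the shifted box shares a 1 x 2 x 2 region with the ground truth, so the two thresholds commonly used on this benchmark can disagree about whether it counts as a hit.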
TODO
- Add pre-trained HAIS dataset.
- Release pre-trained model.
- Merge HAIS in an end-to-end manner.
- Upload to ScanRefer benchmark
Changelog
02/09/2022: Released HAIS_2GNN
Acknowledgement
This work is a research project conducted by Tao Gu and Yue Chen for the ADL4CV: Visual Computing course at the Technical University of Munich.
We acknowledge that our work is based on ScanRefer, InstanceRefer, HAIS, torchsparse, and pytorch_geometric.
License
This repository is released under the MIT License (see LICENSE file for details).