Rank 1st in the public leaderboard of ScanRefer (2021-03-18)

Overview

InstanceRefer

InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring

This repository contains the code for the method ranked 1st on the ScanRefer benchmark [arxiv paper].

Zhihao Yuan, Xu Yan, Yinghong Liao, Ruimao Zhang, Zhen Li*, Shuguang Cui

If you find our work useful in your research, please consider citing:

@InProceedings{yuan2021instancerefer,
  title={InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring},
  author={Zhihao Yuan and Xu Yan and Yinghong Liao and Ruimao Zhang and Zhen Li and Shuguang Cui},
  journal={arXiv preprint},
  year={2021}
}

News

  • 2021-03-31 We release InstanceRefer v1 🚀!
  • 2021-03-18 We achieve 1st place in ScanRefer leaderboard 🔥.

Getting Started

Setup

The code is tested on Ubuntu 16.04 LTS and 18.04 LTS with PyTorch 1.3.0 and CUDA 10.1 installed.

conda install pytorch==1.3.0 cudatoolkit=10.1 -c pytorch

Install the necessary packages listed out in requirements.txt:

pip install -r requirements.txt

After all packages are properly installed, run the following commands to compile torchsparse:

cd lib/torchsparse/
python setup.py install

Before moving on to the next step, don't forget to set CONF.PATH.BASE in lib/config.py to the project root path.
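
As an illustration of what this setting amounts to, here is a minimal sketch of a config object with a PATH.BASE entry. It uses `types.SimpleNamespace` as a stand-in, and the derived entries `PATH.DATA` and `PATH.SCANNET` are hypothetical names chosen to mirror the folder layout in this README; the actual lib/config.py may be structured differently.

```python
import os
from types import SimpleNamespace


def make_conf(base):
    """Build a small config object whose PATH.BASE is the project root."""
    base = os.path.abspath(os.path.expanduser(base))
    conf = SimpleNamespace()
    conf.PATH = SimpleNamespace()
    conf.PATH.BASE = base
    # Hypothetical derived paths (assumption, not taken from lib/config.py).
    conf.PATH.DATA = os.path.join(base, "data")
    conf.PATH.SCANNET = os.path.join(conf.PATH.DATA, "scannet")
    return conf


# Placeholder path; replace it with the location of your own checkout.
CONF = make_conf("~/InstanceRefer")
print(CONF.PATH.BASE)
print(CONF.PATH.SCANNET)
```

Using an absolute path for the root avoids surprises when scripts are launched from different working directories.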

Data preparation

  1. Download the ScanRefer dataset and unzip it under data/.
  2. Download the preprocessed GloVe embeddings (~990MB) and put them under data/.
  3. Download the ScanNetV2 dataset and put (or link) scans/ under (or to) data/scannet/scans/ (please follow the ScanNet instructions for downloading the dataset). After this step, data/scannet/scans/ should contain the ScanNet scene folders, with names like scene0000_00.
  4. Use the official pre-trained PointGroup to generate panoptic segmentation results in PointGroupInst/. We provide pre-processed data on Baidu Netdisk [password: 0nxc].
  5. Pre-process the instance labels; the new data should be generated in data/scannet/pointgroup_data/:
cd data/scannet/
python prepare_data.py --split train --pointgroupinst_path [YOUR_PATH]
python prepare_data.py --split val   --pointgroupinst_path [YOUR_PATH]
python prepare_data.py --split test  --pointgroupinst_path [YOUR_PATH]
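
The three commands above differ only in the split name, so they can also be generated in a loop. The following is a minimal sketch that only builds and prints the command lines; the helper name `build_commands` and the placeholder path are assumptions for illustration, while the `--split` and `--pointgroupinst_path` flags are the ones shown above.

```python
import shlex
import sys

SPLITS = ("train", "val", "test")


def build_commands(pointgroupinst_path, python=sys.executable):
    """Return one argv list per split, mirroring the three commands above."""
    return [
        [python, "prepare_data.py", "--split", split,
         "--pointgroupinst_path", pointgroupinst_path]
        for split in SPLITS
    ]


# Placeholder path; replace it with your PointGroupInst/ directory.
for argv in build_commands("/path/to/PointGroupInst"):
    # Printed rather than executed. To run a command, pass argv to
    # subprocess.run(argv, check=True, cwd="data/scannet").
    print(shlex.join(argv))
```

Printing first makes it easy to inspect the exact commands before spending time on preprocessing.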

Finally, the dataset folder should be organized as follows.

InstanceRefer
├── data
│   ├── scannet
│   │  ├── meta_data
│   │  ├── pointgroup_data
│   │  │  ├── scene0000_00_aligned_bbox.npy
│   │  │  ├── scene0000_00_aligned_vert.npy
│   │  │  ├── ... ...
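
A quick way to check this layout before training is a small script like the sketch below. The function names are hypothetical, and it assumes that every scene folder in scans/ should have both `_aligned_bbox.npy` and `_aligned_vert.npy` files in pointgroup_data/, which the tree above suggests but does not state outright. The demo runs on a temporary directory, so it does not touch a real checkout.

```python
import os
import tempfile

SCANNET_DIR = os.path.join("data", "scannet")
EXPECTED_DIRS = ("scans", "meta_data", "pointgroup_data")
SUFFIXES = ("_aligned_bbox.npy", "_aligned_vert.npy")


def missing_dirs(project_root):
    """Return the expected sub-directories of data/scannet/ that are absent."""
    base = os.path.join(project_root, SCANNET_DIR)
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(base, d))]


def incomplete_scenes(project_root):
    """Return scenes in scans/ lacking one of the two .npy files (assumed naming)."""
    base = os.path.join(project_root, SCANNET_DIR)
    scans = os.path.join(base, "scans")
    out = os.path.join(base, "pointgroup_data")
    if not os.path.isdir(scans):
        return []
    return [
        scene for scene in sorted(os.listdir(scans))
        if not all(os.path.isfile(os.path.join(out, scene + s)) for s in SUFFIXES)
    ]


# Demo on a throwaway directory that imitates the layout.
with tempfile.TemporaryDirectory() as root:
    for d in EXPECTED_DIRS:
        os.makedirs(os.path.join(root, SCANNET_DIR, d))
    os.makedirs(os.path.join(root, SCANNET_DIR, "scans", "scene0000_00"))
    print(missing_dirs(root))       # all three directories exist here
    print(incomplete_scenes(root))  # the fake scene has no .npy files yet
```

On a real checkout, call both functions with the project root and expect two empty lists once data preparation has finished.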

Training

Train the InstanceRefer model. You can change hyper-parameters in config/InstanceRefer.yaml:

python scripts/train.py --log_dir instancerefer

TODO

  • Updating to the best version.
  • Release code for prediction on the benchmark.
  • Release pre-trained model.
  • Merge PointGroup in an end-to-end manner.

Acknowledgement

This project would not be possible without multiple great open-source codebases.

License

This repository is released under the MIT License (see the LICENSE file for details).

Comments
  • The result of the pretrained model I ran is different from yours

    I followed your instructions and ran your pretrained model, but the eval result is iou0.25: 0.3751, iou0.5: 0.3056. I do not know why it is different from yours. Thank you very much!

    opened by Umaruchain 5
  • An error when loading data

    Thank you for this amazing work.

    When I run "python scripts/train.py --log_dir instancerefer", I encountered a problem:

    File "/home/nicole/InstanceRefer/lib/dataset.py", line 184, in __getitem__
      class_ind = [DC.nyu40id2class[int(x)] for x in instance_bboxes[:num_bbox, -2]]
    File "/home/nicole/InstanceRefer/lib/dataset.py", line 184, in <listcomp>
      class_ind = [DC.nyu40id2class[int(x)] for x in instance_bboxes[:num_bbox, -2]]
    KeyError: 0

    Here x is 0.0, but DC.nyu40id2class does not contain the key 0. Should I change the code to "DC.nyu40id2class[int(x)+1]"?

    opened by vivonicole 3
  • Some commands in README might be incorrect.

    In Step 5. Pre-processed instance labels, and new data should be generated in data/scannet/pointgroup_data/

    Since the ScanNet train and test data are stored in different locations, the commands should be:

    cd data/scannet/
    python prepare_data.py --split train --scannet_path REPO_DIR/data/scannet/scans --pointgroupinst_path [YOUR_PATH]
    python prepare_data.py --split test  --scannet_path REPO_DIR/data/scannet/scans_test  --pointgroupinst_path [YOUR_PATH]
    
    opened by WinterShiver 1
  • fix issues with wrong ScanNet aggregation file

    Hi there,

    the original _vh_clean.aggregation.json is for the high-res mesh files - I've corrected it to .aggregation.json which matches the low-res files you use in this repo.

    (plz check ScanNet README)

    Best Dave

    opened by daveredrum 1
  • InstanceRefer(xyz): Acc@0.5IoU 37.05

    I am currently having a look at the performance of your architecture. In the paper you state that the Acc@0.5IoU is 37.05 using only xyz coordinates, which is way higher than the Acc@0.5IoU of 30.70 using additional color information. Is this just a typo in the paper or is the additional rgb information really damaging your performance? If this is a typo, could you please provide me with the correct Acc@0.5IoU results?

    Thank you very much!

    opened by bwittmann 1
  • RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

    Hi, CurryYuan: I have met some problems as below:

    Preparing data...
    train on 36665 samples and val on 9508 samples
    loading data...
    loading data...
    Initializing...
    model params 6857788
    Start training...
    Epoch 1 starting...
    Traceback (most recent call last):
      File "/remote-home/source/46/guopeng/scripts/train.py", line 221, in <module>
        train(CONF)
      File "/remote-home/source/46/guopeng/scripts/train.py", line 211, in train
        solver(args.epoch, args.verbose)
      File "/remote-home/source/46/guopeng/lib/solver.py", line 156, in __call__
        self._feed(self.dataloader["train"], "train", epoch_id)
      File "/remote-home/source/46/guopeng/lib/solver.py", line 248, in _feed
        for idx, data_dict in enumerate(dataloader):
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
        data = self._next_data()
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
        return self._process_data(data)
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
        data.reraise()
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    RuntimeError: Caught RuntimeError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
        data = fetcher.fetch(index)
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
        return self.collate_fn(data)
      File "/remote-home/source/46/guopeng/lib/dataset.py", line 464, in collate_fn
        outputs = sparse_collate_fn(inputs)
      File "/usr/local/miniconda3/lib/python3.6/site-packages/torchsparse-1.1.0-py3.6-linux-x86_64.egg/torchsparse/utils/helpers.py", line 209, in sparse_collate_fn
        dim=0)
    RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

    I think maybe you know this bug.

    opened by guo13947296316 1
  • About co-attention problem

    Hi, thanks for your amazing work.

    We noticed that your newly released version uses co-attention from MCAN to achieve much higher performance. Will this part of the code be released?

    Thanks!

    opened by ChengShiest 2