Code for the RA-L (ICRA) 2021 paper "SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition"

Overview

SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition

[ArXiv+Supplementary] [IEEE Xplore RA-L 2021] [ICRA 2021 YouTube Video]

and

SeqNetVLAD vs PointNetVLAD: Image Sequence vs 3D Point Clouds for Day-Night Place Recognition

[ArXiv] [CVPR 2021 Workshop 3DVR]


Sequence-Based Hierarchical Visual Place Recognition.

News:

Jun 23: CVPR 2021 Workshop 3DVR paper, "SeqNetVLAD vs PointNetVLAD", now available on arXiv. Oxford dataset to be released soon.

Jun 02: SeqNet code release with the Nordland dataset.

Setup (One time)

Conda

conda create -n seqnet python=3.8 mamba -c conda-forge -y
conda activate seqnet
mamba install numpy pytorch=1.8.0 torchvision tqdm scikit-learn faiss tensorboardx h5py -c conda-forge -y

Download

Run bash download.sh to download single-image NetVLAD descriptors (3.4 GB) for the Nordland-clean dataset [a] and the corresponding model files (1.5 GB) [b].
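
To sanity-check the download, you can inspect the descriptor files directly. This is a minimal sketch, assuming the descriptors land under ./data/descData/netvlad-pytorch/ (the path referenced in the issues below); each .npy file is expected to hold one traverse as an array of 4096-D NetVLAD descriptors, one row per image:

import numpy as np
from pathlib import Path

desc_dir = Path("./data/descData/netvlad-pytorch")  # default descriptor location
for f in sorted(desc_dir.glob("*.npy")):
    descs = np.load(f)                              # one array per traverse
    print(f.name, descs.shape)                      # expected shape: (num_images, 4096)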

Run

Train

To train sequential descriptors through SeqNet:

python main.py --mode train --pooling seqnet --dataset nordland-sw --seqL 10 --w 5 --outDims 4096 --expName "w5"

To (re-)train single-image descriptors through SeqNet:

python main.py --mode train --pooling seqnet --dataset nordland-sw --seqL 1 --w 1 --outDims 4096 --expName "w1"

Test

python main.py --mode test --pooling seqnet --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-22-44_l10_w5/ 

The above will reproduce results for SeqNet (S5) as per Supp. Table III on Page 10.

To obtain other results from the same table, use the commands below.
# Raw Single (NetVLAD) Descriptor
python main.py --mode test --pooling single --dataset nordland-sf --seqL 1 --split test

# SeqNet (S1)
python main.py --mode test --pooling seqnet --dataset nordland-sf --seqL 1 --split test --resume ./data/runs/Jun03_15-07-46_l1_w1/

# Raw + Smoothing
python main.py --mode test --pooling smooth --dataset nordland-sf --seqL 5 --split test

# Raw + Delta
python main.py --mode test --pooling delta --dataset nordland-sf --seqL 5 --split test

# Raw + SeqMatch
python main.py --mode test --pooling single+seqmatch --dataset nordland-sf --seqL 5 --split test

# SeqNet (S1) + SeqMatch
python main.py --mode test --pooling s1+seqmatch --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-07-46_l1_w1/

# HVPR (S5 to S1)
# Run S5 first and save its predictions by specifying `resultsPath`
python main.py --mode test --pooling seqnet --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-22-44_l10_w5/ --resultsPath ./data/results/
# Now run S1 + SeqMatch using results from above (the timestamp of `predictionsFile` would be different in your case)
python main.py --mode test --pooling s1+seqmatch --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-07-46_l1_w1/ --predictionsFile ./data/results/Jun03_16-07-36_l5_0.npz

Acknowledgement

The code in this repository is based on Nanne/pytorch-NetVlad. Thanks to Tobias Fischer for his contributions to this code during the development of our project QVPR/Patch-NetVLAD.

Citation

@article{garg2021seqnet,
  title={SeqNet: Learning Descriptors for Sequence-based Hierarchical Place Recognition},
  author={Garg, Sourav and Milford, Michael},
  journal={IEEE Robotics and Automation Letters},
  volume={6},
  number={3},
  pages={4305-4312},
  year={2021},
  publisher={IEEE},
  doi={10.1109/LRA.2021.3067633}
}

@misc{garg2021seqnetvlad,
  title={SeqNetVLAD vs PointNetVLAD: Image Sequence vs 3D Point Clouds for Day-Night Place Recognition},
  author={Garg, Sourav and Milford, Michael},
  howpublished={CVPR 2021 Workshop on 3D Vision and Robotics (3DVR)},
  month={Jun},
  year={2021},
}

Other Related Projects

Patch-NetVLAD (2021); Delta Descriptors (2020); CoarseHash (2020); seq2single (2019); LoST (2018)

[a] This is the clean version of the dataset, which excludes images from the tunnels and red lights; the exact image names can be obtained from here.

[b] These are automatically saved to ./data/; you can modify this path in download.sh and get_datasets.py to point to your working directory.
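
For reference, the paths used in this README and in the issue logs below imply the following default layout under ./data/ (a rough sketch, not an exhaustive listing):

./data/
  descData/netvlad-pytorch/   # single-image NetVLAD descriptors (*.npy), fetched by download.sh
  runs/                       # training runs and checkpoints, e.g. Jun03_15-22-44_l10_w5/checkpoints/
  results/                    # predictions saved when --resultsPath is specified
  cache/                      # feature cache (cachePath in the options)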

Comments
  • Questions about training and testing SeqNet using other datasets

    Hi! I am working on your SeqNet project and have run into some problems. Currently I can run the code and get recall results using the Nordland and Oxford datasets. Now I am trying to use other datasets such as HILTI: https://hilti-challenge.com/dataset.html . How can I fit this dataset into your program? Should I also generate the .db and .npy files? If my original data format is .jpg, what should my process be? Are the .npy files under the netvlad-pytorch folder the descriptors generated by NetVLAD?

    Thank you for your patience; looking forward to your kind help!

    opened by XinLan12138 12
  • Doubts about the descriptors / img poses from timestamps of img & poses

    Hi! Thanks for your work! I have some questions about the Nordland image descriptors.

    1. The Nordland image descriptors are from NetVLAD. Was the NetVLAD model trained on Nordland?
    2. Which of oxford-pnv and oxford-v1.0 is sampled at a fixed distance?

    Kind regards!

    opened by kaiyi98 5
  • dataset split issues

    Hi @oravus, thanks for your help. I have generated the needed .db files using the Pitts250k dataset. I used the whole dataset to generate descriptors and saved both the query and reference .npy files, that is: ===> Loading dataset(s) All Db descs: (254064, 4096) All Qry descs: (24000, 4096)

    Next, I use the first 10000 images as the training set, the following 3000 images as the validation set, and the next 3000 images as the test set. I also generated 3 .db files: train_mat_file, test_mat_file, and val_mat_file.

    Thereafter, I wrote the specification in get_datasets.py. The indices are defined in the Nordland dataset format: trainInds, testInds, valInds = np.arange(10000), np.arange(10000,13000), np.arange(13000,16000)

    I expected to then be able to test the dataset using your pretrained model, but errors regarding the index still occur.

    //////////////////////////////////////////
    Restored flags: ['--optim', 'SGD', '--lr', '0.0001', '--lrStep', '50', '--lrGamma', '0.5', '--weightDecay', '0.001', '--momentum', '0.9', '--seed', '123', '--runsPath', './data/runs', '--savePath', './data/runs/Jun03_15-22-44_l10_l10_w5_seqnetEnv/checkpoints', '--patience', '0', '--pooling', 'seqnet', '--w', '5', '--outDims', '4096', '--margin', '0.1']
    Namespace(batchSize=16, cacheBatchSize=24, cachePath='./data/cache', cacheRefreshRate=0, ckpt='latest', dataset='pitts250k', descType='netvlad-pytorch', evalEvery=1, expName='0', extractOnly=False, lr=0.0001, lrGamma=0.5, lrStep=50.0, margin=0.1, mode='test', momentum=0.9, msls_trainCity='melbourne', msls_valCity='austin', nEpochs=200, nGPU=1, nocuda=False, numSamples2Project=-1, optim='SGD', outDims=4096, patience=0, pooling='seqnet', predictionsFile=None, resultsPath=None, resume='./data/runs/Jun03_15-22-44_l10_w5/', runsPath='./data/runs', savePath='./data/runs/Jun03_15-22-44_l10_l10_w5_seqnetEnv/checkpoints', seed=123, seqL=5, seqL_filterData=None, split='test', start_epoch=0, threads=8, w=5, weightDecay=0.001)
    ===> Loading dataset(s)
    All Db descs: (254064, 4096)
    All Qry descs: (24000, 4096)
    ===> Evaluating on test set
    ====> Query count: 800
    ===> Building model
    => loading checkpoint './data/runs/Jun03_15-22-44_l10_w5/checkpoints/checkpoint.pth.tar'
    => loaded checkpoint './data/runs/Jun03_15-22-44_l10_w5/checkpoints/checkpoint.pth.tar' (epoch 200)
    ===> Running evaluation step
    ====> Extracting Features
    [tqdm progress bars omitted] ==> Batch (250/250)
    Average batch time: 0.006786982536315918 0.009104941585046229
    torch.Size([3000, 4096]) torch.Size([3000, 4096])
    ====> Building faiss index
    ====> Calculating recall @ N
    Using Localization Radius: 25
    Traceback (most recent call last):
      File "main.py", line 133, in <module>
        recallsOrDesc, dbEmb, qEmb, rAtL, preds = test(opt, model, encoder_dim, device, whole_test_set, writer, epoch, extract_noEval=opt.extractOnly)
      File "/home/lx/lx/Seqnet_new/test.py", line 138, in test
        rAtL.append(getRecallAtN(n_values, predictions, gtAtL))
      File "/home/lx/lx/Seqnet_new/test.py", line 37, in getRecallAtN
        if len(gt[qIx]) == 0:
    IndexError: list index out of range

    /////////////////////////////////////////
    I am writing to ask:

    1. How should the indices be defined? Could you use the Oxford or Nordland dataset to explain the details?
    2. Should the .npy files contain the same number of descriptors as the splits, given that I generated the dataset as 3 .db files for train, test, and val?

    Thanks for your help!

    opened by XinLan12138 5
  • Train sequential descriptors through SeqNet on the MSLS dataset

    Hi! Thanks for your great work on SeqNet. I ran into a problem when training the sequential descriptors on the MSLS dataset. I ran the command: python main.py --mode train --pooling seqnet --dataset msls --msls_trainCity melbourne --msls_valCity austin --seqL 5 --w 3 --outDims 4096 --expName "msls_w3". But there is a problem: No such file or directory: './data/descData/netvlad-pytorch/msls_melbourne_databast.npy'. I could not find any corresponding .npy files for the MSLS dataset. Would you mind providing the corresponding download link for the MSLS dataset? Kind regards!

    opened by fufj 2
  • use SeqNet without SeqMatch

    Hello @oravus ,

    Thanks for your fantastic work providing a nice way to fuse information from multiple frames. I am using only SeqNet (Conv + SAP + L2-Norm) without SeqMatch, trained on my own dataset for LiDAR-based place recognition. However, I found that SeqNet did not work well when the sequence length is small (< 20) and was hard to train into a strong model. So I suspect that whether SeqNet without SeqMatch works well largely depends on the distribution of the training and test sets, or on the output of the raw descriptor generation algorithm. Is this right? Have you encountered a dataset where SeqNet does not work well?

    Best wishes!

    opened by BIT-MJY 2
  • MSLS results

    Hi, I would like to replicate the results on Mapillary. Thank you for adding the code to generate the .npy files containing the descriptors for any dataset. Could you please suggest how to generate the .db files for MSLS and then replicate the results in your paper? Thanks for your help!

    opened by rm-wu 2
  • discussion on sequential descriptor

    Hi, may I ask what the sequential descriptor really encodes?

    As far as I understand, an LSTM can be used to estimate both self-motion and motion-stereo depth.

    Does the LSTM here only encode changes in locomotion, or does it also encode an overall 3D structure prior?

    If it encodes either motion or a map, could I use short-duration odometry as an additional input, or something like a LiDAR-projected depth map, to speed up SeqNet and make it a multi-modality SeqNet?

    opened by snakehaihai 2
  • The original image datasets for training SeqNet

    Hi, thanks for your remarkable work! Could you please provide the original image datasets used for training SeqNet, such as Nordland, Oxford, Brisbane, and MSLS? Thanks.

    opened by jinyummiao 2
  • "seqL_filterData" argument

    Hi! Thanks for your great work! The argument “seqL_filterData” always defaults to None. What does it do, and under what conditions is it used? Kind regards!

    opened by kaiyi98 1
  • The descriptor data for the Brisbane dataset

    Hi! Thanks for your great work on SeqNet. With your kind help in issue #12, which I opened, I have obtained the corresponding descriptor data for the Nordland, Oxford, and MSLS datasets and achieved the same results as the paper. However, the Brisbane dataset is not open-source, so I cannot directly obtain the corresponding descriptors from the original dataset images. I found your reply in issue #1 saying "For the Brisbane dataset, we can only release the descriptor data." Would you mind providing the descriptor data of the Brisbane dataset? I just want to reproduce all the results in the paper and build on this remarkable work. I promise it will be used only for research, never commercially. Kind regards!

    opened by fufj 1