Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

Overview

We revisit and address issues with the Oxford 5k and Paris 6k image retrieval benchmarks. New annotation for both datasets is created, with extra attention to the reliability of the ground truth, and three new evaluation protocols of varying difficulty are introduced. We additionally introduce 15 new challenging queries per dataset and a new set of 1M hard distractors.

This package provides support in downloading and using the new benchmark.

MATLAB

Tested with MATLAB R2017a on Debian 8.1.

Process images

This example script first downloads the dataset images and the revisited annotation files. It then shows how to read and process database images, and how to read, crop, and process query images:

>> example_process_images

Similarly, this example script first downloads the one million images of the revisited distractor set (this can take a while). It then shows how to read and process these images:

>> example_process_distractors

Evaluate results

Example script that describes how to evaluate according to the revisited annotation and the three protocol setups:

>> example_evaluate

It automatically downloads dataset images, the revisited annotation file, and example features (R-[37]-GeM from the paper) to be used in the evaluation. The final output should look like this (depending on the selected test_dataset):

>> roxford5k: mAP E: 84.81, M: 64.67, H: 38.47
>> roxford5k: mP@k[1 5 10] E: [97.06 92.06 86.49], M: [97.14 90.67 84.67], H: [81.43 63.00 53.00]

or

>> rparis6k: mAP E: 92.12, M: 77.20, H: 56.32
>> rparis6k: mP@k[1 5 10] E: [100.00 97.14 96.14], M: [100.00 98.86 98.14], H: [94.29 90.29 89.14]

Python

Tested with Python 3.5.3 on Debian 8.1.

Process images

This example script first downloads the dataset images and the revisited annotation files. It then shows how to read and process database images, and how to read, crop, and process query images:

>> python3 example_process_images.py
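
For orientation, the sketch below shows roughly what the database and query loops look like in Python. It assumes the package's configdataset helper (python/dataset.py) and a simple PIL loader; the query-side field names ('nq', 'qim_fname') and the 'bbx' crop box in 'gnd' follow the standard revisited annotation layout and are stated here as assumptions, not a verbatim copy of the example script.

import os
import numpy as np
from PIL import Image

from dataset import configdataset  # provided in python/dataset.py of this package

def pil_loader(path):
    # open the image file and force decoding before the file is closed
    with open(path, 'rb') as f:
        return Image.open(f).convert('RGB')

data_root = 'data'  # hypothetical data folder; the example script uses its own default
cfg = configdataset('roxford5k', os.path.join(data_root, 'datasets'))

# database images: load and process one by one
for i in np.arange(cfg['n']):
    im = pil_loader(cfg['im_fname'](cfg, i))
    # ... extract features for im here ...

# query images: crop with the annotated bounding box before processing
for i in np.arange(cfg['nq']):
    qim = pil_loader(cfg['qim_fname'](cfg, i)).crop(tuple(cfg['gnd'][i]['bbx']))
    # ... extract features for qim here ...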

Similarly, this example script first downloads the one million images of the revisited distractor set (this can take a while). It then shows how to read and process these images:

>> python3 example_process_distractors.py

Evaluate results

Example script that describes how to evaluate according to the revisited annotation and the three protocol setups:

>> python3 example_evaluate.py

It automatically downloads the dataset images, the revisited annotation file, and example features (R-[37]-GeM from the paper) to be used in the evaluation. The final output should look like this (depending on the selected test_dataset):

>> roxford5k: mAP E: 84.81, M: 64.67, H: 38.47
>> roxford5k: mP@k[ 1  5 10] E: [97.06 92.06 86.49], M: [97.14 90.67 84.67], H: [81.43 63.   53.  ]

or

>> rparis6k: mAP E: 92.12, M: 77.2, H: 56.32
>> rparis6k: mP@k[ 1  5 10] E: [100.    97.14  96.14], M: [100.    98.86  98.14], H: [94.29 90.29 89.14]
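
For reference, the three setups combine the easy/hard/junk lists of the revisited ground truth differently: Easy scores only easy images as positive and ignores hard ones, Medium scores easy and hard images as positive, and Hard scores only hard images and ignores easy ones. The sketch below shows one way to compute the three mAP values, assuming the compute_map helper from python/evaluate.py; the random descriptors X and Q are placeholders for features produced by your own model.

import os
import numpy as np

from dataset import configdataset
from evaluate import compute_map  # assumed helper from python/evaluate.py

cfg = configdataset('roxford5k', os.path.join('data', 'datasets'))
gnd = cfg['gnd']

# placeholder descriptors; replace with your own (D x n database, D x nq queries)
D = 2048
X = np.random.randn(D, cfg['n'])
X /= np.linalg.norm(X, axis=0)
Q = np.random.randn(D, cfg['nq'])
Q /= np.linalg.norm(Q, axis=0)

sim = np.dot(X.T, Q)               # cosine similarity of L2-normalized descriptors
ranks = np.argsort(-sim, axis=0)   # ranks[:, q] = database ids sorted for query q

def setup(positive, junk):
    # per-query ground truth for one difficulty setup
    return [{'ok': np.concatenate([np.array(g[k]) for k in positive]),
             'junk': np.concatenate([np.array(g[k]) for k in junk])} for g in gnd]

ks = [1, 5, 10]
mapE, _, mprE, _ = compute_map(ranks, setup(['easy'], ['junk', 'hard']), ks)
mapM, _, mprM, _ = compute_map(ranks, setup(['easy', 'hard'], ['junk']), ks)
mapH, _, mprH, _ = compute_map(ranks, setup(['hard'], ['junk', 'easy']), ks)
print('mAP E: {:.2f}, M: {:.2f}, H: {:.2f}'.format(100 * mapE, 100 * mapM, 100 * mapH))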

Related publication

@inproceedings{RITAC18,
 author = {Radenovi\'{c}, F. and Iscen, A. and Tolias, G. and Avrithis, Y. and Chum, O.},
 title = {Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking},
 booktitle = {CVPR},
 year = {2018}
}
Comments
  • A quick question about R-[37]-GEM+DFS performance on ROxf and RPar

    Hi Filip,

    Thank you for making your code and dataset available! It's really good.

    I have one quick question. I tested the "R-[37]-GEM+DFS" approach on both the ROxf and RPar datasets using the repository code below, and found that it performs better than the numbers you reported in your Revisiting paper published at CVPR 2018.

    https://github.com/ducha-aiki/manifold-diffusion

    Here are the evaluation results.

    Plain
    roxford5k: mAP E: 84.81, M: 64.67, H: 38.47
    roxford5k: mP@k[ 1 5 10] E: [97.06 92.06 86.49], M: [97.14 90.67 84.67], H: [81.43 63. 53. ]

    Diffusion cg
    roxford5k: mAP E: 88.36, M: 74.4, H: 50.57
    roxford5k: mP@k[ 1 5 10] E: [95.59 93.82 91.39], M: [95.71 92.86 88.14], H: [88.57 70.29 61.29]

    Diffusion truncated
    roxford5k: mAP E: 88.18, M: 74.33, H: 50.7
    roxford5k: mP@k[ 1 5 10] E: [95.59 93.82 91.39], M: [95.71 92.86 88.14], H: [88.57 70.29 61.29]

    Spectral R=2000
    roxford5k: mAP E: 86.5, M: 72.0, H: 45.7
    roxford5k: mP@k[ 1 5 10] E: [94.12 92.65 91.3 ], M: [94.29 91.14 87.86], H: [81.43 72.43 61.29]

    Plain
    rparis6k: mAP E: 92.12, M: 77.2, H: 56.32
    rparis6k: mP@k[ 1 5 10] E: [100. 97.14 96.14], M: [100. 98.86 98.14], H: [94.29 90.29 89.14]

    Diffusion cg
    rparis6k: mAP E: 94.72, M: 89.63, H: 80.01
    rparis6k: mP@k[ 1 5 10] E: [100. 97.43 95.86], M: [100. 99.43 98.57], H: [95.71 91.14 91.43]

    Diffusion truncated
    rparis6k: mAP E: 94.73, M: 89.64, H: 80.03
    rparis6k: mP@k[ 1 5 10] E: [100. 97.43 95.86], M: [100. 99.43 98.57], H: [95.71 91.14 91.29]

    Spectral R=2000
    rparis6k: mAP E: 93.61, M: 85.42, H: 73.31
    rparis6k: mP@k[ 1 5 10] E: [98.57 96.29 95.95], M: [100. 98.86 98.86], H: [92.86 90.86 91.71]

    Could you please let me know why there is a difference between the results you reported and the results I got using this code?

    Thank you.

    Best, Jangwon

    question 
    opened by leejang 4
  • Is there a plan to release the rest of the evaluations?

    I appreciate you making this dataset available.

    I was wondering if you plan to release the other evaluation codebases to the community. Currently, I see that R-[37]-GeM is released. Is there any plan to release the others?

    If not, could you perhaps release DELF-ASMK*+SP, HesAff-rSIFT-ASMK*+SP, and R-[10]-R-MAC+DFS?

    question 
    opened by msharmavikram 4
  • revisitop1m loading bug

    Hello @filipradenovic,

    Thank you for providing the tool for accessing Oxford5k, Paris6k, and the revisited ones. However, there is a bug in loading the distractors in the example file 'example_process_distractors.py'.

    # Error
    Traceback (most recent call last):
      File "example_process_distractors.py", line 42, in <module>
        cfg = configdataset(distractors_dataset, os.path.join(data_root, 'datasets'))
      File "/store/tsunyi/revisitop-master/python/dataset.py", line 11, in configdataset
        raise ValueError('Unknown dataset: {}!'.format(dataset))
    ValueError: Unknown dataset: revisitop1m!
    

    This is because 'configdataset' is used to get the image names.

    # config file for the dataset
    cfg = configdataset(distractors_dataset, os.path.join(data_root, 'datasets'))
    
    for i in np.arange(cfg['n']):
        im = pil_loader(cfg['im_fname'](cfg, i))
        ##------------------------------------------------------
        ## Perform image processing here, eg, feature extraction
        ##------------------------------------------------------
        print('>> {}: Processing image {}'.format(distractors_dataset, i+1))
    

    However, revisitop1m does not share the same structure as ROxford5k and RParis6k, and it does not need labels either.

    Therefore, I provide an alternative way to load the images:

    import glob
    from tqdm import tqdm
    
    img_list = glob.glob('../data/datasets/revisitop1m/**/*.jpg', recursive=True)
    
    for filename in tqdm(img_list):
    
        im = pil_loader(filename)
        ##------------------------------------------------------
        ## Perform image processing here, eg, feature extraction
        ##------------------------------------------------------
    

    Thank you again for such an amazing project.

    bug 
    opened by shamangary 1
  • Where are variables 'Q', 'X'?

    I can see that the datasets have not been updated for a long time. But where is the pre-trained network .mat file? I have checked every link and cannot find it. When I loaded the file from the FTP folder, it did not contain the X and Q variables. Where are they? Any help is appreciated.

    opened by farsab 0
  • Training protocol for RevisitOP

    Hi Filip, I am new to these datasets, hence the confusion. Usually we have training data with landmark IDs and their ground truth (positive samples), and then separate query images with corresponding positive samples for evaluation.

    In datasets like Oxford5k or ROxford5k, I find the landmark images and other images for that landmark. For example, everything starting with all_souls corresponds to the all_souls building. But when I look at other images containing this tag, they contain people and indoor images which are possibly junk. In the gt files, I see the structure all_souls_1_query.txt, all_souls_1_good.txt, all_souls_1_ok.txt, all_souls_1_junk.txt, and so on.

    While training, do we use all_souls_1_good.txt and all_souls_1_ok.txt and ignore the junk, and avoid using all_souls_1_query.txt?

    I just want to make sure I understand the standard practice for training on this dataset and evaluating properly. I also looked into the PhD thesis you linked in another issue, but I am unable to grasp this training protocol.

    Thanks a lot for your patience.

    opened by sonamsingh19 1
  • Distractor images download fail

    Hi,

    I used the Python script to download the distractor images (1M), but the download failed at the first file. I also changed the start id and tried again, and it still failed. Could you please help check this, or is there another link from which I can download the distractor images? Thank you very much.

    opened by zhangleai 1
  • Throw error when pos is empty

    @filipradenovic Hi filipradenovic, I found a bug when pos is empty in evaluate.py#L80. If there is no positive result in the ranking, for example at rank@10, then pos is empty and an error is thrown. How would you fix this bug? I'm currently trying to understand the mAP computation in evaluate.py.

    Thanks.

    opened by willard-yuan 0
  • Questions about ASMK* aggregation time

    Hi @filipradenovic, I'm quite interested in your paper "Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking". Table 4 in your paper says that DELF feature extraction plus ASMK* aggregation on GPU costs 0.41s per 1024x768 image, right? So I'm curious about the time of feature extraction and aggregation separately. Do you remember how much time it takes to aggregate (ASMK*) a 1024x768 image on GPU and on CPU? And can you provide the code for ASMK* aggregation (GPU), or tell me how I can get access to it?

    Looking forward to your reply. Thank you in advance!

    question 
    opened by Jemmagu 3
  • Possible bug in AP calculation

    Thanks for making your work available.

    This is a report for a possible bug in computation of AP.

    Let's say that I have ranks of [0, 3, 4, 5]. If I pass them to compute_ap() with nres = 4 and inspect precision_0 and precision_1 in the function, their values in the for-loop are:

    rank 0: (precision_0, precision_1) = (1.000, 1.000)
    rank 3: (precision_0, precision_1) = (0.333, 0.500)
    rank 4: (precision_0, precision_1) = (0.500, 0.600)
    rank 5: (precision_0, precision_1) = (0.600, 0.667)
    

    However, the values at rank 3 (zero-based) look odd to me, because the precision at each of these ranks is 1.0, 0.5 (2/4), 0.6 (3/5), and 0.667 (4/6). Therefore, precision_0 and precision_1 should be 1.0 and 0.5, respectively, at rank 3.

    I could be wrong, but it seems to me that there is a bug in compute_ap().
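
    For context, here is a minimal re-implementation of the trapezoidal AP as I read compute_ap() (not guaranteed to be identical to the repository code); it reproduces exactly the (precision_0, precision_1) pairs listed above. The 0.333 at rank 3 comes from precision_0 = j / rank, i.e. it counts only the j positives retrieved strictly before the current hit:

    import numpy as np

    def compute_ap_sketch(ranks, nres):
        # ranks: zero-based positions of the relevant images, sorted ascending
        # nres: total number of relevant images for this query
        ap = 0.0
        recall_step = 1.0 / nres
        for j, rank in enumerate(ranks):
            precision_0 = 1.0 if rank == 0 else float(j) / rank  # precision just before this hit
            precision_1 = float(j + 1) / (rank + 1)              # precision at this hit
            ap += (precision_0 + precision_1) * recall_step / 2.0
        return ap

    print(compute_ap_sketch(np.array([0, 3, 4, 5]), 4))  # 0.65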

    good first issue 
    opened by tkanmae 2
Owner

Filip Radenovic
Research Scientist at Facebook