LOST
PyTorch implementation of the unsupervised object discovery method LOST. More details can be found in the paper:
Localizing Objects with Self-Supervised Transformers and no Labels [arXiv]
by Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet and Jean Ponce
If you use the LOST code or framework in your research, please consider citing:
@article{LOST,
title = {Localizing Objects with Self-Supervised Transformers and no Labels},
author = {Oriane Sim\'eoni and Gilles Puy and Huy V. Vo and Simon Roburin and Spyros Gidaris and Andrei Bursuc and Patrick P\'erez and Renaud Marlet and Jean Ponce},
journal = {arXiv preprint arXiv:2109.14279},
month = {09},
year = {2021}
}
Installation
Dependencies
This code was implemented with Python 3.7, PyTorch 1.7.1 and CUDA 10.2. Please install PyTorch first, then install the additional dependencies with the following command:
pip install -r requirements.txt
Install DINO
This method is based on the DINO paper. The framework can be installed using the following commands:
git clone https://github.com/facebookresearch/dino.git
cd dino;
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py; cd ../;
The code was developed using commit ba9edd1 of the DINO repo (please rebase if you encounter breakage).
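The echo -e line above depends on how the shell handles escape sequences, so it may not behave identically everywhere. As a minimal sketch, the same three-line __init__.py can be written from Python instead. The helper name write_dino_init is hypothetical and not part of the LOST or DINO code, and unlike the >> redirection above it overwrites the file rather than appending to it.

```python
from pathlib import Path

# The three lines that the echo command above writes into dino/__init__.py.
INIT_SOURCE = (
    "import sys\n"
    "from os.path import dirname, join\n"
    "sys.path.insert(0, join(dirname(__file__), '.'))\n"
)


def write_dino_init(dino_dir):
    """Write __init__.py into the cloned DINO folder (hypothetical helper).

    The generated file puts the DINO folder itself at the front of
    sys.path whenever the package is imported.
    """
    init_path = Path(dino_dir) / "__init__.py"
    init_path.write_text(INIT_SOURCE)  # overwrites any existing file
    return init_path
```

Calling write_dino_init("dino") from the folder that contains the clone should leave the same file as the shell commands.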
Apply LOST to one image
The following scripts apply LOST to an image defined via the image_path parameter and visualize the predictions (pred), the maps of Figure 2 in the paper (fms) and the seed expansion (seed_expansion). Box predictions are also stored in the output directory given by the output_dir parameter.
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize pred
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize fms
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize seed_expansion
Launching on datasets
The following steps reproduce the results of LOST presented in the paper.
PASCAL-VOC
Please download the PASCAL VOC07 and PASCAL VOC12 datasets (link) and put the data in the folder datasets. There should be two subfolders: datasets/VOC2007 and datasets/VOC2012. To apply LOST and compute CorLoc results (VOC07 61.9, VOC12 64.0), please launch:
python main_lost.py --dataset VOC07 --set trainval
python main_lost.py --dataset VOC12 --set trainval
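For readers unfamiliar with the metric, the sketch below shows one way CorLoc can be computed. It assumes the usual definition: one predicted box per image, counted as correct when its IoU with at least one ground-truth box reaches 0.5. The function names, the (x_min, y_min, x_max, y_max) box format and the use of >= at the threshold are assumptions made for illustration, and the evaluation code in main_lost.py may differ in details.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(predictions, ground_truths, iou_threshold=0.5):
    """Percentage of images whose predicted box matches any ground-truth box.

    predictions: one box per image.
    ground_truths: one list of boxes per image, in the same order.
    The 0.5 threshold and the >= comparison are assumptions.
    """
    if not predictions:
        return 0.0
    hits = sum(
        any(box_iou(pred, gt) >= iou_threshold for gt in gts)
        for pred, gts in zip(predictions, ground_truths)
    )
    return 100.0 * hits / len(predictions)
```

With this definition, a score of 61.9 would mean that the predicted box overlaps an annotated object sufficiently in 61.9% of the evaluated images.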
COCO
Please download the COCO dataset and put the data in datasets/COCO. Results are reported using the 2014 annotations, following previous works. The following command computes results on the 20k-image subset of the COCO dataset (CorLoc 50.7), following previous literature. Note that the 20k images are a subset of the train set.
python main_lost.py --dataset COCO20k --set train
Different models
We have tested the method on different setups of the ViT model. CorLoc results are presented in the following table (more can be found in the paper).
arch | pre-training | VOC07 | VOC12 | COCO20k |
---|---|---|---|---|
ViT-S/16 | DINO | 61.9 | 64.0 | 50.7 |
ViT-S/8 | DINO | 55.5 | 57.0 | 49.5 |
ViT-B/16 | DINO | 60.1 | 63.3 | 50.0 |
ResNet50 | DINO | 36.8 | 42.7 | 26.5 |
ResNet50 | ImageNet | 33.5 | 39.1 | 25.5 |
The results above on the VOC07 dataset can be obtained by launching:
python main_lost.py --dataset VOC07 --set trainval # ViT-S/16
python main_lost.py --dataset VOC07 --set trainval --patch_size 8 # ViT-S/8
python main_lost.py --dataset VOC07 --set trainval --arch vit_base # ViT-B/16
python main_lost.py --dataset VOC07 --set trainval --arch resnet50 # ResNet50/DINO
python main_lost.py --dataset VOC07 --set trainval --arch resnet50_imagenet # ResNet50/ImageNet
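To sweep all five setups without retyping each command, the flags above can be collected in a small table. This is only a sketch: ARCH_FLAGS and build_command are hypothetical names, the flags are copied from the commands above, and combining them with the other dataset options (VOC12 with trainval, COCO20k with train) is assumed to work in the same way.

```python
# Extra flags for each row of the table, copied from the commands above.
ARCH_FLAGS = {
    "ViT-S/16": [],
    "ViT-S/8": ["--patch_size", "8"],
    "ViT-B/16": ["--arch", "vit_base"],
    "ResNet50/DINO": ["--arch", "resnet50"],
    "ResNet50/ImageNet": ["--arch", "resnet50_imagenet"],
}


def build_command(arch_label, dataset="VOC07", split="trainval"):
    """Return the argument list for one table row (hypothetical helper)."""
    base = ["python", "main_lost.py", "--dataset", dataset, "--set", split]
    return base + ARCH_FLAGS[arch_label]


# Print the commands so they can be checked before anything is run.
for label in ARCH_FLAGS:
    print(label + ": " + " ".join(build_command(label)))
```

Each returned list can be passed to subprocess.run as is, which avoids any shell quoting issues.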