Scaling and Benchmarking Self-Supervised Visual Representation Learning

Overview

The FAIR Self-Supervision Benchmark is deprecated. Please see VISSL, a ground-up rewrite of the benchmark in PyTorch.

FAIR Self-Supervision Benchmark

This code provides various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches. It corresponds to our work on Scaling and Benchmarking Self-Supervised Visual Representation Learning. The code is written in Python and can be used to evaluate both PyTorch and Caffe2 models (see this). We hope that this benchmark release will provide a consistent evaluation strategy that makes it easy to measure progress in self-supervision.

Introduction

The goal of fair_self_supervision_benchmark is to standardize the methodology for evaluating the quality of visual representations learned by various self-supervision approaches. It provides evaluation on a variety of tasks, as follows:

Benchmark tasks: The benchmark tasks are based on the principle that a good representation (1) transfers to many different tasks and (2) transfers with limited supervision and limited fine-tuning. The tasks are as follows.

• Image Classification
• Object Detection
• Surface Normal Estimation
• Visual Navigation

These benchmark tasks use the AlexNet and ResNet-50 network architectures.

Legacy tasks: We also classify some commonly used evaluation tasks as legacy tasks, for the reasons discussed in Section 7 of the paper.

License

fair_self_supervision_benchmark is CC-NC 4.0 International licensed, as found in the LICENSE file.

Citation

If you use fair_self_supervision_benchmark in your research or wish to refer to the baseline results published in the paper, please use the following BibTeX entry.

@article{goyal2019scaling,
  title={Scaling and Benchmarking Self-Supervised Visual Representation Learning},
  author={Goyal, Priya and Mahajan, Dhruv and Gupta, Abhinav and Misra, Ishan},
  journal={arXiv preprint arXiv:1905.01235},
  year={2019}
}

Installation

Please find installation instructions in INSTALL.md.

Getting Started

After installation, please see GETTING_STARTED.md for how to run various benchmark tasks.

Model Zoo

We provide models used in our paper in the MODEL_ZOO.


Comments
  • [feature] Plans to support other pre-trained models?


    Are there any plans to support custom pre-trained models, instead of AlexNet/ResNet-50? As background, I am working on a few new models and would like to evaluate them across all the benchmarks you've put together. Thanks!

    awaiting_user_response 
    opened by david-codaio 4
  • [bug] GPU is not extensively used in feature extraction


    Hello,

    Thank you for sharing this benchmark for evaluating self-supervised learning approaches.

    I followed the INSTALL.md file to set up the benchmark tool successfully, then followed this README.md file to download the datasets and set up the file hierarchies accordingly, and finally used extra_scripts/README.md to produce image/label lists for each dataset as expected. So far so good.

    I simply want to extract COCO2014 features from a pre-trained model, for instance AlexNet-In1K. To do that, I first change NUM_DEVICES to 1 in the caffenet_bvlc_supervised_extract_features.yaml file, since I have only 1 GPU, and then run the following command, which I took from GETTING_STARTED.md:

    python tools/extract_features.py \
        --config_file configs/benchmark_tasks/image_classification/coco2014/caffenet_bvlc_supervised_extract_features.yaml \
        --data_type train \
        --output_file_prefix trainval \
        --output_dir /tmp/ssl-benchmark-output/extract_features/weights_init \
        TEST.PARAMS_FILE https://dl.fbaipublicfiles.com/fair_self_supervision_benchmark/models/caffenet_bvlc_in1k_supervised.npy \
        TRAIN.DATA_FILE /tmp/ssl-benchmark-output/coco/train_images.npy \
        TRAIN.LABELS_FILE /tmp/ssl-benchmark-output/coco/train_labels.npy
    

    While extracting features, it seems that GPU utilization is quite low, mostly less than 5%, while CPU load stays at 100% almost all the time. I believe something is wrong in the device selection. Could you please help me resolve this issue?
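
    One way to narrow this down is to check whether the Caffe2 build can see the GPU at all. The sketch below is only a diagnostic, assuming a standard Caffe2 install as per INSTALL.md; it is not part of the benchmark:

    ```python
    # Diagnostic sketch: if this reports 0 CUDA devices (or no GPU support),
    # the extraction ops are being created on CPU, which would explain the
    # low GPU utilization.
    from caffe2.python import workspace

    print("Caffe2 built with GPU support:", workspace.has_gpu_support)
    print("Visible CUDA devices:", workspace.NumCudaDevices())
    ```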

    I see the same thing when I try this on machines with:

    • Ubuntu-16.04, CUDA-9, pytorch-1.0
    • CentOs-7, CUDA-10, pytorch-1.0

    Bulent

    awaiting_user_response 
    opened by mbsariyildiz 4
  • [bug] Missing make_SN_labels.py?


    The script make_SN_labels.py referenced in the Surface Normal Estimation README doesn't seem to exist in either the repositories or the data download. I could be mistaken, but I haven't been able to find it after over an hour of searching. Thanks!

    bug 
    opened by BramSW 2
  • [bug] Following few-shot recipe does not seem to work at "testing SVM" stage


    Hello - thanks for providing the benchmarking code =)

    I've been trying to reproduce the Places205 few-shot results. I've followed the various sections of the READMEs and run into a few inconsistencies in file names, which I've tried to resolve sensibly, and eventually got everything working (I think) up until the Step 4: Testing SVM stage of the GETTING_STARTED.md guide.

    Here's everything I've done, with possibly relevant inconsistencies in bold:

    1. Downloaded and renamed places205 in the desired format
    2. Converted places205 to numpy files
    python extra_scripts/create_imagenet_data_files.py --data_source_dir places205 --output_dir places205-processed
    
    3. Generated the 5 samples at different k values
    python extra_scripts/create_places_low_shot_samples.py \
        --images_data_file ssl-benchmark-output/places205/train_images.npy \
        --targets_data_file ssl-benchmark-output/places205/train_labels.npy \
        --output_path ssl-benchmark-output/places205/low_shot/ \
        --k_values "1,2,4,8,16,32,64,96" \
        --num_samples 5
    
    4. Extracted the training and test features with the following two commands
    # TRAIN
    python tools/extract_features.py \
        --config_file configs/benchmark_tasks/low_shot_image_classification/places205/resnet50_supervised_low_shot_extract_features.yaml \
        --data_type train \
        --output_file_prefix trainval \
        --output_dir ssl-benchmark-output/extract_features/weights_init \
        TEST.PARAMS_FILE https://dl.fbaipublicfiles.com/fair_self_supervision_benchmark/models/resnet50_in1k_supervised.pkl \
        TRAIN.DATA_FILE ssl-benchmark-output/places205/train_images_sample4_k1.npy \
        TRAIN.LABELS_FILE ssl-benchmark-output/places205/train_labels_sample4_k1.npy
    # TEST
    python tools/extract_features.py  \
      --config_file configs/benchmark_tasks/low_shot_image_classification/places205/resnet50_supervised_low_shot_extract_features.yaml \
      --data_type test \
      --output_file_prefix test \
      --output_dir ssl-benchmark-output/extract_features/weights_init \
      TEST.PARAMS_FILE https://dl.fbaipublicfiles.com/fair_self_supervision_benchmark/models/resnet50_in1k_supervised.pkl \ 
      TEST.DATA_FILE places205-processed/val_images.npy \
      TEST.LABELS_FILE places205-processed/val_labels.npy
    
    5. Train the SVM. I had to change the --data_file and --targets_data_file args to not include the s0 part in the file path.
    python tools/svm/train_svm_low_shot.py \
      --data_file ssl-benchmark-output/extract_features/weights_init/trainval_res_conv1_bn_resize_features.npy \
      --targets_data_file ssl-benchmark-output/extract_features/weights_init/trainval_res_conv1_bn_resize_targets.npy \ 
      --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0"  \
      --output_path ssl-benchmark-output/p205_svm_low_shot/svm_conv1/
    
    6. Test the SVM. I had to change the --data_file and --targets_data_file args to not include the s0 part in the file path, and also set k_values and sample_inds to only work with the first set of features trained in the previous command; there doesn't seem to be an option to test them all at once.
    python tools/svm/test_svm_low_shot.py \
      --data_file ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_features.npy \
      --targets_data_file ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_targets.npy \
      --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
      --output_path ssl-benchmark-output/p205_svm_low_shot/svm_conv1/ \
      --k_values "4" \
      --sample_inds "0"
    

    The first error was:

    [INFO: test_svm_low_shot.py:  188]: Namespace(costs_list='0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0', data_file='ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_features.npy', dataset='voc', generate_json=0, json_targets=None, k_values='4', output_path='ssl-benchmark-output/p205_svm_low_shot/svm_conv1/', sample_inds='1', targets_data_file='ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_targets.npy')
    [INFO: test_svm_low_shot.py:   80]: Testing svm for k-values: [4] and sample_inds: [0]
    [INFO: svm_helper.py:   58]: loading features and targets...
    [INFO: svm_helper.py:   63]: Loaded features: (20500, 9216) and targets: (20500, 1)
    [INFO: test_svm_low_shot.py:   98]: Testing SVM for costs: [1e-07, 1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06]
    [INFO: svm_helper.py:  144]: Testing SVM for classes: range(0, 1)
    [INFO: svm_helper.py:  145]: Num classes: 1
    [INFO: test_svm_low_shot.py:  125]: Test sample/k_value/cost/cls: 2/4/1e-07/0
    Traceback (most recent call last):
      File "tools/svm/test_svm_low_shot.py", line 193, in <module>
        main()
      File "tools/svm/test_svm_low_shot.py", line 189, in main
        test_svm_low_shot(opts)
      File "tools/svm/test_svm_low_shot.py", line 128, in test_svm_low_shot
        with open(model_file, 'rb') as fopen:
    FileNotFoundError: [Errno 2] No such file or directory: 'ssl-benchmark-output/p205_svm_low_shot/svm_conv1/cls0_cost1e-07_sample1_k4.pickle'
    

    Looking in ssl-benchmark-output/p205_svm_low_shot/svm_conv1/, that file isn't there, but this one is: cls0_cost1e-07_sample1_k4.pickle, so I renamed it to cls0_cost1e-07_sample1_k4.pickle and reran the script. Output this time:

    [INFO: test_svm_low_shot.py:   80]: Testing svm for k-values: [4] and sample_inds: [0]
    [INFO: svm_helper.py:   58]: loading features and targets...
    [INFO: svm_helper.py:   63]: Loaded features: (20500, 9216) and targets: (20500, 1)
    [INFO: test_svm_low_shot.py:   98]: Testing SVM for costs: [1e-07, 1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06]
    [INFO: svm_helper.py:  144]: Testing SVM for classes: range(0, 1)
    [INFO: svm_helper.py:  145]: Num classes: 1
    [INFO: test_svm_low_shot.py:  125]: Test sample/k_value/cost/cls: 1/4/1e-07/0
    Traceback (most recent call last):
      File "tools/svm/test_svm_low_shot.py", line 193, in <module>
        main()
      File "tools/svm/test_svm_low_shot.py", line 189, in main
        test_svm_low_shot(opts)
      File "tools/svm/test_svm_low_shot.py", line 138, in test_svm_low_shot
        eval_cls_labels, eval_preds
      File "/rscratch/cjrd/dul-project/deepul_proj_2/fair_self_supervision_benchmark/tools/svm/svm_helper.py", line 99, in get_precision_recall
        preds[:, np.newaxis].astype(np.float64)
      File "<__array_function__ internals>", line 6, in hstack
      File "/rscratch/cjrd/anaconda3/envs/ssl/lib/python3.7/site-packages/numpy/core/shape_base.py", line 345, in hstack
        return _nx.concatenate(arrs, 1)
      File "<__array_function__ internals>", line 6, in concatenate
    ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 3 dimension(s)
    

    I tried fiddling with the numpy dimensions for the given problem, but it just seems to create more problems downstream.
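
    For what it's worth, the shape mismatch can be reproduced in isolation with stand-in arrays. This is illustrative only; the array sizes below are taken from the log above, and this is not a fix from the maintainers:

    ```python
    # The ValueError means one array entering np.hstack inside
    # get_precision_recall is 3-D while the other is 2-D. Flattening the
    # trailing axes makes the stack succeed, which helps confirm which
    # array carries the stray dimension.
    import numpy as np

    cls_labels = np.zeros((20500, 1))      # stand-in for eval_cls_labels (2-D)
    preds = np.zeros((20500, 1, 1))        # stand-in for eval_preds (3-D)

    # np.hstack((cls_labels, preds)) would raise the same ValueError.
    stacked = np.hstack((cls_labels.astype(np.float64),
                         preds.reshape(preds.shape[0], -1).astype(np.float64)))
    print(stacked.shape)                   # (20500, 2) once both arrays are 2-D
    ```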

    Any help would be very appreciated. Thank you!

    opened by cjrd 1
  • Can't get low accuracy for randomly extracted features


    I am trying to get <10% accuracy on VOC using randomly extracted features, but I can't seem to do so. Running the following commands on a clean version of the repository gives me a mean AP of 0.514. Am I doing this correctly?

    python extra_scripts/create_voc_data_files.py \
        --data_source_dir /home/brenta/scratch/data/VOC2007/ \
        --output_dir voc07
    
    python tools/extract_features.py \
        --config_file configs/benchmark_tasks/image_classification/voc07/caffenet_bvlc_random_extract_features.yaml \
        --data_type train \
        --output_file_prefix trainval \
        --output_dir extract_features/random \
        TRAIN.DATA_FILE voc07/train_images.npy \
        TRAIN.LABELS_FILE voc07/train_labels.npy
    
    python tools/svm/train_svm_kfold.py \
        --data_file extract_features/random/trainval_conv1_s4k19_resize_features.npy \
        --targets_data_file extract_features/random/trainval_conv1_s4k19_resize_targets.npy \
        --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
        --output_path voc07_svm/svm_conv1/
    
    python tools/svm/test_svm.py \
        --data_file extract_features/random/trainval_conv1_s4k19_resize_features.npy \
        --targets_data_file extract_features/random/trainval_conv1_s4k19_resize_targets.npy \
        --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
        --output_path voc07_svm/svm_conv1/
    
    opened by jasonwei20 1
  • Size of extracted features


    Could you point me to where the code resizes the extracted features?

    For ResNet-50, I got 9216 extracted features after layers 1, 2, and 4, and 8192 features after layers 3 and 5. Where do these numbers come from?
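
    For intuition only (a hedged sketch with assumed pooling sizes, not the repository's exact resizing code): fixed-size vectors in that range typically come from spatially average-pooling each layer's feature map to a small grid and flattening, e.g. a 1024-channel map pooled to 3x3 gives 1024 x 9 = 9216 values, and a 2048-channel map pooled to 2x2 gives 2048 x 4 = 8192.

    ```python
    # Illustrative pooling sketch (pool sizes are assumptions, not the
    # benchmark's exact configuration).
    import torch
    import torch.nn as nn

    res4 = torch.randn(1, 1024, 14, 14)   # example ResNet-50 res4 output
    res5 = torch.randn(1, 2048, 7, 7)     # example ResNet-50 res5 output

    print(nn.AdaptiveAvgPool2d((3, 3))(res4).flatten(1).shape)  # [1, 9216]
    print(nn.AdaptiveAvgPool2d((2, 2))(res5).flatten(1).shape)  # [1, 8192]
    ```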

    Thank you, and apologies if this is a silly question!

    opened by jasonwei20 1
  • Evaluating benchmark on simple imagenet pretrained model loaded from PyTorch


    I'd like to plug in the default PyTorch ResNet model with ImageNet pretraining to see how it does on the dataset. I'm using the default torchvision.models.resnet.ResNet as my model, but I don't know how to save it in the right format to be read in. In other words, your functions save_model_params(model, params_file, checkpoint_dir, model_iter) and checkpoints.load_model_from_params_file(model) take in a model that seems to be a ModelBuilder class, and I'd like to load a model of the class torchvision.models.resnet.ResNet.

    Any idea on what the best way to do this is?
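
    As a rough starting point (the blob naming below is a guess, not the benchmark's documented format), the torchvision weights can be dumped to a pickle of numpy arrays, which is the general shape of the .pkl files passed to TEST.PARAMS_FILE; the remaining work is renaming the keys to whatever blob names the ModelBuilder expects:

    ```python
    # Hypothetical conversion sketch: serialize a torchvision ResNet-50 as a
    # dict of numpy arrays. Key names still need to be mapped to the
    # benchmark's blob-naming convention before they can be loaded.
    import pickle
    import torchvision

    model = torchvision.models.resnet50(pretrained=True)
    blobs = {name: tensor.detach().cpu().numpy()
             for name, tensor in model.state_dict().items()}

    with open("resnet50_torchvision.pkl", "wb") as f:
        pickle.dump({"blobs": blobs}, f, protocol=2)
    ```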

    Thanks, Jason

    opened by jasonwei20 1
  • Pretraining using a custom subset of Imagenet.


    Thanks for sharing the code and paper!

    I am looking to perform pre-training using the Jigsaw (ResNet-50) pretext method, just like what was described in the paper (Table 2). But instead of using ImageNet-1K, I would like to pretrain with an ImageNet subset of 10 classes. I also have only one GPU available.

    My question:

    1. Is the code for pre-training using Jigsaw (ResNet-50) available? I ask because I only see code and commands for the Benchmark Tasks and Legacy Tasks. If pre-training code is available for Jigsaw (ResNet-50), please indicate where I can find it.
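
    For the 10-class subset part, a minimal sketch (paths and class selection are assumptions; it only presumes the .npy image/label list format produced by extra_scripts) would be to filter the existing lists before training:

    ```python
    # Hedged sketch: keep only images whose label is in the first 10 classes.
    # Add allow_pickle=True to np.load if your numpy version requires it.
    import numpy as np

    images = np.load("ssl-benchmark-output/imagenet1k/train_images.npy")  # adjust path
    labels = np.load("ssl-benchmark-output/imagenet1k/train_labels.npy")  # adjust path

    keep = np.isin(labels, np.arange(10)).reshape(-1)
    np.save("train_images_subset10.npy", images[keep])
    np.save("train_labels_subset10.npy", labels[keep])
    ```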

    Many thanks, and looking forward to your prompt reply!

    enhancement 
    opened by chho-work 1
  • Preprocessing COCO - 'minival', 'valminusminival'


    What are minival and valminusminival for the COCO dataset? My COCO14 annotations folder looks like this:

    captions_train2014.json  instances_train2014.json  person_keypoints_train2014.json
    captions_val2014.json    instances_val2014.json    person_keypoints_val2014.json
    

    Therefore I get an error when I try to preprocess COCO by running extra_scripts/create_coco_data_files.py:

    partitions = ['val', 'train', 'minival', 'valminusminival']
    for partition in partitions:
    

    Do I need minival and valminusminival?
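
    For context, minival and valminusminival are custom re-splits of val2014 (commonly used for detection) and are not part of the standard COCO-2014 annotation download. As a workaround sketch (not an official fix; the annotation path is a placeholder), the script's partition list can be restricted to the files that are actually present:

    ```python
    # Hedged workaround: only process partitions whose annotation files exist.
    import os

    ann_dir = "/path/to/coco/annotations"   # placeholder: your annotations dir
    partitions = ['val', 'train', 'minival', 'valminusminival']
    available = [p for p in partitions
                 if os.path.isfile(os.path.join(ann_dir, 'instances_%s2014.json' % p))]
    print("processing partitions:", available)
    ```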

    opened by jasonwei20 1
  • Where did you apply Jigsaw when pre-training Faster-RCNN?


    Hello,

    I'm trying to understand where you applied the Jigsaw augmentation for object detection:

    • To the original image, before the backbone
    • To the regions proposed by the RPN, i.e., on a feature map

    Thanks in advance! And have a nice day

    opened by DarioRugg 0
  • Is there a simple demo file, containing code to get the visual representation of a single image?


    Is there some way we can get a simple demo file that can be used to get the visual representation of a single image? Something like:

    python3 demo.py --model=model_name --image=/path/to/image --pretrained=/path/to/pretrained/weights
    

    This would return the visual representation of the image according to the given model. It would be very helpful for beginners like me to just play around with representations.
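
    In the meantime, a minimal stand-alone sketch along those lines (it uses a plain torchvision ResNet-50 rather than a checkpoint from this benchmark, whose weight format differs) could look like:

    ```python
    # Demo sketch: extract a single 2048-d feature vector for one image.
    import torch
    import torchvision
    from torchvision import transforms
    from PIL import Image

    model = torchvision.models.resnet50(pretrained=True)
    model.fc = torch.nn.Identity()        # drop the classification head
    model.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    img = Image.open("/path/to/image.jpg").convert("RGB")
    with torch.no_grad():
        feature = model(preprocess(img).unsqueeze(0))
    print(feature.shape)                  # torch.Size([1, 2048])
    ```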

    Thanks!

    opened by qureshinomaan 0