Scaling and Benchmarking Self-Supervised Visual Representation Learning


FAIR Self-Supervision Benchmark is deprecated. Please see VISSL, a ground-up rewrite of the benchmark in PyTorch.

FAIR Self-Supervision Benchmark

This code provides various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches. This code corresponds to our work on Scaling and Benchmarking Self-Supervised Visual Representation Learning. The code is written in Python and can be used to evaluate both PyTorch and Caffe2 models (see this). We hope that this benchmark release will provide a consistent evaluation strategy that makes it easy to measure progress in self-supervision.


The goal of fair_self_supervision_benchmark is to standardize the methodology for evaluating the quality of visual representations learned by various self-supervision approaches. It also provides evaluation on a variety of tasks, as follows:

Benchmark tasks: The benchmark tasks are based on the principle that a good representation (1) transfers to many different tasks, and (2) transfers with limited supervision and limited fine-tuning. The tasks are as follows.

  • Image Classification
  • Object Detection
  • Surface Normal Estimation
  • Visual Navigation

These Benchmark tasks use the network architectures:

Legacy tasks: We also classify some commonly used evaluation tasks as legacy tasks for reasons mentioned in Section 7 of the paper. The tasks are as follows:


fair_self_supervision_benchmark is CC-NC 4.0 International licensed, as found in the LICENSE file.


If you use fair_self_supervision_benchmark in your research or wish to refer to the baseline results published in the paper, please use the following BibTeX entry.

  title={Scaling and Benchmarking Self-Supervised Visual Representation Learning},
  author={Goyal, Priya and Mahajan, Dhruv and Gupta, Abhinav and Misra, Ishan},
  journal={arXiv preprint arXiv:1905.01235},


Please find installation instructions in

Getting Started

After installation, please see for how to run various benchmark tasks.

Model Zoo

We provide models used in our paper in the MODEL_ZOO.


  • [feature] Plans to support other pre-trained models?

    Are there any plans to support custom pre-trained models, instead of AlexNet/ResNet-50? As background, I am working on a few new models and would like to evaluate them across all the benchmarks you've put together. Thanks!

    opened by david-codaio 4
  • [bug] GPU is not extensively used in feature extraction


    Thank you for sharing this benchmark for evaluating self-supervised learning approaches.

    I followed the file to set up the benchmark tool successfully. Then I followed this file to download the datasets and set up the file hierarchies accordingly. Finally, I used extra_scripts/ to produce image/label lists for each dataset as expected. So far so good.

    I simply want to extract COCO2014 features from a pre-trained model, for instance AlexNet-In1K. To do that, I first changed NUM_DEVICES to 1 in the caffenet_bvlc_supervised_extract_features.yaml file, since I have only one GPU, and then ran the following command, which I took from

    python tools/ \
        --config_file configs/benchmark_tasks/image_classification/coco2014/caffenet_bvlc_supervised_extract_features.yaml \
        --data_type train \
        --output_file_prefix trainval \
        --output_dir /tmp/ssl-benchmark-output/extract_features/weights_init \
        TEST.PARAMS_FILE \ TRAIN.DATA_FILE /tmp/ssl-benchmark-output/coco/train_images.npy \
        TRAIN.LABELS_FILE /tmp/ssl-benchmark-output/coco/train_labels.npy

    While extracting features, GPU utilization is quite low (mostly below 5%), while CPU load is at 100% almost all the time. I believe that something is wrong with device selection. Could you please help me resolve this issue?

    I see the same behavior on machines with:

    • Ubuntu-16.04, CUDA-9, pytorch-1.0
    • CentOS-7, CUDA-10, pytorch-1.0


    opened by mbsariyildiz 4
  • [bug] Missing

    The script referenced in the Surface Normal Estimation README doesn't seem to exist in either of the repositories or data download. I could be mistaken, but I haven't been able to find it with over an hour of searching. Thanks!

    opened by BramSW 2
  • [bug] Following few-shot recipe does not seem to work at "testing SVM" stage

    Hello - thanks for providing the benchmarking code =)

    I've been trying to reproduce the places205 few-shot results. I've followed the various sections of the READMEs and run into a few inconsistencies in file names, which I've tried to resolve sensibly. I eventually got everything working (I think) up until the Step 4: Testing SVM stage of the guide: step 4 - here

    Here's everything I've done, with possibly relevant inconsistencies in bold:

    1. Downloaded and renamed places205 in the desired format
    2. Converted places205 to numpy files
    python extra_scripts/ --data_source_dir places205 --output_dir places205-processed
    3. Generated the 5 samples at different k values
    python extra_scripts/ \
        --images_data_file ssl-benchmark-output/places205/train_images.npy \
        --targets_data_file ssl-benchmark-output/places205/train_labels.npy \
        --output_path ssl-benchmark-output/places205/low_shot/ \
        --k_values "1,2,4,8,16,32,64,96" \
        --num_samples 5
    4. Extracted the training and test features with the following two commands
    # TRAIN
    python tools/ \
        --config_file configs/benchmark_tasks/low_shot_image_classification/places205/resnet50_supervised_low_shot_extract_features.yaml \
        --data_type train \
        --output_file_prefix trainval \
        --output_dir ssl-benchmark-output/extract_features/weights_init \
        TRAIN.DATA_FILE ssl-benchmark-output/places205/train_images_sample4_k1.npy \
        TRAIN.LABELS_FILE ssl-benchmark-output/places205/train_labels_sample4_k1.npy
    # TEST
    python tools/ \
      --config_file configs/benchmark_tasks/low_shot_image_classification/places205/resnet50_supervised_low_shot_extract_features.yaml \
      --data_type test \
      --output_file_prefix test \
      --output_dir ssl-benchmark-output/extract_features/weights_init \
      TEST.DATA_FILE places205-processed/val_images.npy \
      TEST.LABELS_FILE places205-processed/val_labels.npy
    5. Trained the SVM. I had to change the --data_file and --targets args to not have the s0 part in the filepath
    python tools/svm/ \
      --data_file ssl-benchmark-output/extract_features/weights_init/trainval_res_conv1_bn_resize_features.npy \
      --targets_data_file ssl-benchmark-output/extract_features/weights_init/trainval_res_conv1_bn_resize_targets.npy \
      --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
      --output_path ssl-benchmark-output/p205_svm_low_shot/svm_conv1/
    6. Tested the SVM. I had to change the --data_file and --targets args to not have the s0 part in the filepath, and also set k_values and sample_inds to only work with the first set of features trained in the previous command; there doesn't seem to be an option to train them all at once
    python tools/svm/ \
      --data_file ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_features.npy \
      --targets_data_file ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_targets.npy \
      --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
      --output_path ssl-benchmark-output/p205_svm_low_shot/svm_conv1/ \
      --k_values "4" \
      --sample_inds "0"

    The first error was:

    [INFO:  188]: Namespace(costs_list='0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0', data_file='ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_features.npy', dataset='voc', generate_json=0, json_targets=None, k_values='4', output_path='ssl-benchmark-output/p205_svm_low_shot/svm_conv1/', sample_inds='1', targets_data_file='ssl-benchmark-output/extract_features/weights_init/test_res_conv1_bn_resize_targets.npy')
    [INFO:   80]: Testing svm for k-values: [4] and sample_inds: [0]
    [INFO:   58]: loading features and targets...
    [INFO:   63]: Loaded features: (20500, 9216) and targets: (20500, 1)
    [INFO:   98]: Testing SVM for costs: [1e-07, 1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06]
    [INFO:  144]: Testing SVM for classes: range(0, 1)
    [INFO:  145]: Num classes: 1
    [INFO:  125]: Test sample/k_value/cost/cls: 2/4/1e-07/0
    Traceback (most recent call last):
      File "tools/svm/", line 193, in <module>
      File "tools/svm/", line 189, in main
      File "tools/svm/", line 128, in test_svm_low_shot
        with open(model_file, 'rb') as fopen:
    FileNotFoundError: [Errno 2] No such file or directory: 'ssl-benchmark-output/p205_svm_low_shot/svm_conv1/cls0_cost1e-07_sample1_k4.pickle'

    Looking in ssl-benchmark-output/p205_svm_low_shot/svm_conv1/ that file isn't there, but this one is: cls0_cost1e-07_sample1_k4.pickle, so I renamed it to cls0_cost1e-07_sample1_k4.pickle and reran the script, output this time:
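    When a run fails on a missing model file like this, it can help to list which files the test stage would look for and which of them are absent. The sketch below is only an illustration: the file name pattern is inferred from the FileNotFoundError message above, and the helper names expected_model_files and missing_model_files are hypothetical, not part of the benchmark.

```python
import os
from itertools import product


def expected_model_files(output_path, num_classes, costs, sample_inds, k_values):
    """Build the model file paths a test run might look for.

    The pattern cls{c}_cost{cost}_sample{s}_k{k}.pickle is an assumption
    inferred from the error message, not taken from the benchmark source.
    """
    paths = []
    for cls, cost, sample, k in product(range(num_classes), costs, sample_inds, k_values):
        name = f"cls{cls}_cost{cost}_sample{sample}_k{k}.pickle"
        paths.append(os.path.join(output_path, name))
    return paths


def missing_model_files(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    paths = expected_model_files(
        "ssl-benchmark-output/p205_svm_low_shot/svm_conv1/",
        num_classes=1, costs=[1e-07], sample_inds=[1], k_values=[4],
    )
    for path in missing_model_files(paths):
        print("missing:", path)
```

    Comparing this list against the directory contents shows whether the training and testing stages disagree on the sample index or k value embedded in the file names.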

    [INFO:   80]: Testing svm for k-values: [4] and sample_inds: [0]
    [INFO:   58]: loading features and targets...
    [INFO:   63]: Loaded features: (20500, 9216) and targets: (20500, 1)
    [INFO:   98]: Testing SVM for costs: [1e-07, 1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.000244140625, 0.0001220703125, 6.103515625e-05, 3.0517578125e-05, 1.52587890625e-05, 7.62939453125e-06, 3.814697265625e-06, 1.9073486328125e-06]
    [INFO:  144]: Testing SVM for classes: range(0, 1)
    [INFO:  145]: Num classes: 1
    [INFO:  125]: Test sample/k_value/cost/cls: 1/4/1e-07/0
    Traceback (most recent call last):
      File "tools/svm/", line 193, in <module>
        main()
      File "tools/svm/", line 189, in main
        test_svm_low_shot(opts)
      File "tools/svm/", line 138, in test_svm_low_shot
        eval_cls_labels, eval_preds
      File "/rscratch/cjrd/dul-project/deepul_proj_2/fair_self_supervision_benchmark/tools/svm/", line 99, in get
        preds[:, np.newaxis].astype(np.float64)
      File "<__array_function__ internals>", line 6, in hstack
      File "/rscratch/cjrd/anaconda3/envs/ssl/lib/python3.7/site-packages/numpy/core/", line 345, in hstack
        return _nx.concatenate(arrs, 1)
      File "<__array_function__ internals>", line 6, in concatenate
    ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 3 dimension(s)

    I tried fiddling with the numpy dimensions of the given problem, but it just seems to create more problems downstream.
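    For reference, the same kind of ValueError can be reproduced in isolation with NumPy alone. The sketch below assumes that the predictions array is already 2-D with shape (N, 1) before np.newaxis is applied; that is a guess based on the traceback, not something confirmed from the benchmark code, and the array names are illustrative only.

```python
import numpy as np

# Illustrative stand-ins; the shapes are assumptions based on the traceback.
labels = np.zeros((5, 1))  # 2-D: (num_samples, 1)
preds = np.zeros((5, 1))   # already 2-D, e.g. scores of shape (num_samples, 1)

try:
    # preds[:, np.newaxis] has shape (5, 1, 1), so hstack sees 2-D and 3-D inputs.
    np.hstack((labels, preds[:, np.newaxis]))
except ValueError as err:
    print("hstack failed:", err)

# Reshaping both inputs to a single column makes the dimensions agree.
stacked = np.hstack((labels.reshape(-1, 1), preds.reshape(-1, 1)))
print(stacked.shape)  # (5, 2)
```

    Whether reshaping is the right fix inside the benchmark depends on what shape the later evaluation code expects, so this only narrows down where the mismatch comes from.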

    Any help would be very appreciated. Thank you!

    opened by cjrd 1
  • Can't get low accuracy for randomly extracted features

    I am trying to get <10% accuracy on VOC using randomly extracted features, but I can't seem to do so. Running the following lines on a clean version of the repository gives me a mean AP of 0.514. Am I doing this correctly?

    python extra_scripts/ \
        --data_source_dir /home/brenta/scratch/data/VOC2007/ \
        --output_dir voc07
    python tools/ \
        --config_file configs/benchmark_tasks/image_classification/voc07/caffenet_bvlc_random_extract_features.yaml \
        --data_type train \
        --output_file_prefix trainval \
        --output_dir extract_features/random \
        TRAIN.DATA_FILE voc07/train_images.npy \
        TRAIN.LABELS_FILE voc07/train_labels.npy
    python tools/svm/ \
        --data_file extract_features/random/trainval_conv1_s4k19_resize_features.npy \
        --targets_data_file extract_features/random/trainval_conv1_s4k19_resize_targets.npy \
        --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
        --output_path voc07_svm/svm_conv1/
    python tools/svm/ \
        --data_file extract_features/random/trainval_conv1_s4k19_resize_features.npy \
        --targets_data_file extract_features/random/trainval_conv1_s4k19_resize_targets.npy \
        --costs_list "0.0000001,0.000001,0.00001,0.0001,0.001,0.01,0.1,1.0,10.0,100.0" \
        --output_path voc07_svm/svm_conv1/
    opened by jasonwei20 1
  • Size of extracted features

    Could you point me to where the code resizes the extracted features?

    For ResNet-50, I got 9216 extracted features after layers 1, 2, and 4, and 8192 features after layers 3 and 5. Where do these numbers come from?
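    One way to sanity-check these numbers is to ask which square spatial grid brings each stage's flattened feature size closest to a common target. The sketch below is a guess at the arithmetic, not a pointer to the benchmark's resizing code: the stage names, the target of 9216, and the helper closest_grid are all assumptions made for illustration. Only the channel counts (64, 256, 512, 1024, 2048) are the standard ResNet-50 stage widths.

```python
# Assumed target dimensionality; chosen because 9216 appears in the question.
TARGET_DIM = 9216

# Standard ResNet-50 output channels per stage (stage names are illustrative).
RESNET50_CHANNELS = {"conv1": 64, "res2": 256, "res3": 512, "res4": 1024, "res5": 2048}


def closest_grid(channels, target=TARGET_DIM, max_side=32):
    """Return the side s of the s x s grid whose flattened size
    channels * s * s is closest to the target dimensionality."""
    return min(range(1, max_side + 1), key=lambda s: abs(channels * s * s - target))


sizes = {}
for stage, channels in RESNET50_CHANNELS.items():
    side = closest_grid(channels)
    sizes[stage] = channels * side * side
    print(f"{stage}: {channels} channels x {side}x{side} grid = {sizes[stage]} features")
```

    Under these assumptions the five stages come out as 9216, 9216, 8192, 9216, and 8192, which matches the sizes reported above for layers 1 through 5.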

    Thank you, and apologies if this is a silly question!

    opened by jasonwei20 1
  • Evaluating benchmark on simple imagenet pretrained model loaded from PyTorch

    I'd like to plug in the default PyTorch ResNet model with ImageNet pretraining to see how it does on the dataset. I'm using the default torchvision.models.resnet.ResNet as my model, but I don't know how to save it in the right format to be read in. In other words, your functions save_model_params(model, params_file, checkpoint_dir, model_iter) and checkpoints.load_model_from_params_file(model) take in a model that seems to be a ModelBuilder class, and I'd like to load a model of the class torchvision.models.resnet.ResNet.

    Any idea on what the best way to do this is?

    Thanks, Jason

    opened by jasonwei20 1
  • Pretraining using a custom subset of Imagenet.

    Thanks for sharing the code and paper!

    I am looking to perform pre-training using the Jigsaw (ResNet-50) pretext method, as described in the paper (Table 2). But instead of using ImageNet-1K, I would like to pretrain on an ImageNet subset of 10 classes. I also have only one GPU available.

    My question:

    1. Is the code for pre-training using Jigsaw (ResNet-50) available? I ask because I only see code and commands for the Benchmark Tasks and Legacy Tasks. If pre-training code is available for Jigsaw (ResNet-50), please indicate where I can find it.

    Many thanks, and looking forward to your reply!

    opened by chho-work 1
  • Preprocessing COCO - 'minival', 'valminusminival'

    What are minival and valminusminival for the coco dataset? My COCO14 annotations folder looks like this:

    captions_train2014.json  instances_train2014.json  person_keypoints_train2014.json
    captions_val2014.json    instances_val2014.json    person_keypoints_val2014.json

    Therefore, I get an error when I try to preprocess COCO by running extra_scripts/

    partitions = ['val', 'train', 'minival', 'valminusminival']
    for partition in partitions:

    Do I need minival and valminusminival?
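    A quick way to see which of these partitions can be processed with the annotations at hand is to check for the corresponding files before looping. The sketch below is hypothetical: the helper available_partitions is not part of the repository, and the instances_{partition}2014.json naming is extrapolated from the train and val file names listed above, so the minival names in particular are an assumption.

```python
import os


def available_partitions(annotation_dir, partitions, year="2014"):
    """Return the partitions whose instances annotation file exists.

    Assumes files are named instances_{partition}{year}.json, extrapolated
    from the train/val file names in the annotations folder.
    """
    found = []
    for partition in partitions:
        path = os.path.join(annotation_dir, f"instances_{partition}{year}.json")
        if os.path.isfile(path):
            found.append(partition)
    return found


if __name__ == "__main__":
    partitions = ["val", "train", "minival", "valminusminival"]
    # "annotations" is a placeholder for the local COCO annotations folder.
    print(available_partitions("annotations", partitions))
```

    This only shows which splits are present locally; it does not settle whether the benchmark's results depend on the minival splits.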

    opened by jasonwei20 1
  • When have you applied Jigsaw, for pre-training Faster-RCNN


    I'm trying to understand where you applied the Jigsaw augmentation for object detection:

    • To the original image, before the backbone
    • To the proposed regions, suggested by the RPN, so on a feature map

    Thanks in advance! And have a nice day

    opened by DarioRugg 0
  • Is there a simple demo file, containing code to get the visual representation of a single image?

    Is there some way to get a simple demo file that can be used to get the visual representation of each image? Something like:

    python3 --model=model_name --image=/path/to/image --pretrained=/path/to/pretrained/weights

    This would return the visual representation of the image according to the given model. It would be very helpful for beginners like me who just want to play around with representations.


    opened by qureshinomaan 0
