General Vision Benchmark, a project from OpenGVLab

Overview

Introduction

  • We build GV-B (General Vision Benchmark) on Classification, Detection, Segmentation, and Depth Estimation, covering 26 datasets for model evaluation.
  • We recommend evaluating in the low-data regime, using only 10% of the training data.
  • The parameters of the model backbone are frozen during training, a setting also known as 'linear probe'.
  • Face Detection and Depth Estimation are not provided for now; you may evaluate them via the official repos if needed.
  • Specifically, we use central_model.py in our repo to represent the implementation of Up-G models.
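
The low-data regime mentioned above can be pictured with a small example. This is a minimal sketch, assuming a per-class (stratified) random draw. The function name `sample_low_data_subset` and the stratification are illustrative assumptions, and the benchmark's own 10% splits may be constructed differently.

```python
import random
from collections import defaultdict


def sample_low_data_subset(labels, fraction=0.1, seed=0):
    """Return sorted indices of a per-class random subset of the data.

    Hypothetical helper: it keeps roughly `fraction` of the samples of
    every class (at least one), so rare classes are not dropped.
    """
    rng = random.Random(seed)  # local RNG keeps the draw reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        keep = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, keep))
    return sorted(chosen)


if __name__ == "__main__":
    toy_labels = [0] * 100 + [1] * 50
    subset = sample_low_data_subset(toy_labels, fraction=0.1, seed=0)
    print(len(subset))
```

Fixing the seed keeps the subset identical across runs, which matters when comparing backbones on the same reduced training set.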

Supported Tasks

  • Object Classification
  • Object Detection (VOC Detection)
  • Pedestrian Detection (CityPersons Detection)
  • Semantic Segmentation (VOC Segmentation)
  • Face Detection (WiderFace Detection)
  • Depth Estimation (Kitti/NYU-v2 Depth Estimation)

Installation

Requirements

Install Dependencies

a. Create a conda virtual environment and activate it.

conda create -n open-mmlab python=3.8 -y
conda activate open-mmlab

b. Install PyTorch and torchvision following the official instructions, e.g.:

conda install pytorch torchvision -c pytorch

Make sure that your compilation CUDA version and runtime CUDA version match.
You can check the supported CUDA version for precompiled packages on the
[PyTorch website](https://pytorch.org/).

c. Install the OpenMMLab packages via pip (mmcls, mmdet, mmseg):

pip install mmcls
pip install mmdet
pip install mmsegmentation

Usage

This section provides basic tutorials on the usage of GV-B.

Prepare datasets

For each evaluation task, you can follow the official repo tutorial for data preparation.

mmclassification

mmdetection

mmsegmentation

Model evaluation

We use MIM to submit evaluation in GV-B.

a. If you run MMClassification on a cluster managed with slurm, you can use the script mim_slurm_train.sh. (This script also supports single-machine training.)

sh tools/mim_slurm_train.sh $PARTITION $TASK $CONFIG $WORK_DIR

b. If you run without slurm (more details can be found in the openmim docs):

PYTHONPATH='.':$PYTHONPATH mim train $TASK $CONFIG $WORK_DIR

  • PARTITION: The slurm partition you are using.
  • WORK_DIR: The directory in which to save logs and checkpoints.
  • CONFIG: The config file corresponding to the task.
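
The two launch modes above can also be assembled programmatically. The following is a rough sketch, not part of the repo: the helper names `build_mim_train_command` and `env_with_repo_on_pythonpath` are hypothetical, the argument order simply mirrors the two commands shown above, and the example work directory is made up.

```python
import os


def build_mim_train_command(task, config, work_dir, partition=None):
    """Build the argument list for one of the two launch modes above.

    Hypothetical helper: with a partition it mirrors the slurm script
    call, otherwise the plain `mim train` call.
    """
    if partition is not None:
        return ["sh", "tools/mim_slurm_train.sh", partition, task, config, work_dir]
    return ["mim", "train", task, config, work_dir]


def env_with_repo_on_pythonpath(repo_root=".", base_env=None):
    """Copy the environment and prepend the repo root to PYTHONPATH.

    This is the portable equivalent of PYTHONPATH='.':$PYTHONPATH.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = repo_root if not existing else repo_root + os.pathsep + existing
    return env


if __name__ == "__main__":
    cmd = build_mim_train_command(
        task="mmcls",
        config="configs/cls/linear_probe/mnb4_Up-E-C_pretrain_flowers_10p.py",
        work_dir="work_dirs/flowers_10p",  # made-up output directory
    )
    print(" ".join(cmd))
    print(env_with_repo_on_pythonpath()["PYTHONPATH"])
```

The result can be passed to `subprocess.run(cmd, env=env, check=True)`. Using `os.pathsep` rather than a hard-coded `:` keeps the PYTHONPATH handling valid on Windows as well.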

Detailed Tutorials

Currently, we provide tutorials for users.

Benchmark (with hyperparameter search)

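All results use 10% of the training data. Column groups: CLS = Cifar10 through Kinetics700; DET = VOC07+12, WIDER FACE, CityPersons; SEG = VOC2012; DEP = KITTI, NYUv2.

| Stage | Model | Cifar10 | Cifar100 | Food | Pets | Flowers | Sun | Cars | Dtd | Caltech | Aircraft | Svhn | Eurosat | Resisc45 | Retinopathy | Fer2013 | Ucf101 | Gtsrb | Pcam | Imagenet | Kinetics700 | VOC07+12 | WIDER FACE | CityPersons | VOC2012 | KITTI | NYUv2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Up-A | R50 | 92.4 | 73.5 | 75.8 | 85.7 | 94.6 | 57.9 | 52.7 | 65.0 | 88.5 | 28.7 | 61.4 | 93.8 | 82.9 | 73.8 | 55.0 | 71.1 | 75.1 | 82.9 | 71.9 | 35.2 | 76.3 | 90.3/88.3/70.7 | 24.6/59.0 | 62.54 | 3.181 | 0.456 |
| Up-A | MN-B4 | 96.1 | 82.9 | 84.3 | 89.8 | 98.3 | 66.0 | 61.4 | 66.8 | 92.8 | 32.5 | 60.4 | 92.7 | 85.8 | 75.6 | 56.5 | 76.9 | 74.4 | 84.3 | 77.2 | 39.4 | 74.9 | 89.3/87.6/71.4 | 26.5/61.8 | 65.71 | 3.565 | 0.482 |
| Up-A | MN-B15 | 98.2 | 87.8 | 93.9 | 92.8 | 99.6 | 72.3 | 59.4 | 70.0 | 93.8 | 64.8 | 58.6 | 95.3 | 91.9 | 77.9 | 62.8 | 85.4 | 76.2 | 87.8 | 86.0 | 52.9 | 78.4 | 93.6/91.8/77.2 | 17.7/49.5 | 60.68 | 2.423 | 0.383 |
| Up-E | C-R50 | 91.9 | 71.2 | 80.7 | 88.8 | 94.0 | 57.4 | 67.9 | 62.7 | 85.5 | 73.9 | 57.6 | 93.7 | 83.6 | 75.4 | 54.1 | 69.6 | 73.9 | 85.7 | 72.5 | 34.6 | 72.2 | 89.7/87.6/68.1 | 22.4/58.3 | 57.66 | 3.214 | 0.501 |
| Up-E | D-R50 | 86.4 | 57.3 | 53.9 | 31.4 | 44.0 | 39.8 | 8.6 | 44.6 | 72.5 | 15.8 | 64.2 | 89.1 | 72.8 | 73.6 | 46.6 | 57.4 | 67.5 | 81.7 | 45.0 | 25.2 | 87.7 | 93.8/92.0/75.5 | 15.8/41.5 | 62.3 | 3.09 | 0.45 |
| Up-E | S-R50 | 78.3 | 46.6 | 45.1 | 24.2 | 33.9 | 38.0 | 5.0 | 41.4 | 50.2 | 8.5 | 51.5 | 89.9 | 76.4 | 74.0 | 44.8 | 42.0 | 64.0 | 80.8 | 34.9 | 19.7 | 75.0 | 87.4/85.7/66.4 | 19.6/53.3 | 71.9 | 3.12 | 0.45 |
| Up-E | C-MN-B4 | 96.7 | 83.2 | 89.2 | 91.9 | 98.2 | 66.7 | 67.7 | 66.3 | 91.9 | 77.2 | 57.8 | 94.4 | 88.0 | 77.0 | 56.6 | 78.5 | 77.3 | 85.6 | 80.5 | 44.2 | 73.7 | 89.6/88.0/71.1 | 30.3/65.0 | 65.8 | 3.54 | 0.46 |
| Up-E | D-MN-B4 | 91.5 | 67.0 | 61.4 | 44.4 | 57.2 | 41.8 | 12.1 | 41.2 | 80.6 | 25.1 | 68.0 | 90.7 | 74.6 | 74.3 | 50.3 | 61.7 | 74.2 | 81.9 | 57.0 | 29.3 | 89.3 | 94.6/92.6/76.5 | 14.0/43.8 | 73.1 | 3.05 | 0.40 |
| Up-E | S-MN-B4 | 83.5 | 57.2 | 68.3 | 70.8 | 85.8 | 52.9 | 25.9 | 52.8 | 81.6 | 17.7 | 56.1 | 91.3 | 83.6 | 74.5 | 49.0 | 55.2 | 68.0 | 84.3 | 61.0 | 27.4 | 78.7 | 89.5/87.9/71.4 | 19.4/53.0 | 79.6 | 3.06 | 0.41 |
| Up-E | C-MN-B-15 | 98.7 | 90.1 | 94.7 | 95.1 | 99.7 | 75.7 | 74.9 | 73.6 | 94.4 | 91.8 | 66.7 | 96.2 | 92.8 | 77.6 | 62.3 | 87.7 | 83.3 | 87.5 | 87.2 | 54.7 | 80.4 | 93.2/91.4/75.7 | 29.5/59.9 | 70.6 | 2.63 | 0.37 |
| Up-E | D-MN-B-15 | 92.2 | 67.9 | 69.0 | 33.9 | 59.5 | 45.4 | 13.8 | 46.3 | 82.0 | 26.6 | 65.4 | 90.1 | 79.1 | 76.0 | 53.2 | 63.7 | 74.4 | 83.3 | 62.2 | 33.7 | 89.4 | 95.8/94.4/80.1 | 10.5/42.4 | 77.2 | 2.72 | 0.37 |
| Up-G | R50 | 92.9 | 73.7 | 81.1 | 88.9 | 94.0 | 58.6 | 68.6 | 63.0 | 86.1 | 74.0 | 57.9 | 94.4 | 84.0 | 75.7 | 54.3 | 70.8 | 74.3 | 85.9 | 72.6 | 34.8 | 87.7 | 93.9/92.2/77.0 | 14.7/46.0 | 66.19 | 2.835 | 0.39 |
| Up-G | MN-B4 | 96.7 | 83.9 | 89.2 | 92.1 | 98.2 | 66.7 | 67.7 | 66.5 | 91.9 | 77.2 | 57.8 | 94.4 | 88.0 | 77.0 | 57.1 | 79 | 77.7 | 86 | 80.5 | 44.2 | 89.1 | 94.9/92.8/76.5 | 12.0/50.5 | 71.4 | 2.94 | 0.40 |
| Up-G | MN-B15 | 98.7 | 90.4 | 94.5 | 95.4 | 99.7 | 74.4 | 75.4 | 74.2 | 94.5 | 91.8 | 66.7 | 96.3 | 92.7 | 77.9 | 63.1 | 88 | 83.6 | 88 | 87.1 | 54.7 | 89.8 | 95.9/94.2/79.6 | 10.5/41.3 | 77.3 | 2.71 | 0.37 |
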
Comments
  • Reproduce Issue: Up-A CIFAR-100, Flowers 102

    Hi,

    When I reproduce the linear-probe classification performance on CIFAR-100, Flowers 102, I got weird results when Up-A R50 was used for the backbone.

    In the paper, Up-A R50 outperforms ImageNet pre-trained R50, however, ImageNet pre-trained R50 outperforms Up-A R50 on CIFAR-100, Flowers 102 cases with large margins.

    My configurations are as below:

    Model: [image]

    Dataset: [image] [image]

    Others: [image]

    I checked whether the pre-trained Up-A was successfully loaded or not.

    Thanks in advance.

    opened by dev-sungman 3
  • 2 questions about the INTERN

    Hello,

    First, thanks for releasing this great work !

    I have 2 questions about the INTERN. The questions are as follows:

    1. What's the difference between UP-G det and UP-G cls ?

      • according to the paper, there was a single benchmark score for UP-G.
      • however, in the released model and configuration, I could find the UP-G cls, det model.
    2. Is the backbone frozen during expert and general training?

      • Thanks to the amateur training, the backbone may already have a strong representation. So I think we don't have to train the backbone in the amateur and expert sessions. Is that right?
      • How can I check the detailed information about the training for each stage?

    Thanks in advance.

    opened by dev-sungman 2
  • Problems about reproducing INTERN-r50-Up-G-dbn

    Thanks for your great work.

    My reproduced linear-probing result for INTERN-r50-Up-G-dbn on the full VOC07+12 dataset is only 77.5, which is much lower than the paper's result of 87.7 obtained with only 10% of the data.

    My exp info:

    1. INTERN-r50-Up-G-dbn-a4040c9c4.pth.tar is used as the pretrained model, downloaded from the website: https://opengvlab.shlab.org.cn/models
    2. The file in this repo "./configs/det/linear_probe/faster_rcnn/central_mnb4_fpn_Up-G-D_pretrain_voc0712_10p.py" is used in my exps with two changes:
       (a) _base_ = [..., '../../base/datasets/voc0712.py', ...]
       (b) model = dict(backbone=dict(init_cfg=dict(type='Pretrained', checkpoint='./INTERN-r50-Up-G-dbn-a4040c9c4.pth.tar')))
    3. The command used: bash ./tools/dist_train.sh configs/det/linear_probe/faster_rcnn/central_r50_fpn_Up-G-D_pretrain_voc0712_10p.py 8

    I have no idea what may cause this.

    Part of my training log is as follows:

    INFO - Set random seed to 1962424783, deterministic: False
    INFO - initialize Central_Model with init_cfg {'type': 'Pretrained', 'checkpoint': './INTERN-r50-Up-G-dbn-a4040c9c4.pth.tar'}
    INFO - load model from: ./INTERN-r50-Up-G-dbn-a4040c9c4.pth.tar
    INFO - load checkpoint from local path: ./INTERN-r50-Up-G-dbn-a4040c9c4.pth.tar
    WARNING - The model and loaded state dict do not match exactly

    unexpected key in source state_dict: lateral_convs.0.conv.weight, lateral_convs.0.conv.bias, fpn_convs.0.conv.weight, fpn_convs.0.conv.bias, lateral_convs.1.conv.weight, lateral_convs.1.conv.bias, fpn_convs.1.conv.weight, fpn_convs.1.conv.bias, lateral_convs.2.conv.weight, lateral_convs.2.conv.bias, fpn_convs.2.conv.weight, fpn_convs.2.conv.bias, lateral_convs.3.conv.weight, lateral_convs.3.conv.bias, fpn_convs.3.conv.weight, fpn_convs.3.conv.bias, rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, roi_head.bbox_head.shared_fcs.0.weight, roi_head.bbox_head.shared_fcs.0.bias, roi_head.bbox_head.shared_fcs.1.weight, roi_head.bbox_head.shared_fcs.1.bias

    INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
    INFO - initialize RPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01}
    INFO - initialize Shared2FCBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'distribution': 'uniform', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}]

    Results:

    +-------------+------+-------+--------+-------+
    | class       | gts  | dets  | recall | ap    |
    +-------------+------+-------+--------+-------+
    | aeroplane   | 285  | 787   | 0.923  | 0.836 |
    | bicycle     | 337  | 992   | 0.941  | 0.851 |
    | bird        | 459  | 1389  | 0.900  | 0.769 |
    | boat        | 263  | 1504  | 0.863  | 0.688 |
    | bottle      | 469  | 1951  | 0.842  | 0.661 |
    | bus         | 213  | 812   | 0.920  | 0.810 |
    | car         | 1201 | 3952  | 0.959  | 0.868 |
    | cat         | 358  | 1142  | 0.958  | 0.864 |
    | chair       | 756  | 4869  | 0.862  | 0.598 |
    | cow         | 244  | 794   | 0.959  | 0.857 |
    | diningtable | 206  | 1409  | 0.922  | 0.725 |
    | dog         | 489  | 1612  | 0.973  | 0.849 |
    | horse       | 348  | 1035  | 0.957  | 0.878 |
    | motorbike   | 325  | 1032  | 0.923  | 0.836 |
    | person      | 4528 | 14008 | 0.938  | 0.835 |
    | pottedplant | 480  | 2511  | 0.802  | 0.503 |
    | sheep       | 242  | 800   | 0.884  | 0.760 |
    | sofa        | 239  | 1301  | 0.933  | 0.742 |
    | train       | 282  | 1033  | 0.908  | 0.812 |
    | tvmonitor   | 308  | 1388  | 0.899  | 0.757 |
    +-------------+------+-------+--------+-------+
    | mAP         |      |       |        | 0.775 |
    +-------------+------+-------+--------+-------+

    2022-03-24 09:57:06,094 - mmdet - INFO - Exp name: central_r50_fpn_Up-G-D_pretrain_voc0712.py
    2022-03-24 09:57:06,094 - mmdet - INFO - Epoch(val) [12][619] mAP: 0.7750, AP50: 0.7750

    the full logs here: voc.log

    opened by linfeng93 1
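
Regarding the "unexpected key" warning in the report above: one way to see which parts of a checkpoint a model does not consume is to compare the two key sets and group the leftovers by their top-level prefix. This is a minimal sketch on plain lists of key names. The helper names are hypothetical, and the backbone keys in the example (`conv1.weight`, `layer1.0.conv1.weight`) are invented for illustration and are not read from the released checkpoint.

```python
from collections import Counter


def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return keys present on only one side, sorted for stable output."""
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),      # model expects, checkpoint lacks
        "unexpected": sorted(checkpoint_keys - model_keys),   # checkpoint has, model ignores
    }


def count_by_prefix(keys):
    """Count keys per top-level module name (text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


if __name__ == "__main__":
    checkpoint = [
        "conv1.weight",                # invented backbone key
        "fpn_convs.0.conv.weight",     # key names of this form appear in the log above
        "fpn_convs.0.conv.bias",
        "rpn_head.rpn_conv.weight",
    ]
    model = ["conv1.weight", "layer1.0.conv1.weight"]
    report = diff_state_dict_keys(checkpoint, model)
    print(report["missing"])
    print(dict(count_by_prefix(report["unexpected"])))
```

If the unexpected keys are all neck and head modules, as in the log above, the warning by itself does not show that the backbone weights failed to load. An empty "missing" list for the backbone would be the more informative check.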
  • ModuleNotFoundError: No module named 'gvbenchmark'

    CMD RUN: mim train mmcls configs/cls/linear_probe/mnb4_Up-E-C_pretrain_flowers_10p.py

    LOG:

    Training command is python c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcls\.mim\tools\train.py configs/cls/linear_probe/mnb4_Up-E-C_pretrain_flowers_10p.py --gpus 1 --launcher none.
    Traceback (most recent call last):
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcv\utils\misc.py", line 73, in import_modules_from_strings
        imported_tmp = import_module(imp)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\importlib\__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'gvbenchmark'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcls\.mim\tools\train.py", line 203, in <module>
        main()
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcls\.mim\tools\train.py", line 92, in main
        cfg = Config.fromfile(args.config)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcv\utils\config.py", line 337, in fromfile
        import_modules_from_strings(**cfg_dict['custom_imports'])
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcv\utils\misc.py", line 80, in import_modules_from_strings
        raise ImportError
    ImportError
    Traceback (most recent call last):
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\Users\xukang\miniconda3\envs\torch1.7.0\Scripts\mim.exe\__main__.py", line 7, in <module>
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\click\core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\click\core.py", line 782, in main
        rv = self.invoke(ctx)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\click\core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\click\core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\click\core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mim\commands\train.py", line 107, in cli
        other_args=other_args)
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mim\commands\train.py", line 256, in train
        cmd, env=dict(os.environ, MASTER_PORT=str(port)))
      File "c:\users\xukang\miniconda3\envs\torch1.7.0\lib\subprocess.py", line 363, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['python', 'c:\users\xukang\miniconda3\envs\torch1.7.0\lib\site-packages\mmcls\.mim\tools\train.py', 'configs/cls/linear_probe/mnb4_Up-E-C_pretrain_flowers_10p.py', '--gpus', '1', '--launcher', 'none']' returned non-zero exit status 1.

    opened by KangolHsu 1
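
A possible reading of the traceback above is that the config's custom imports could not find the `gvbenchmark` package because the repo root was not on the Python import path; the reported command omits the `PYTHONPATH='.'` prefix used in the Usage section. This is an assumption, not a confirmed diagnosis. The sketch below checks whether a module can be found, optionally with an extra directory on the path. The helper name `is_importable` is hypothetical.

```python
import importlib
import importlib.util
import sys


def is_importable(module_name, extra_path=None):
    """Report whether a top-level module can be located.

    If `extra_path` is given, it is placed on sys.path only for the
    duration of the check, which imitates a PYTHONPATH entry.
    """
    added = False
    if extra_path is not None and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
        added = True
    try:
        importlib.invalidate_caches()  # notice directories created recently
        return importlib.util.find_spec(module_name) is not None
    finally:
        if added:
            sys.path.remove(extra_path)


if __name__ == "__main__":
    # Run from the repo root; "." stands in for PYTHONPATH='.'.
    print(is_importable("gvbenchmark", extra_path="."))
```

On Windows, the `VAR=value command` prefix syntax from the Usage section is not understood by cmd.exe, so the variable would need to be set separately (for example with `set PYTHONPATH=.`) before calling `mim train`.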
  • How to load the released MetaNet-B4-Up-G

    When I load the released model with the script: data = pickle.load(open('data.pkl', 'rb'))

    the following error is encountered: UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.

    and the version of pytorch is 1.10.2

    opened by wangzerong 1
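
The error quoted above is what the standard `pickle` module raises when a stream contains persistent IDs and the loader does not define `persistent_load`. PyTorch checkpoints store tensor storages that way, so the `data.pkl` member of a checkpoint archive is presumably not meant to be read with plain `pickle.load`. Loading the original checkpoint file with `torch.load` is the usual route, although this was not verified against the released model. The sketch below reproduces the mechanism with the standard library only. The `Storage` class and the lookup table are stand-ins invented for illustration.

```python
import io
import pickle


class Storage:
    """Stand-in for an object that lives outside the pickle stream."""

    def __init__(self, key):
        self.key = key


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Storages are written as a reference instead of being serialized.
        if isinstance(obj, Storage):
            return ("storage", obj.key)
        return None  # everything else is pickled normally


class RefUnpickler(pickle.Unpickler):
    def __init__(self, file, table):
        super().__init__(file)
        self.table = table

    def persistent_load(self, pid):
        # Resolve the reference written by RefPickler.persistent_id.
        kind, key = pid
        return self.table[key]


def dump_with_refs(obj):
    buffer = io.BytesIO()
    RefPickler(buffer, protocol=2).dump(obj)
    return buffer.getvalue()


def plain_load_fails(payload):
    """Return True if plain pickle.loads rejects the payload."""
    try:
        pickle.loads(payload)
    except pickle.UnpicklingError:
        return True
    return False


if __name__ == "__main__":
    payload = dump_with_refs({"weight": Storage("0")})
    print(plain_load_fails(payload))
    restored = RefUnpickler(io.BytesIO(payload), {"0": Storage("0")}).load()
    print(restored["weight"].key)
```

The payload can only be restored by a loader that knows how to resolve the references, which is the role `torch.load` plays for checkpoint archives.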