Codebase for Image Classification Research, written in PyTorch.

Facebook Research

Last update: Jan 1, 2023


Overview

pycls

pycls is an image classification codebase, written in PyTorch. It was originally developed for the On Network Design Spaces for Visual Recognition project. pycls has since matured and been adopted by a number of projects at Facebook AI Research.

pycls provides a large set of baseline models across a wide range of flop regimes.

Introduction

The goal of pycls is to provide a simple and flexible codebase for image classification. It is designed to support rapid implementation and evaluation of research ideas. pycls also provides a large collection of baseline results (Model Zoo). The codebase supports efficient single-machine multi-gpu training, powered by the PyTorch distributed package, and provides implementations of standard models including ResNet, ResNeXt, EfficientNet, and RegNet.

Using pycls

Please see GETTING_STARTED for brief installation instructions and basic usage examples.

Model Zoo

We provide a large set of baseline results and pretrained models available for download in the pycls Model Zoo, including the simple, fast, and effective RegNet models that we hope can serve as solid baselines across a wide range of flop regimes.

Sweep Code

The pycls codebase now provides powerful support for studying design spaces and, more generally, population statistics of models, as introduced in On Network Design Spaces for Visual Recognition and Designing Network Design Spaces. The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for quite powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see SWEEP_INFO for details.
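
One population statistic used in the two cited papers is the error empirical distribution function (EDF): the fraction of sampled models whose error falls below a given threshold. The following is a minimal sketch of that statistic only, not the pycls sweep code; the function name `error_edf` and the error values are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def error_edf(errors: Sequence[float], thresholds: Sequence[float]) -> List[float]:
    """Fraction of models whose error is strictly below each threshold."""
    n = len(errors)
    if n == 0:
        raise ValueError("need at least one model error")
    return [sum(1 for e in errors if e < t) / n for t in thresholds]


# Hypothetical top-1 errors (in percent) for a small population of models.
population_errors = [38.2, 41.5, 36.9, 44.0, 39.7]
edf = error_edf(population_errors, thresholds=[37.0, 40.0, 45.0])
print(edf)
```

Comparing two design spaces then amounts to comparing their EDF curves rather than their single best models.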

Projects

A number of projects at FAIR have been built on top of pycls:

If you are using pycls in your research and would like to include your project here, please let us know or send a PR.

Citing pycls

If you find pycls helpful in your research or refer to the baseline results in the Model Zoo, please consider citing an appropriate subset of the following papers:

@InProceedings{Radosavovic2019,
  title = {On Network Design Spaces for Visual Recognition},
  author = {Ilija Radosavovic and Justin Johnson and Saining Xie and Wan-Yen Lo and Piotr Doll{\'a}r},
  booktitle = {ICCV},
  year = {2019}
}

@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}

@InProceedings{Dollar2021,
  title = {Fast and Accurate Model Scaling},
  author = {Piotr Doll{\'a}r and Mannat Singh and Ross Girshick},
  booktitle = {CVPR},
  year = {2021}
}

License

pycls is released under the MIT license. Please see the LICENSE file for more information.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.

Comments

time_model.py gives different results to those in model_zoo

Hi - I appreciate there's already an open issue related to speed, but mine is slightly different.

When I run python tools/time_net.py --cfg configs/dds_baselines/regnetx/RegNetX-1.6GF_dds_8gpu.yaml having changed GPUS: from 8 to 1, I get the following dump. I am running this on a batch of size 64, with input resolution 224x224, on a V100, as stated in the paper.

This implies a forward pass of ~62ms, not the 33ms stated in MODEL_ZOO. Have I done something wrong? Not sure why the times are so different. The other numbers (acts, params, flops) all seem fine. The latency differences are seen for other models as well - here is 800MF (39ms vs model zoo's 21ms):

I am using commit a492b56f580d43fb4e003eabda4373b25b4bedec rather than the latest version of the repo (MODEL_ZOO has not been changed since before this commit), because it is useful to be able to time the models on dummy data rather than having to construct a dataset. Would it be possible to have an option to do this? I can open a separate issue as a feature request for consideration if necessary.

opened by Ushk 9
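
For readers comparing timings like the ones in the issue above, here is a minimal sketch of how a timing loop on dummy data is commonly structured. It is not the code in tools/time_net.py and does not explain the reported discrepancy. The helper name `time_callable` and its parameters are hypothetical. For a GPU model, `fn` would be a forward pass on a dummy batch and `sync` would typically be `torch.cuda.synchronize`, since CUDA kernels launch asynchronously and the clock would otherwise stop before the work finishes.

```python
import time
from typing import Callable, Optional


def time_callable(
    fn: Callable[[], object],
    warmup_iters: int = 3,
    timed_iters: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock time per call of fn, in seconds."""
    if timed_iters <= 0:
        raise ValueError("timed_iters must be positive")
    # Warm-up calls are excluded so one-time setup cost does not skew the mean.
    for _ in range(warmup_iters):
        fn()
    if sync is not None:
        sync()  # wait for queued asynchronous work before starting the clock
    start = time.perf_counter()
    for _ in range(timed_iters):
        fn()
    if sync is not None:
        sync()  # wait again so the measured interval covers all launched work
    return (time.perf_counter() - start) / timed_iters


# Stand-in CPU workload; a real measurement would call the model here.
mean_s = time_callable(lambda: sum(i * i for i in range(10_000)))
print(f"{mean_s * 1e3:.3f} ms per call")
```

Differences in warm-up, synchronization, batch size, or precision between two such loops can change the reported latency, so those settings are worth checking when numbers disagree.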
would you tell us how to prepare imagenet dataset?

Hi, After going through the code, I noticed this line: https://github.com/facebookresearch/pycls/blob/cd1cfb185ab5ebef328e2c3a38f68112bbd43712/pycls/datasets/imagenet.py#L55

It seems that the ImageNet val dataset does not have its images stored in separate subdirectories, as the train set does. Why is the dataset implemented like this? Could you please tell us how to prepare the ImageNet dataset so that we can reproduce the results in the Model Zoo?

opened by CoinCheung 6
Use model without Internet access

Is it possible to use pycls models without Internet access? I'm using pretrained=False parameter and load weights manually, but I'm still getting URLError: <urlopen error [Errno -3] Temporary failure in name resolution>.

opened by atamazian 5
add test_net.py
Adding test_net.py to evaluate a trained model. Example command:

python tools/test_net.py \
    --cfg configs/baselines/imagenet/R-50-1x64d_bs32_1gpu.yaml \
    TRAIN.START_CHECKPOINT save/resnet50/checkpoints/model_epoch_0096.pyth \
    TEST.BATCH_SIZE 256

I wonder if we want to import duplicated functions from train.py or not. I isolate them for now.
CLA Signed Merged
opened by felixgwu 5
question about ema alpha setting

Hi, thanks for your wonderful repo. In your code of update_model_ema https://github.com/facebookresearch/pycls/blob/ee770af5b55cd1959e71af73bf9d5b7d7ac10dc3/pycls/core/net.py#L101-L114

I notice that you use the expression adjust = cfg.TRAIN.BATCH_SIZE / cfg.OPTIM.MAX_EPOCH * update_period to modify the alpha value. What is the intuition behind this? If there is a paper that describes it, could you please point me to it?

Thanks : )

opened by FateScript 4
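
The adjustment asked about in the issue above can be illustrated numerically. This is a minimal sketch, not the pycls implementation: only the `adjust` expression comes from the issue, while the clamp to 1.0, the blending rule, the function names, and all numeric values are assumptions made for illustration. One possible reading, not confirmed by the authors here, is that the number of EMA updates in a run grows with MAX_EPOCH and shrinks with BATCH_SIZE and update_period, so scaling alpha the opposite way keeps the averaging window at a roughly fixed fraction of training.

```python
from typing import Dict


def adjusted_alpha(base_alpha: float, batch_size: int, max_epoch: int, update_period: int) -> float:
    """Scale a base EMA coefficient by the factor quoted in the issue."""
    # Expression from the issue: BATCH_SIZE / MAX_EPOCH * update_period.
    adjust = batch_size / max_epoch * update_period
    # Assumption: clamp to 1.0 so the blend below stays a convex combination.
    return min(1.0, base_alpha * adjust)


def ema_step(ema: Dict[str, float], current: Dict[str, float], alpha: float) -> None:
    """In-place EMA update: ema = (1 - alpha) * ema + alpha * current."""
    for name, value in current.items():
        ema[name] = (1.0 - alpha) * ema[name] + alpha * value


# Hypothetical settings, not taken from any pycls config.
alpha = adjusted_alpha(base_alpha=1e-5, batch_size=1024, max_epoch=100, update_period=32)
ema_weights = {"w": 0.0}
ema_step(ema_weights, {"w": 1.0}, alpha)
print(alpha, ema_weights["w"])
```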
Sweep code for studying model population stats (1 of 2)

This is a major update and introduces powerful new functionality to pycls.

The pycls codebase now provides powerful support for studying design spaces and, more generally, population statistics of models, as introduced in On Network Design Spaces for Visual Recognition and Designing Network Design Spaces. The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for quite powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see SWEEP_INFO for details.

This is commit 1 of 2 for the sweep code. It is focused on the sweep config, setting up the sweep, and launching it.

Co-authored-by: Raj Prateek Kosaraju
Co-authored-by: Piotr Dollar
CLA Signed Merged

opened by rajprateek 4
Sweep code for studying model population stats

This is a major update and introduces powerful new functionality to pycls.

The pycls codebase now provides powerful support for studying design spaces and, more generally, population statistics of models, as introduced in On Network Design Spaces for Visual Recognition and Designing Network Design Spaces. The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for quite powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see SWEEP_INFO for details.
CLA Signed

opened by theschnitz 4
Exponential Moving Average of Weights (EMA)

EMA as used in "Fast and Accurate Model Scaling" to improve accuracy. Note that EMA of model weights is nearly free computationally (if not computed every iter), hence EMA weights are always computed/stored. Saving/loading checkpoints has been updated, but the code is backward compatible with checkpoints that do not store the EMA weights.

Details:
- config.py: added EMA options
- meters.py: generalized to allow for ema meter
- net.py: added update_model_ema() to compute model ema
- trainer.py: added updating/testing/logging of ema model
- checkpoint.py: save/load_checkpoint() also save/load ema weights
CLA Signed Merged

opened by pdollar 4
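
The backward compatibility mentioned in the PR above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in checkpoint.py: the dictionary keys `model_state` and `ema_state`, the helper name `split_checkpoint`, and the fallback of reusing the plain weights are all hypothetical.

```python
from typing import Any, Dict, Tuple


def split_checkpoint(checkpoint: Dict[str, Any]) -> Tuple[Any, Any]:
    """Return (model_state, ema_state), tolerating checkpoints without EMA weights."""
    model_state = checkpoint["model_state"]
    # Assumption: an older checkpoint simply lacks the EMA entry, in which
    # case the plain model weights are reused to initialize the EMA model.
    ema_state = checkpoint.get("ema_state", model_state)
    return model_state, ema_state


old_ckpt = {"model_state": {"w": 1.0}}
new_ckpt = {"model_state": {"w": 1.0}, "ema_state": {"w": 0.9}}
print(split_checkpoint(old_ckpt))
print(split_checkpoint(new_ckpt))
```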
Model scaling in "Fast and Accurate Model Scaling"

See GETTING_STARTED.md for example usage.

Summary:
- paper reference: https://arxiv.org/abs/2103.06877
- regnet.py: added regnet_cfg_to_anynet_cfg()
- scaler.py: implements model scaler
- scale_net.py: entry point for model scaler
- GETTING_STARTED.md: added example usage for scaler
CLA Signed Merged

opened by pdollar 4
Plan to support the design space comparison

Hi @rajprateek , @ir413 , Thanks for your team's great work, it provides many insights to the community. I am sure that the model zoos and the current codebase could inspire future research a lot.

I am also curious about the future plans for your codebase. Do you have any plans to support design space comparison in this repo? For example, allowing users to sample and train models from different design spaces and compare those design spaces as described in Sec. 3.1 and shown in Fig. 5, 7, and 9 of the paper. I think this feature would help the community reproduce the comparison process and further increase this codebase's impact.
enhancement

opened by ZwwWayne 4
How to sample models for Figure 11 in RegNet paper

Hi, I noticed that 100 models are sampled to get the results as shown in Figure 11. (sec 4).

However, as the flops in the figure span a wide range(0.2B~12.8B), I don't know whether

the total number of models in all the flops regime is 100, or

for each of the flops regime, you sampled 100 models?

opened by ShoufaChen 4
RuntimeError: Cannot re-initialize CUDA in forked subprocess.

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

how to solve this problem?

an answer says: torch.multiprocessing.set_start_method('spawn') but where to add this line?

opened by pustar 1
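
On the question of where the line goes: the start method generally has to be chosen once, in the entry-point script, before any worker process is created and before CUDA is initialized. The following is a minimal sketch using the standard library `multiprocessing` module, which is an assumption for the sake of a self-contained example; `torch.multiprocessing` exposes the same `set_start_method` call. The function names are hypothetical, and this is not a patch to pycls itself.

```python
import multiprocessing as mp


def configure_start_method(method: str = "spawn") -> str:
    """Select the process start method before any worker process is created."""
    # force=True avoids a RuntimeError if a start method was already chosen.
    mp.set_start_method(method, force=True)
    return mp.get_start_method()


def main() -> None:
    chosen = configure_start_method("spawn")
    print(f"start method: {chosen}")
    # Only after this point would CUDA be initialized and worker
    # processes be launched (for example by a training entry point).


if __name__ == "__main__":
    # The guard matters: with 'spawn', child processes re-import this module,
    # and the guard keeps them from re-running main().
    main()
```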
Integration of other data types

Does this framework support the integration of other data types? For example, if I want to input data other than images, can I easily extend the framework to do this?

opened by mdanb 0
url download error

Environment: Windows 10, Python 3.8. When downloading the yaml and pyth files, I get "urllib.error.HTTPError: HTTP Error 404: Not Found".

It turned out that when the URL is concatenated with

config_url = os.path.join(_URL_CONFIGS, _MODEL_ZOO_CONFIGS[name])

the separator added to the URL on Windows is '\', which causes this error.

To fix it in my environment: config_url = _URL_CONFIGS + '/' + _MODEL_ZOO_CONFIGS[name], or as implemented in https://stackoverflow.com/questions/8223939/how-to-join-absolute-and-relative-urls

opened by jjxyai 0
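
The failure described in the issue above can be reproduced on any platform, because `ntpath` is the module that `os.path` resolves to on Windows. This is a minimal sketch: the helper `join_url`, the base URL, and the config name are hypothetical stand-ins for `_URL_CONFIGS` and an entry of `_MODEL_ZOO_CONFIGS`, not values from pycls.

```python
import ntpath
import posixpath


def join_url(base: str, relative: str) -> str:
    """Join URL parts with '/', independent of the host OS path separator."""
    return base.rstrip("/") + "/" + relative.lstrip("/")


# Hypothetical stand-ins for _URL_CONFIGS and one _MODEL_ZOO_CONFIGS entry.
base_url = "https://example.com/configs"
config_name = "regnetx/model.yaml"

# What os.path.join does on Windows: a backslash ends up inside the URL.
windows_joined = ntpath.join(base_url, config_name)
# Two OS-independent alternatives that always use forward slashes.
fixed = join_url(base_url, config_name)
posix_joined = posixpath.join(base_url, config_name)

print(windows_joined)
print(fixed)
print(posix_joined)
```

Plain string concatenation with '/' keeps the behavior identical on every operating system, which is why it works as a fix here.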
How to Pick Best Model in RegNetX?

Hi, the RegNet paper [1] mentions that it picks the best of 25 random models (Section 5) as the final result. However, I cannot find the settings used to train these 25 random models. Are they the same settings used to obtain the EDFs shown in Figure 9?

[1] https://arxiv.org/pdf/2003.13678.pdf

opened by LicharYuan 0

Releases(0.2)

0.2(May 21, 2021)

This is a major update and introduces powerful new functionality to pycls.

The pycls codebase now provides powerful support for studying design spaces and, more generally, population statistics of models, as introduced in On Network Design Spaces for Visual Recognition and Designing Network Design Spaces. The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for quite powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see SWEEP_INFO for details.

This code was co-authored by Piotr Dollar (@pdollar) and Raj Prateek Kosaraju (@rajprateek).
0.1(Apr 15, 2020)
We have added a large set of baseline results and pretrained models available for download in the pycls Model Zoo, including the simple, fast, and effective RegNet models that we hope can serve as solid baselines across a wide range of flop regimes.

New features included in this release:

Cache model weight URLs provided in configs locally | 4e470e21ad55aff941f1098666f70ef982626f7a

Allow optional weight decay fine-tuning for BN params, changes the default BN weight decay | 4e470e21ad55aff941f1098666f70ef982626f7a

Support Squeeze & Excitation in AnyNet | ec188632fab607a41bff1a63c7ac9bd0d949bb52

RegNet model abstraction | 4f5b5dafe4f4274f7cba774923b8d1ba81d9904c, 708d429e67b1a43dfd7e22cb754ecaaf234ba308, 24a2805f1040787a047ed0339925adf98109e64c

Run precise time in test_net and count acts for a model | ad01b81c516e59e134bfde22ab827aa61744b83c

Other changes:

Updated README | 4acac2b06945f6b0722f20748e2104bd6a771468

New Model Zoo | 4acac2b06945f6b0722f20748e2104bd6a771468, 4f5b5dafe4f4274f7cba774923b8d1ba81d9904c

New configs for models from Designing Network Design Spaces paper | 44bcfa71ebd7b9cd8e81f4d28ce319e814cbbb73, e9c1ef4583ef089c182dd7fe9823db063a4909e3


Owner

Facebook Research

