Torchrecipes provides a set of reproducible, reusable, ready-to-run recipes for training different types of models, across multiple domains, on PyTorch Lightning.

torchrecipes

This library is currently under heavy development. If you have suggestions on the API or use cases you'd like covered, please open a GitHub issue or reach out. We'd love to hear about how you're using torchrecipes.

torchrecipes is a prototype built on top of PyTorch. It provides a set of reproducible, reusable, ready-to-run recipes for training different types of models, across multiple domains, on PyTorch Lightning.

It aims to provide "applications" built on top of PyTorch that perform well and are easy to reproduce. Because this project builds on the PyTorch ecosystem and requires significant investment, we'd love to hear from and work with early adopters to shape the design. Please reach out on the issue tracker if you're interested in using this for your project.

Why torchrecipes?

The primary goal of torchrecipes is to 10x ML development by providing standard blueprints to easily train production-ready ML models across environments (from local development to cluster deployment).

Requirements

PyTorch Recipes (torchrecipes):

  • python3 (3.8+)
  • torch
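
The Python floor matters in practice: an installation report further down this page shows setup.py failing on Python 3.7 with a SyntaxError at an assignment expression (`:=`), which is only valid from Python 3.8 onward. A minimal sketch of a pre-install check follows; `check_python` and `MIN_PYTHON` are hypothetical names used for illustration and are not part of torchrecipes.

```python
import sys

# Assumed minimum, taken from the "python3 (3.8+)" requirement above.
MIN_PYTHON = (3, 8)


def check_python(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not check_python():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise SystemExit(f"torchrecipes needs Python 3.8+, found {found}")
    print("Python version OK")
```

Running this before `pip install` gives a readable message instead of a SyntaxError raised from inside the build backend.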

Running

The easiest way to run torchrecipes is to use torchx. You can install it directly (if not already included as part of our requirements.txt) with:

pip install torchx

Then go to torchrecipes/launcher/ and create a file torchx_app.py:

# 'torchrecipes/launcher/torchx_app.py'

import torchx.specs as specs

image_classification_args = [
    "-m", "run",
    "--config-name",
    "train_app",
    "--config-path",
    "torchrecipes/vision/image_classification/conf",
]

def torchx_app(image: str = "run.py:latest", *job_args: str) -> specs.AppDef:
    return specs.AppDef(
        name="run",
        roles=[
            specs.Role(
                name="run",
                image=image,
                entrypoint="python",
                args=[*image_classification_args, *job_args],
                env={
                    "CONFIG_MODULE": "torchrecipes.vision.image_classification.conf",
                    "MODE": "prod",
                    "HYDRA_FULL_ERROR": "1",
                }
            )
        ],
    )

This app defines the entrypoint, args and image for launching.
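
To see what this app definition amounts to, the sketch below rebuilds the command line that the role describes, without importing torchx. It assumes the scheduler runs the entrypoint followed by the role's args in order; `build_command` is a hypothetical helper used only for illustration.

```python
import shlex

# Same fixed arguments as in torchx_app.py above.
image_classification_args = [
    "-m", "run",
    "--config-name",
    "train_app",
    "--config-path",
    "torchrecipes/vision/image_classification/conf",
]


def build_command(entrypoint, base_args, *job_args):
    """Join the entrypoint, fixed args, and per-job args into one shell string."""
    return shlex.join([entrypoint, *base_args, *job_args])


if __name__ == "__main__":
    # Extra job arguments are appended after the fixed ones.
    print(build_command("python", image_classification_args, "trainer.fast_dev_run=True"))
```

Anything passed after the app name on the torchx command line ends up in `job_args`, so it lands at the end of the Python invocation.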

Now that we have created a torchx app, we are (almost) ready for launching a job!

First, create a symlink to launcher/run.py at the top level of the repo:

ln -s torchrecipes/launcher/run.py ./run.py

Then we are ready to go! Launch the image_classification recipe with the following command:

torchx run --scheduler local_cwd torchrecipes/launcher/torchx_app.py:torchx_app trainer.fast_dev_run=True trainer.checkpoint_callback=False +tb_save_dir=/tmp/
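
The trailing arguments in this command are Hydra overrides. In Hydra's override syntax, `key=value` changes a key that already exists in the config, while a leading `+` adds a key that is not there yet. The sketch below only classifies the tokens to make that distinction visible; it is not Hydra's parser, and `classify_override` is a hypothetical helper.

```python
def classify_override(token):
    """Split a Hydra-style override token into (action, key, value)."""
    key, _, value = token.partition("=")
    if key.startswith("+"):
        # "+key=value" adds a key that is absent from the config.
        return ("add", key[1:], value)
    # "key=value" overrides a key the config already defines.
    return ("override", key, value)


# The overrides used in the torchx command above.
overrides = [
    "trainer.fast_dev_run=True",
    "trainer.checkpoint_callback=False",
    "+tb_save_dir=/tmp/",
]

if __name__ == "__main__":
    for token in overrides:
        print(classify_override(token))
```

Under this reading, the first two tokens change existing trainer settings and `+tb_save_dir` introduces a new top-level key. Values are kept as strings here; Hydra itself does the type conversion.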

Release

# install torchrecipes
pip install torchrecipes

Contributing

We welcome PRs! See the CONTRIBUTING file.

License

torchrecipes is BSD licensed, as found in the LICENSE file.

Comments
  • [paved path] basic charnn with GPT example

    1. We decided to put paved path examples in the torchrecipes repo.
    2. This is an initial commit based on previous work: https://github.com/aivanou/disttraining and https://github.com/dracifer/disttraining
    3. It's a basic example without any external dependencies (mlflow logging, airflow, torchx, etc.), which will be added soon.

    Testing:

    • single GPU: python main.py

    • multi GPUs: torchrun --nnodes 1 --nproc_per_node 4 --rdzv_backend c10d --rdzv_endpoint localhost:29500 main.py

    • torchx single GPU: torchx run -s local_cwd dist.ddp -j 1x1 --script main.py

    • torchx multi GPUs: torchx run -s local_cwd dist.ddp -j 1x4 --script main.py

    CLA Signed 
    opened by hudeven 16
  • Modifying interaction layer to include 2 MLPs in DLRM

    Summary: X-link: https://github.com/pytorch/torchrec/pull/382

    X-link: https://github.com/facebookresearch/dlrm/pull/242

    This diff adds 2 MLPs to the interaction layer in DLRM for the MLPerf update. The new DLRM module, DLRMV2, is enabled with the --dlrmv2 argument. The additional arguments --interaction_branch1_layer_sizes and --interaction_branch2_layer_sizes pass in the MLP sizes. The output dimension of the interaction MLPs must be a multiple of the embedding dimension.

    DLRMTrain now takes in a DLRM/DLRMV2 module at construction time.

    Reviewed By: colin2328, samiwilf

    Differential Revision: D35861688

    CLA Signed fb-exported 
    opened by narayanan2004 2
  • Failure of installing from source code using Python 3.7

    I'm using Python 3.7.13, I checked out master branch and tried to install from source, but it failed with the below error.

    [root@~/recipes #]pip3 install -e .
    Obtaining file:///root/recipes
      Installing build dependencies ... done
      Getting requirements to build wheel ... error
      ERROR: Command errored out with exit status 1:
       command: /usr/bin/python3.7 /usr/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpawqnwy25
           cwd: /root/recipes
      Complete output (17 lines):
      Traceback (most recent call last):
        File "/usr/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 280, in <module>
          main()
        File "/usr/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 263, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/usr/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py", line 114, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-j_o4gy55/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 178, in get_requires_for_build_wheel
          config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-j_o4gy55/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 159, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-j_o4gy55/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 174, in run_setup
          exec(compile(code, __file__, 'exec'), locals())
        File "setup.py", line 27
          if version := os.getenv("BUILD_VERSION"):
                      ^
      SyntaxError: invalid syntax
      ----------------------------------------
    ERROR: Command errored out with exit status 1: /usr/bin/python3.7 /usr/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpawqnwy25 Check the logs for full command output. 
    
    opened by chongxiaoc 2
  • [easy] fix pytest error related to test name conflict

    To reproduce the error, run pytest in "recipes/torchrecipes"; it fails with an error like:

    import file mismatch: imported module 'test_train_app' has this file attribute: /Users/stevenliu/workspace/torchrecipes/recipes/torchrecipes/audio/source_separation/tests/test_train_app.py which is not the same as the test file we want to collect: /Users/stevenliu/workspace/torchrecipes/recipes/torchrecipes/vision/image_classification/tests/test_train_app.py

    Adding __init__.py fixed the issue

    CLA Signed 
    opened by hudeven 2
  • Export transform and model for torchscript, quantization and deployment

    It's troublesome to deal with the transform and the model separately during TorchScript export, quantization, and deployment with TorchServe. So I added a CombinedModule wrapper to hold both, so that it can consume raw input (text) and produce the prediction (generated text) end to end.

    CLA Signed 
    opened by hudeven 1
  • [paved path] add lintrunner following pytorch

    Stack from ghstack (oldest at bottom):

    • #30
    • -> #29

    Test plan: $ pip install lintrunner $ lintrunner init $ lintrunner image

    Differential Revision: D38983209

    CLA Signed 
    opened by hudeven 1
  • Fix DMP state dict bug when wrapped with lightning

    Summary: The state dict was empty when calling state dict method from a lightning module wrapping DMP. The destination dict was not being passed through.

    Differential Revision: D35783627

    CLA Signed fb-exported 
    opened by joshuadeng 1
  • Update oss Pyre configuration to preemptively guard again upcoming semantic changes

    Summary: Pyre is going to have a semantic change in its configuration: D35695552.

    Basically, we are changing the default behavior on how search paths are discovered. Pre-existing configurations need to explicitly opt-in to the old behavior -- otherwise, they may risk breaking their type check setups.

    The added option will lead to a Pyre warning for now, but that warning would go away on the next Pyre upgrade.

    Differential Revision: D35724336

    CLA Signed fb-exported 
    opened by grievejia 1
  • Add callout items to the Docs landing page (#12196)

    opened by tangbinh 1
  • add best_model_path to TrainOutput

    Summary: It's useful to return the best_model_path after training. e.g. F6 + multimodality needs it to publish the best model

    Differential Revision: D33716125

    CLA Signed fb-exported 
    opened by hudeven 1
  • Re-sync with internal repository

    The internal and external repositories are out of sync. This attempts to bring them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.

    CLA Signed fh:direct-merge-enabled 
    opened by facebook-github-bot 0
  • [paved path] add lintrunner following pytorch

    It's annoying to fix lint errors while importing code into the internal repo. We can use the same linter (ufmt) in OSS, following what PyTorch does: https://github.com/pytorch/pytorch/wiki/lintrunner

    Test plan: $ pip install lintrunner $ lintrunner init $ lintrunner image

    CLA Signed 
    opened by hudeven 4
  • How to install ai_codesign?

    I'm running torchrec lightning recipe on Ray cluster by following https://github.com/facebookresearch/recipes/tree/main/torchrecipes/rec. Command: torchx run -s ray -cfg working_dir=.,dashboard_address=localhost:31024 dist.ddp -j 1x4 --gpu 4 --script ./dlrm_main.py

    Error:

    ray/0 (CommandActor pid=2748) [2]:Traceback (most recent call last):
    ray/0 (CommandActor pid=2748) [2]:  File "./dlrm_main.py", line 25, in <module>
    ray/0 (CommandActor pid=2748) [2]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
    ray/0 (CommandActor pid=2748) [2]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
    ray/0 (CommandActor pid=2748) [2]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
    ray/0 (CommandActor pid=2748) [2]:ModuleNotFoundError: No module named 'ai_codesign'
    ray/0 (CommandActor pid=2748) [3]:Traceback (most recent call last):
    ray/0 (CommandActor pid=2748) [3]:  File "./dlrm_main.py", line 25, in <module>
    ray/0 (CommandActor pid=2748) [3]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
    ray/0 (CommandActor pid=2748) [3]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
    ray/0 (CommandActor pid=2748) [3]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
    ray/0 (CommandActor pid=2748) [3]:ModuleNotFoundError: No module named 'ai_codesign'
    ray/0 (CommandActor pid=2748) [1]:Traceback (most recent call last):
    ray/0 (CommandActor pid=2748) [1]:  File "./dlrm_main.py", line 25, in <module>
    ray/0 (CommandActor pid=2748) [1]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
    ray/0 (CommandActor pid=2748) [1]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
    ray/0 (CommandActor pid=2748) [1]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
    ray/0 (CommandActor pid=2748) [1]:ModuleNotFoundError: No module named 'ai_codesign'
    ray/0 (CommandActor pid=2748) [0]:Traceback (most recent call last):
    ray/0 (CommandActor pid=2748) [0]:  File "./dlrm_main.py", line 25, in <module>
    ray/0 (CommandActor pid=2748) [0]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
    ray/0 (CommandActor pid=2748) [0]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
    ray/0 (CommandActor pid=2748) [0]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
    ray/0 (CommandActor pid=2748) [0]:ModuleNotFoundError: No module named 'ai_codesign'
    

    Env: Python 3.7, Torchrec 0.1.1, Torch 1.11.0+cu113, TorchX 0.2.0.dev0, Lightning 1.6.3

    I was trying to install ai_codesign from pip but it didn't work out for me.

    [root@~/recipes/torchrecipes/rec #]pip3 install ai-codesign
    ERROR: Could not find a version that satisfies the requirement ai-codesign (from versions: none)
    ERROR: No matching distribution found for ai-codesign
    WARNING: You are using pip version 20.2.2; however, version 22.1.2 is available.
    You should consider upgrading via the '/usr/bin/python3.7 -m pip install --upgrade pip' command.
    [root@~/recipes/torchrecipes/rec #]pip3 install ai_codesign
    ERROR: Could not find a version that satisfies the requirement ai_codesign (from versions: none)
    ERROR: No matching distribution found for ai_codesign
    WARNING: You are using pip version 20.2.2; however, version 22.1.2 is available.
    You should consider upgrading via the '/usr/bin/python3.7 -m pip install --upgrade pip' command.
    

    How can I install this module?

    cc @colin2328

    opened by chongxiaoc 1
  • Leveraging hydra-zen for auto-generated and validated Hydra configs

    Hello! Thanks for creating this repo -- it looks like it will be very useful. Hydra + PyTorch Lightning is definitely a killer one-two punch!

    I wanted to bring your attention to hydra-zen, which is a library that adds functionality to Hydra and simplifies the process of writing configs. I think that this could be very handy for recipes.

    Rather than hand-writing YAML configs, you can auto-generate configs for objects using hydra_zen.builds. Furthermore, these config-generating functions provide strict runtime and static checking of the configs, making it trivial to validate all of the recipes' configs during nightly builds.

    Another perk of leveraging hydra-zen is that it can make recipes less dependent on a specific directory layout; e.g. you could avoid requiring users to add a file specifically to torchrecipes/launcher/.

    I could go on and on, but I'll leave it at that. I would be happy to provide more details, answer questions, etc. I hope that you find this to be useful 😄

    Here are our docs: https://mit-ll-responsible-ai.github.io/hydra-zen/
    Our code (our project is very well-tested!): https://github.com/mit-ll-responsible-ai/hydra-zen
    And a quick example of using hydra-zen + PyTorch Lightning: https://mit-ll-responsible-ai.github.io/hydra-zen/how_to/pytorch_lightning.html

    opened by rsokl 2
Owner
Meta Research