Flexible interface for high-performance research using SOTA Transformers, leveraging PyTorch Lightning, HuggingFace Transformers, and Hydra.

Overview



  • What is Lightning Transformers
  • Using Lightning Transformers
  • Docs
  • Community
  • License


Installation

Option 1: from PyPI

pip install lightning-transformers
# instead of: `python train.py ...`, run with:
pl-transformers-train ...
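
For example, the recipes written below as python train.py should map onto the installed entry point with the same Hydra overrides (a hedged sketch; check the docs for the exact console-script behaviour):

pl-transformers-train task=nlp/text_classification dataset=nlp/text_classification/emotion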

Option 2: from source

git clone https://github.com/PyTorchLightning/lightning-transformers.git
cd lightning-transformers
pip install .
python train.py ...
# the `pl-transformers-train` endpoint is also available!

What is Lightning-Transformers

Lightning Transformers offers a flexible interface for training and fine-tuning SOTA Transformer models using the PyTorch Lightning Trainer.

  • Train using HuggingFace Transformers models and datasets with Lightning custom Callbacks, Loggers, Accelerators and high performance scaling.
  • Seamless Memory and Speed Optimizations such as DeepSpeed ZeRO or FairScale Sharded Training with no code changes.
  • Powerful config composition backed by Hydra - Easily swap out models, optimizers, schedulers and many more configurations without touching the code.
  • Transformer Task Abstraction for Rapid Research & Experimentation - Built from the ground up to be task agnostic, the library supports creating transformer tasks across all modalities with little friction.

Lightning Transformers tasks allow you to train models using HuggingFace Transformer models and datasets, use Hydra to hotswap models, optimizers or schedulers, and leverage all the advanced features that Lightning has to offer, including custom Callbacks, Loggers, Accelerators and high-performance scaling, with minimal changes.
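
Depending on the installed version, the tasks can also be driven directly from Python. The following is a minimal sketch mirroring the text-classification examples further down this page; using the "emotion" dataset name from the Hydra config below is an assumption here:

import pytorch_lightning as pl
from transformers import AutoTokenizer

from lightning_transformers.task.nlp.text_classification import (
    TextClassificationDataModule,
    TextClassificationTransformer,
)

# Sketch only: fine-tune bert-base-cased on the emotion dataset.
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path="bert-base-cased")
dm = TextClassificationDataModule(
    batch_size=16,
    dataset_name="emotion",
    max_length=128,
    tokenizer=tokenizer,
)
model = TextClassificationTransformer(
    pretrained_model_name_or_path="bert-base-cased",
    num_labels=dm.num_classes,
)
trainer = pl.Trainer(accelerator="auto", devices="auto", max_epochs=1)
trainer.fit(model, dm)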

Using Lightning-Transformers

Grid is our platform for training models at scale on the cloud! Sign up here.

Task quick commands (each task also links to a Grid run):

  • Language Modeling: python train.py task=nlp/language_modeling dataset=nlp/language_modeling/wikitext trainer.gpus=1 training.batch_size=8
  • Multiple Choice: python train.py task=nlp/multiple_choice dataset=nlp/multiple_choice/race trainer.gpus=1
  • Question Answering: python train.py task=nlp/question_answering dataset=nlp/question_answering/squad trainer.gpus=1
  • Summarization: python train.py task=nlp/summarization dataset=nlp/summarization/xsum trainer.gpus=1
  • Text Classification: python train.py task=nlp/text_classification dataset=nlp/text_classification/emotion trainer.gpus=1
  • Token Classification: python train.py task=nlp/token_classification dataset=nlp/token_classification/conll trainer.gpus=1
  • Translation: python train.py task=nlp/translation dataset=nlp/translation/wmt16 trainer.gpus=1

Quick recipes

Train bert-base-cased on the CARER emotion dataset using the Text Classification task.

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion
See the composed Hydra config used under the hood:
optimizer:
  _target_: torch.optim.AdamW
  lr: ${training.lr}
  weight_decay: 0.001
scheduler:
  _target_: transformers.get_linear_schedule_with_warmup
  num_training_steps: -1
  num_warmup_steps: 0.1
training:
  run_test_after_fit: true
  lr: 5.0e-05
  output_dir: .
  batch_size: 16
  num_workers: 16
trainer:
  _target_: pytorch_lightning.Trainer
  logger: true
  checkpoint_callback: true
  callbacks: null
  default_root_dir: null
  gradient_clip_val: 0.0
  process_position: 0
  num_nodes: 1
  num_processes: 1
  gpus: null
  auto_select_gpus: false
  tpu_cores: null
  log_gpu_memory: null
  progress_bar_refresh_rate: 1
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: 1
  max_steps: null
  min_steps: null
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  val_check_interval: 1.0
  flush_logs_every_n_steps: 100
  log_every_n_steps: 50
  accelerator: null
  sync_batchnorm: false
  precision: 32
  weights_summary: top
  weights_save_path: null
  num_sanity_val_steps: 2
  truncated_bptt_steps: null
  resume_from_checkpoint: null
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_epoch: false
  auto_lr_find: false
  replace_sampler_ddp: true
  terminate_on_nan: false
  auto_scale_batch_size: false
  prepare_data_per_node: true
  plugins: null
  amp_backend: native
  amp_level: O2
  move_metrics_to_cpu: false
task:
  _recursive_: false
  backbone: ${backbone}
  optimizer: ${optimizer}
  scheduler: ${scheduler}
  _target_: lightning_transformers.task.nlp.text_classification.TextClassificationTransformer
  downstream_model_type: transformers.AutoModelForSequenceClassification
dataset:
  cfg:
    batch_size: ${training.batch_size}
    num_workers: ${training.num_workers}
    dataset_name: emotion
    dataset_config_name: null
    train_file: null
    validation_file: null
    test_file: null
    train_val_split: null
    max_samples: null
    cache_dir: null
    padding: max_length
    truncation: only_first
    preprocessing_num_workers: 1
    load_from_cache_file: true
    max_length: 128
    limit_train_samples: null
    limit_val_samples: null
    limit_test_samples: null
  _target_: lightning_transformers.task.nlp.text_classification.TextClassificationDataModule
experiment_name: ${now:%Y-%m-%d}_${now:%H-%M-%S}
log: false
ignore_warnings: true
tokenizer:
  _target_: transformers.AutoTokenizer.from_pretrained
  pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
  use_fast: true
backbone:
  pretrained_model_name_or_path: bert-base-cased

Swap the backbone to RoBERTa and the optimizer to RMSprop:

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    backbone.pretrained_model_name_or_path=roberta-base \
    optimizer=rmsprop
See the changed Hydra config under the hood:
 optimizer:
-  _target_: torch.optim.AdamW
+  _target_: torch.optim.RMSprop
   lr: ${training.lr}
-  weight_decay: 0.001
 scheduler:
   _target_: transformers.get_linear_schedule_with_warmup
   num_training_steps: -1
....
tokenizer:
   pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
   use_fast: true
 backbone:
-  pretrained_model_name_or_path: bert-base-cased
+  pretrained_model_name_or_path: roberta-base

Enable Sharded Training.

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    trainer=sharded
See the changed Hydra config under the hood. Without modifying any code, the config is updated automatically for Sharded Training:
optimizer:
   _target_: torch.optim.AdamW
   lr: ${training.lr}
trainer:
   process_position: 0
   num_nodes: 1
   num_processes: 1
-  gpus: null
+  gpus: 1
   auto_select_gpus: false
   tpu_cores: null
   log_gpu_memory: null
   ...
   log_every_n_steps: 50
-  accelerator: null
+  accelerator: ddp
   sync_batchnorm: false
-  precision: 32
+  precision: 16
   weights_summary: top
   ....
   terminate_on_nan: false
   auto_scale_batch_size: false
   prepare_data_per_node: true
-  plugins: null
+  plugins:
+    _target_: pytorch_lightning.plugins.DDPShardedPlugin
   amp_backend: native
   amp_level: O2
   move_metrics_to_cpu: false
tokenizer:
   pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
   use_fast: true
 backbone:
   pretrained_model_name_or_path: bert-base-cased
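
For reference, a rough Python equivalent of the trainer=sharded config above, assuming the PyTorch Lightning 1.x API that these configs target:

import pytorch_lightning as pl
from pytorch_lightning.plugins import DDPShardedPlugin

# Mirrors the sharded config: 1 GPU, DDP, fp16, FairScale sharded plugin.
trainer = pl.Trainer(
    gpus=1,
    accelerator="ddp",
    precision=16,
    plugins=DDPShardedPlugin(),
)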

Enable DeepSpeed ZeRO Training.

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    trainer=deepspeed
See the changed Hydra config under the hood. Without modifying any code, the config is updated automatically for DeepSpeed:
optimizer:
   _target_: torch.optim.AdamW
   lr: ${training.lr}
trainer:
   process_position: 0
   num_nodes: 1
   num_processes: 1
-  gpus: null
+  gpus: 1
   auto_select_gpus: false
   tpu_cores: null
   log_gpu_memory: null
   ...
   val_check_interval: 1.0
   flush_logs_every_n_steps: 100
   log_every_n_steps: 50
-  accelerator: null
+  accelerator: ddp
   sync_batchnorm: false
-  precision: 32
+  precision: 16
   ...
-  plugins: null
+  plugins:
+    _target_: pytorch_lightning.plugins.DeepSpeedPlugin
+    stage: 2
+    cpu_offload: true
   amp_backend: native
   amp_level: O2
   move_metrics_to_cpu: false
...
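
Again for reference, a rough Python equivalent of the trainer=deepspeed config above, assuming the PyTorch Lightning 1.x API:

import pytorch_lightning as pl
from pytorch_lightning.plugins import DeepSpeedPlugin

# Mirrors the DeepSpeed config: 1 GPU, DDP, fp16, ZeRO stage 2 with CPU offload.
trainer = pl.Trainer(
    gpus=1,
    accelerator="ddp",
    precision=16,
    plugins=DeepSpeedPlugin(stage=2, cpu_offload=True),
)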

Train with a pre-trained t5-base backbone on the XSUM dataset, using the Summarization task.

python train.py \
    task=nlp/summarization \
    dataset=nlp/summarization/xsum \
    backbone.pretrained_model_name_or_path=t5-base

Train with a pre-trained mt5-base backbone on the WMT16 dataset, using the Translation task with 2 GPUs.

python train.py \
    task=nlp/translation \
    dataset=nlp/translation/wmt16 \
    backbone.pretrained_model_name_or_path=google/mt5-base \
    trainer.gpus=2

Custom Files & Datasets

You can train, validate and test Lightning Transformers tasks on your own data files, and you can extend datasets for custom processing and your own tasks.

How to train, validate and test on custom files

How to extend datasets
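
As a hedged sketch (see the linked guide for the exact options), the dataset configs shown above expose train_file, validation_file and test_file keys, so custom files can in principle be passed as Hydra overrides; the paths here are placeholders:

python train.py \
    task=nlp/text_classification \
    dataset.cfg.train_file=path/to/train.json \
    dataset.cfg.validation_file=path/to/valid.json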

Custom Tasks

Extending the Language Modeling Task

Contribute

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Community

For help or questions, join our huge community on Slack!

License

Please observe the Apache 2.0 license that is listed in this repository. In addition, the Lightning framework is Patent Pending.

Comments
  • sparseml integration

    Closes #196

    Add an option to use SparseML. This is still a little rough around the edges, but I welcome feedback! Implementation on Google Colab.

    Still needs:

    1. Implementation of MODELS_PATH and RECIPE_PATH through Hydra instead of environment variable
    2. "test" stage needs to work, only training and evaluation work at the moment
    3. stylistic coding changes

    Will continue to add more changes after my reactions engineering exam tomorrow. Feel free to give me pointers or links to tutorials if you see any changes I need to make.

    enhancement 
    opened by mathemusician 12
  • compact packaging - all inside namespace

    🚀 Feature

    refactor to make the package more compact, move configs and train inside

    Motivation

    be able to call it from everywhere as python -m lightning_transformers.train --some args

    Pitch

    Easier to use from everywhere; reduces collisions with configs (if any) from other packages

    Alternatives

    Additional context

    enhancement help wanted wontfix 
    opened by Borda 11
  • Question answering example throws an exception even if sanity check is skipped

    🐛 Bug

    Running the squad example python train.py task=nlp/question_answering dataset=nlp/question_answering/squad trainer.gpus=[1] training.batch_size=8 trainer.num_sanity_val_steps=0 throws an exception while finalizing training. This is not a replication of #218

    To Reproduce

    Steps to reproduce the behavior:

    1. Run python train.py task=nlp/question_answering dataset=nlp/question_answering/squad trainer.gpus=[1] training.batch_size=8 trainer.num_sanity_val_steps=0
    2. See error
    Epoch 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 12442/12445 [44:35<00:00,  4.65it/s, loss=0.957
    Error executing job with overrides: ['task=nlp/question_answering', 'dataset=nlp/question_answering/squad', 'trainer.gpus=[1]', 'training.batch_size=8', 'trainer.num_sanity_val_steps=0']3 [01:39<00:00, 13.59it/s]
    Traceback (most recent call last):
      File "/home/vrt/lightning-transformers/train.py", line 10, in hydra_entry
        main(cfg)
      File "/home/vrt/lightning-transformers/lightning_transformers/cli/train.py", line 69, in main
        run(
      File "/home/vrt/lightning-transformers/lightning_transformers/cli/train.py", line 60, in run
        trainer.fit(model, datamodule=data_module)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
        self._call_and_handle_interrupt(
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
        self._run(model, ckpt_path=ckpt_path)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
        self._dispatch()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
        self.training_type_plugin.start_training(self)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
        self._results = trainer.run_stage()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
        return self._run_train()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
        self.fit_loop.run()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
        self.advance(*args, **kwargs)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
        self.epoch_loop.run(data_fetcher)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 146, in run
        self.on_advance_end()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 242, in on_advance_end
        self._run_validation()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 337, in _run_validation
        self.val_loop.run()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 151, in run
        output = self.on_run_end()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 134, in on_run_end
        self._on_evaluation_epoch_end()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 241, in _on_evaluation_epoch_end
        self.trainer.call_hook(hook_name)
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1501, in call_hook
        output = model_fx(*args, **kwargs)
      File "/home/vrt/lightning-transformers/lightning_transformers/task/nlp/question_answering/model.py", line 59, in on_validation_epoch_end
        metric_dict = self.metric.compute()
      File "/home/vrt/miniconda3/lib/python3.9/site-packages/torchmetrics/metric.py", line 380, in wrapped_func
        value = compute(*args, **kwargs)
      File "/home/vrt/lightning-transformers/lightning_transformers/task/nlp/question_answering/datasets/squad/metric.py", line 23, in compute
        example_ids = [reverse_lookup[i.item()] for i in self.example_ids]
      File "/home/vrt/lightning-transformers/lightning_transformers/task/nlp/question_answering/datasets/squad/metric.py", line 23, in <listcomp>
        example_ids = [reverse_lookup[i.item()] for i in self.example_ids]
    KeyError: 0
    
    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
    
    

    Environment

    • PyTorch Version: 1.6.0
    • OS: Ubuntu 18.04.6 LTS
    • How you installed PyTorch: conda
    • Python version: 3.9.7
    • CUDA/cuDNN version: 11.4
    • GPU models and configuration: 2x NVIDIA GeForce RTX 2080 Ti (First device not used)
    • Any other relevant information: The same error occurs during sanity check if trainer.num_sanity_val_steps=-1 is used, as in #184
    bug / fix help wanted 
    opened by Pointy-Hat 10
  • Glue MNLI task fails due to missing 'validation' key in dataset

    🐛 Bug

    MNLI has two validation and two test sets, called validation_matched, validation_mismatched, test_matched and test_mismatched. I assume that this was not taken into account in the datamodule.

    To Reproduce

    Steps to reproduce the behavior:

    Run the following command:

    python train.py task=nlp/text_classification dataset=nlp/text_classification/glue dataset.cfg.dataset_config_name=mnli
    

    Expected behavior

    I would expect the dataloader to handle the special case of MNLI and load validation_matched and test_matched by default. Maybe add an option to additionally test on test_mismatched as well, when desired.

    Environment

    A standard pip install from source, as of 2021.12.01. Fails with or without GPU.

    bug / fix help wanted 
    opened by mariomeissner 10
  • Fix protobuf version

    See https://github.com/Lightning-AI/lightning-transformers/pull/284#issuecomment-1243050284

    This is based upon @rohitgr7 's awesome https://github.com/Lightning-AI/lightning-transformers/pull/284

    opened by turian 8
  • Should we get rid of Hydra?

    Motivation

    This is motivated by the recent development of LightningCLI, which IMO is simpler and better supported, primarily because it exists within Lightning.

    Over time I've noticed that, due to changes upstream in Lightning, the Hydra default configs go out of sync every so often.

    I also notice that there are fundamental issues such as this one: https://github.com/PyTorchLightning/lightning-transformers/issues/149, where the configs exist within the package. We assume that the user will always clone the repo and make their modifications there. I would be curious whether this works for users currently.

    Pitch

    Remove Hydra, treating this library more as a set of components people can import into their own scripts. If we'd like to keep the ability to run training and testing from the library, we should use the LightningCLI instead of Hydra.

    Alternatives

    Keep as is :)

    cc @mathemusician @Borda @carmocca

    help wanted User Experience refactor & code health 
    opened by SeanNaren 8
  • Use and rename special subset name if provided

    This PR addresses the issue; closes #213

    I modify the load_dataset function in core/nlp/data.py to find and use special subset names if provided, then rename them back to the standard names to avoid failures further along.

    This cannot be done in the [subset]_dataloader functions as initially proposed, because other functions such as _select_samples are called before that (and would not work well).

    Two main reasons for patching core/nlp/data.py instead of core/data.py:

    1. load_dataset is not present in the former. Could create a function there and then let the function in the latter call that first...
    2. Never saw this issue happen in speech or vision, it really seems to be an MNLI only thing for now.

    It's up to debate if this should really be moved to core/data.py instead.

    This PR should likely also include documentation changes, but I didn't check yet how to do that. Pointers appreciated.
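
    For illustration only (a hypothetical helper, not the exact code in this PR; test_subset_name is assumed by analogy with the validation_subset_name option used below), the renaming idea is roughly:

    from datasets import DatasetDict

    def rename_special_subsets(ds: DatasetDict, validation_subset_name=None, test_subset_name=None) -> DatasetDict:
        # Map non-standard split names (e.g. MNLI's validation_matched) back to the
        # standard "validation"/"test" keys so later processing keeps working.
        if validation_subset_name:
            ds["validation"] = ds.pop(validation_subset_name)
        if test_subset_name:
            ds["test"] = ds.pop(test_subset_name)
        return ds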

    I tested the following commands successfully:

    # A normal glue task without any subset name issues
    CUDA_VISIBLE_DEVICES=3 python train.py task=nlp/text_classification dataset=nlp/text_classification/glue dataset.cfg.dataset_config_name=sst2 trainer.val_check_interval=0.01 training.batch_size=128 trainer.gpus=1
    
    # MNLI issue with renamed subset, smaller validation size and more frequent validation to test that subsampling also works fine
    CUDA_VISIBLE_DEVICES=3 python train.py task=nlp/text_classification dataset=nlp/text_classification/glue dataset.cfg.dataset_config_name=mnli ++dataset.cfg.validation_subset_name=validation_matched trainer.val_check_interval=0.01 training.batch_size=128 trainer.gpus=1 ++dataset.cfg.limit_val_samples=256
    
    enhancement 
    opened by mariomeissner 7
  • Add an option to use Huggingface metrics

    🚀 Feature

    Support Huggingface metrics.

    Motivation

    Torchmetrics is great, but there are many metrics that are not available there. Luckily, Huggingface has implemented lots of them. Can you please add an easy way to use metrics from Huggingface?

    Pitch

    Specifying a metric from Huggingface, making sure I give it the correct arguments without needing to implement it on my own.

    enhancement help wanted 
    opened by yuvalkirstain 7
  • Custom data file for classification seems to be failing

    🐛 Bug

    When training a classification model on a custom data file, training fails because it expects num_classes

    To Reproduce

    Use this Colab: https://colab.research.google.com/drive/1uamw6SNaOr_4ch24JNxAj2yfgLUKfJqO?usp=sharing

    Error:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/cli/train.py", line 84, in hydra_entry
        main(cfg)
      File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/cli/train.py", line 78, in main
        logger=logger,
      File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/cli/train.py", line 61, in run
        trainer.fit(model, datamodule=data_module)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit
        self.call_setup_hook(model)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1066, in call_setup_hook
        model.setup(stage_name)
      File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/core/model.py", line 88, in setup
        self.configure_metrics(stage)
      File "/usr/local/lib/python3.7/dist-packages/lightning_transformers/task/nlp/text_classification/model.py", line 61, in configure_metrics
        self.prec = Precision(num_classes=self.num_classes)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 948, in __getattr__
        type(self).__name__, name))
    AttributeError: 'TextClassificationTransformer' object has no attribute 'num_classes'
    
    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
    

    Expected behavior

    It should start training.

    Environment

    check the notebook

    bug / fix help wanted wontfix 
    opened by zippeurfou 7
  • Ver0.2.4 compatibility PL v1.8

    Suggest to assign to

    Ver 0.2.4 raises a new NotImplementedError: LightningDataModule.on_load_checkpoint was deprecated in v1.6 and is no longer supported as of v1.8. Use load_state_dict instead.

    🐛 Bug

    trainer.fit fails owing to a PyTorch Lightning configuration check failure:

    > File ~\miniconda3\envs\UnBias-99-5\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:61, in verify_loop_configurations(trainer)
    >      59 _check_deprecated_logger_methods(trainer)
    >      60 # TODO: Delete this check in v2.0
    > ---> 61 _check_unsupported_datamodule_hooks(trainer)
    

    To Reproduce

    trainer.fit(model,datamodel)

    > ---------------------------------------------------------------------------
    > NotImplementedError                       Traceback (most recent call last)
    > Input In [58], in <cell line: 1>()
    > ----> 1 trainer.fit(model,dm)
    > 
    > File ~\miniconda3\envs\MyEnv\lib\site-packages\pytorch_lightning\trainer\trainer.py:579, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    >     577     raise TypeError(f"`Trainer.fit()` requires a `LightningModule`, got: {model.__class__.__qualname__}")
    >     578 self.strategy._lightning_module = model
    > --> 579 call._call_and_handle_interrupt(
    >     580     self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
    >     581 )
    > 
    > File ~\miniconda3\envs\MyEnv\lib\site-packages\pytorch_lightning\trainer\call.py:38, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
    >      36         return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
    >      37     else:
    > ---> 38         return trainer_fn(*args, **kwargs)
    >      40 except _TunerExitException:
    >      41     trainer._call_teardown_hook()
    > 
    > File ~\miniconda3\envs\MyEnv\lib\site-packages\pytorch_lightning\trainer\trainer.py:621, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    >     614 ckpt_path = ckpt_path or self.resume_from_checkpoint
    >     615 self._ckpt_path = self._checkpoint_connector._set_ckpt_path(
    >     616     self.state.fn,
    >     617     ckpt_path,  # type: ignore[arg-type]
    >     618     model_provided=True,
    >     619     model_connected=self.lightning_module is not None,
    >     620 )
    > --> 621 self._run(model, ckpt_path=self.ckpt_path)
    >     623 assert self.state.stopped
    >     624 self.training = False
    > 
    > File ~\miniconda3\envs\MyEnv\lib\site-packages\pytorch_lightning\trainer\trainer.py:984, in Trainer._run(self, model, ckpt_path)
    >     981 self._callback_connector._attach_model_callbacks()
    >     982 self._callback_connector._attach_model_logging_functions()
    > --> 984 verify_loop_configurations(self)
    >     986 # hook
    >     987 log.detail(f"{self.__class__.__name__}: preparing data")
    > 
    > File ~\miniconda3\envs\MyEnv\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:61, in verify_loop_configurations(trainer)
    >      59 _check_deprecated_logger_methods(trainer)
    >      60 # TODO: Delete this check in v2.0
    > ---> 61 _check_unsupported_datamodule_hooks(trainer)
    > 
    > File ~\miniconda3\envs\MyEnv\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:295, in _check_unsupported_datamodule_hooks(trainer)
    >     290     raise NotImplementedError(
    >     291         "`LightningDataModule.on_save_checkpoint` was deprecated in v1.6 and is no longer supported as of v1.8."
    >     292         " Use `state_dict` instead."
    >     293     )
    >     294 if is_overridden("on_load_checkpoint", datahook_selector.datamodule):
    > --> 295     raise NotImplementedError(
    >     296         "`LightningDataModule.on_load_checkpoint` was deprecated in v1.6 and is no longer supported as of v1.8."
    >     297         " Use `load_state_dict` instead."
    >     298     )
    > 
    > NotImplementedError: `LightningDataModule.on_load_checkpoint` was deprecated in v1.6 and is no longer supported as of v1.8. Use `load_state_dict` instead.
    

    Code sample

    import os
    from accelerate import (init_empty_weights)
    from transformers import (FlaubertTokenizer, FlaubertWithLMHeadModel, TrainingArguments, DataCollatorForLanguageModeling) 
    from datasets import load_from_disk
    import pytorch_lightning as pl
    from lightning_transformers.task.nlp.masked_language_modeling import (MaskedLanguageModelingTransformer, MaskedLanguageModelingDataModule)
    
    dataset = load_from_disk(os.path.join(drive_letter,os.path.join(dataset_dir, 'dataset')))
    dataset = dataset.remove_columns(["text"])
    dataset = dataset.shuffle()
    dataset.set_format(type='torch', columns=['input_ids', 'token_type_ids', 'attention_mask', 'special_tokens_mask', 'labels'], device=device) 
    
    LM_tokenizer = FlaubertTokenizer.from_pretrained('./tokenizer/FlauBERT_tokenizer', do_lowercase=False)
    
    with init_empty_weights():
      model = MaskedLanguageModelingTransformer(
                  pretrained_model=LMhead_model,
                  tokenizer=LM_tokenizer,
                  load_weights=False,
                  low_cpu_mem_usage=True,
                  device_map="auto"
                  #deepspeed_sharding=True,  # Linux only, defer initialization of the model to shard/load pre-train weights
              )
    
    batch_size=2
    
    datamodel = MaskedLanguageModelingDataModule(
        batch_size=batch_size,
        dataset=dataset,
        tokenizer=LM_tokenizer,
        num_workers=os.cpu_count())
    
    trainer = pl.Trainer(
        accelerator="auto",
        devices="auto",
        #strategy="deepspeed_stage_3", # linux only
        precision=16,
        max_epochs=1,
        #strategy='dp',
        #auto_lr_find=True,
        #detect_anomaly=True
        #val_check_interval=0
        #progress_bar_refresh_rate=50
    )
    
    trainer.fit(model,datamodel)
    

    Expected behavior

    Trainer fits

    Environment

    • PyTorch Version (e.g., 1.0): 1.2.1
    • OS (e.g., Linux): Windows 10
    • How you installed PyTorch (conda, pip, source): conda

    py3.9_cuda11.6_cudnn8_0 pytorch

    • Python version: 3.9
    • CUDA/cuDNN version: CUDA 11.6, cuDNN 8.0
    • GPU models and configuration: NVIDIA Quadro RTX 3000
    • Any other relevant information: none

    Additional context

    Comes on top of 0.2.3 and despite 0.2.4 release

    bug / fix help wanted 
    opened by BenoitDalFerro 6
  • ViT Image Classification Support

    Starter code

    import pytorch_lightning as pl
    from transformers import AutoFeatureExtractor
    
    from lightning_transformers.task.vision.image_classification import (
        ImageClassificationDataConfig,
        ImageClassificationDataModule,
        ImageClassificationTransformer,
    )
    
    MODEL_NAME = "nateraw/vit-base-beans"
    feature_extractor = AutoFeatureExtractor.from_pretrained(pretrained_model_name_or_path=MODEL_NAME)
    
    dm = ImageClassificationDataModule(
        cfg=ImageClassificationDataConfig(
            batch_size=8,
            dataset_name="beans",
            num_workers=8
        ),
        feature_extractor=feature_extractor,
    )
    
    model = ImageClassificationTransformer(pretrained_model_name_or_path=MODEL_NAME)
    trainer = pl.Trainer(accelerator="gpu", max_epochs=5)
    trainer.fit(model, dm)
    
    opened by tanmoyio 6
  • `TransformerDataModule.setup()` run more than once unnecessarily

    🐛 Bug

    TransformerDataModule.setup() is run more than once unnecessarily. For example, when running the code included below, it runs setup() when calling dm.num_classes and then when calling trainer.fit(model, dm).

    setup() then calls self.load_dataset(), self.split_dataset(dataset) and self.process_data(dataset, stage=stage). Calling self.load_dataset() several times is not a big deal because it will load it from the cache, but the other two methods are expensive and I think it does not make sense to run them again (since they just override whatever self.ds was there before).

    To Reproduce

    Take the below example from the docs and just check the console output or run it in debug mode with a breakpoint. It can be seen that TransformerDataModule.setup() and the subsequent methods load_dataset(), split_dataset() and process_data() are run more than once.

    import pytorch_lightning as pl
    from transformers import AutoTokenizer
    
    from lightning_transformers.task.nlp.text_classification import (
        TextClassificationDataModule,
        TextClassificationTransformer,
    )
    
    tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path="bert-base-uncased")
    dm = TextClassificationDataModule(
        batch_size=1,
        dataset_name="glue",
        dataset_config_name="sst2",
        max_length=512,
        tokenizer=tokenizer,
    )
    model = TextClassificationTransformer(pretrained_model_name_or_path="bert-base-uncased", num_labels=dm.num_classes)
    trainer = pl.Trainer(accelerator="auto", devices="auto", max_epochs=1)
    
    trainer.fit(model, dm)
    

    Expected behavior

    Given that TransformerDataModule.setup() currently does the following:

    def setup(self, stage: Optional[str] = None): 
      dataset = self.load_dataset()
      dataset = self.split_dataset(dataset)
      dataset = self.process_data(dataset, stage=stage)
      self.ds = dataset
    

    Perhaps a way to avoid running it again would be creating the class attribute self.setup_stages_run = [] when the class is initialized and then defining the setup method as:

        def setup(self, stage: Optional[str] = None): 
            # Load and split dataset only if setup has not been run before
            if len(self.setup_stages_run) == 0: 
                dataset = self.load_dataset()
                dataset = self.split_dataset(dataset)
            else:
                dataset = self.ds
    
            # Process dataset only if setup has not been run before for this stage    
            if stage not in self.setup_stages_run:            
                self.ds = self.process_data(dataset, stage=stage)
                self.setup_stages_run.append(stage)
    

    Can create a PR if you think this makes sense. Thanks!

    enhancement help wanted 
    opened by RR-28023 0
  • Shuffling support

    🚀 Feature

    Add an option for data shuffling in core/data.py. Data shuffling is crucial for removing dataset structure bias.

    Motivation

    I noticed my model was not performing well when I was using a custom dataset, with spikes in performance across the epoch. I then realized it was because the class data was in sequence, and there was no shuffling performed by default. I then looked into the code but couldn't find any option to add shuffling in core/data.py. I then had to override three functions (train_dataloader, val_dataloader, test_dataloader) to get this functionality.

    Pitch

    Add a boolean shuffling argument in the constructor that enables this.
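
    A minimal sketch of the requested behaviour (hypothetical class and attribute names, not the library's code):

    import pytorch_lightning as pl
    from torch.utils.data import DataLoader

    class ShuffledDataModule(pl.LightningDataModule):
        # Hypothetical sketch: expose a constructor flag that shuffles the training split.
        def __init__(self, train_dataset, batch_size: int = 16, shuffle_train: bool = True):
            super().__init__()
            self.train_dataset = train_dataset
            self.batch_size = batch_size
            self.shuffle_train = shuffle_train

        def train_dataloader(self) -> DataLoader:
            # Only the training loader shuffles; val/test loaders would keep shuffle=False.
            return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=self.shuffle_train)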

    Alternatives

    Additional context

    enhancement help wanted 
    opened by juliusfrost 0
  • Jointly train Question Answering and Multiple Choice

    🚀 Feature: Multi task NLP model

    In an exam paper such as IELTS, there are several types of questions, such as Question Answering and Multiple Choice. The current implementation of lightning_transformers handles a single task well, but I wonder whether there is a way to jointly train two tasks at the same time? Because the context is shared between the two tasks, sharing the encoder would be beneficial.

    Alternatives

    I found a reference for doing this directly with Huggingface Transformers but don't know how to structure it to adapt it to Lightning Transformers. https://colab.research.google.com/github/zphang/zphang.github.io/blob/master/files/notebooks/Multi_task_Training_with_Transformers_NLP.ipynb#scrollTo=xW8bnTgCsx5c

    enhancement help wanted 
    opened by tmquan 1
  • Deepspeed sharding and load from checkpoint with custom lightning module - setup() not called during checkpoint loading

    ❓ Questions and Help

    Before asking:

    1. search the issues.
    2. search the docs.

    What is your question?

    Hi, I'm doing training from scratch using DeepSpeed, PyTorch Lightning, and Transformers in a multi-node setting, and wanted to know how to set up the code to handle loading from a PyTorch checkpoint.

    Going off of the docs here, I see that the model is intended to be defined in setup(). However, this doesn't work when loading from a state dict since setup is not called. What's the right way to structure the code here? Does enable_transformers_pretrained_deepspeed_sharding need to be called in setup or can it be called in the constructor?

    This has been my potential workaround in the constructor, because it does seem to fail on certain ranks

    def __init__(self, config):
            # irrelevant constructor things here
            try:
                enable_transformers_pretrained_deepspeed_sharding(self)
            except AttributeError:
                pl.utilities.rank_zero.rank_zero_warn(
                    "Transformers sharding initialization not enabled..."
                )
            # needed to load from checkpoint
            self.model = AutoModelForCausalLM.from_config(self.base_config)
    

    As opposed to:

        def setup(self, stage):
            if not hasattr(self, "model"):
                try:
                    enable_transformers_pretrained_deepspeed_sharding(self)
                    # sometimes using ddp for inference, so this will fail
                except AttributeError:
                    pl.utilities.rank_zero.rank_zero_warn(
                        "Transformers sharding initialization not enabled -  likely not using DeepSpeed..."
                    )
                self.model = AutoModelForCausalLM.from_config(self.base_config)
    

    Code

    What have you tried?

    What's your environment?

    Linux, conda/pip, deepspeed==0.7.3 pytorch-lightning==1.6.5 lighting-transformers==0.2.1

    • OS: [e.g. iOS, Linux, Win]
    • Packaging [e.g. pip, conda]
    • Version [e.g. 0.5.2.1]

    Thanks in advance for the help!

    question 
    opened by maxzvyagin 2
  • Can you demonstrate how to fine-tune a pretrained model on unlabeled data

    🚀 Feature

    Documentation or example showing how to fine-tune a pretrained model on unlabeled data.

    Motivation

    It's great to fine-tune your pretrained model on unlabeled data, so that, if you have precious few labels in the target domain, you have still adapted to that domain using unlabeled data.

    Pitch

    We have these super huge foundational models, but for niche domains without large datasets it's great to fine-tune. Examples:

    • Want to work on a particular style of text.
    • Want to fine-tune on a spoken language that it was not exposed to.
    • etc.

    Alternatives

    Hack around, maybe use hugging face. IDK?

    enhancement help wanted 
    opened by turian 6
  • TextClassificationTransformer should log torchmetrics object instead of computed Tensors

    🐛 Bug

    In line 58 of the TextClassificationTransformer.common_step() method (https://github.com/Lightning-AI/lightning-transformers/blob/master/lightning_transformers/task/nlp/text_classification/model.py#L58), logging is called with a dictionary of metric values computed for the current batch. I am new to this package, but I believe the logger has to be called with the torchmetrics.Metric subclass for cases where the computed value is different from the average of the per-batch values, such as for the Precision class; otherwise the aggregation gives wrong results. Similar code exists in TokenClassificationTransformer and ImageClassificationTransformer as well.
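
    A self-contained illustration of the difference (using the torchmetrics 0.x constructor style from model.py; numbers are approximate):

    import torch
    from torchmetrics import Precision

    metric = Precision(num_classes=2)

    batches = [
        (torch.tensor([0, 1]), torch.tensor([0, 0])),               # 1 of 2 correct
        (torch.tensor([1, 1, 1, 1]), torch.tensor([1, 1, 1, 0])),   # 3 of 4 correct
    ]

    # Calling the metric returns the per-batch value and also accumulates state.
    per_batch = [metric(preds, target) for preds, target in batches]

    print(torch.stack(per_batch).mean())  # naive average of per-batch values: 0.625
    print(metric.compute())               # metric over the accumulated state: ~0.667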

    To Reproduce

    Using TextClassificationTransformer with Precision and Recall metrics (as configured in the default) will result in inaccurate per-epoch values for validation and testing.

    Environment

    • PyTorch Version: 1.12.1
    • OS: Linux
    • How you installed PyTorch (conda, pip, source): pip
    • Build command you used (if compiling from source):
    • Python version: 3.7
    • CUDA/cuDNN version:
    help wanted question 
    opened by stefan-schroedl 1
Releases (latest: 0.2.5)
  • 0.2.5(Nov 21, 2022)

    [0.2.5] - 2022-11-21

    Fixed

    • Fixed loading HF model (#306)
    • Fixed passing config name to CNNDailyMailSummarizationDataModule (#310)
    • Fixed move pipeline to self.device as default (#309)

    New Contributors

    • @lantiga made their first contribution in https://github.com/Lightning-AI/lightning-transformers/pull/305

    Full Changelog: https://github.com/Lightning-AI/lightning-transformers/compare/0.2.4...0.2.5

  • 0.2.4(Nov 4, 2022)

    [0.2.4] - 2022-11-04

    Changed

    • Added support for Lightning v1.8.0 (#297)

    Contributors

    @rohitgr7

    Full Changelog: https://github.com/Lightning-AI/lightning-transformers/compare/0.2.3...0.2.4

  • 0.2.3(Oct 8, 2022)

  • 0.2.2(Oct 7, 2022)

    Update tests for latest PL release (#284) by @rohitgr7

    Full Changelog: https://github.com/Lightning-AI/lightning-transformers/compare/0.2.1...0.2.2

  • 0.2.1(Jun 28, 2022)

    ⚡ Lightning Transformers 0.2.1

    This is an incremental release with some documentation changes, DeepSpeed training support and a refactor to expose transformer model creation.

    What's Changed

    • Refractor the code for model creation by @espoirMur in https://github.com/Lightning-AI/lightning-transformers/pull/268
    • Simplify Big Model inference support/Add DeepSpeed Training by @SeanNaren in https://github.com/Lightning-AI/lightning-transformers/pull/269

    New Contributors

    • @espoirMur made their first contribution in https://github.com/Lightning-AI/lightning-transformers/pull/268

    Full Changelog: https://github.com/Lightning-AI/lightning-transformers/compare/0.2.0...0.2.1

  • 0.2.0(Jun 23, 2022)

    ⚡ Lightning Transformers 0.2.0 Release

    Below is a summary of the features/fixes we’ve added since the previous release!

    ViT Image Classification 🚀

    Thanks to @tanmoyio we now have ViT support within Lightning Transformers!

    Here is a simple example showing how you can fine-tune a ViT model using Lightning Transformers:

    import pytorch_lightning as pl
    from transformers import AutoFeatureExtractor
    
    from lightning_transformers.task.vision.image_classification import (
        ImageClassificationDataModule,
        ImageClassificationTransformer,
    )
    
    feature_extractor = AutoFeatureExtractor.from_pretrained(pretrained_model_name_or_path="nateraw/vit-base-beans")
    dm = ImageClassificationDataModule(
        batch_size=8, 
        dataset_name="beans", 
        num_workers=8,
        feature_extractor=feature_extractor,
    )
    model = ImageClassificationTransformer(
        pretrained_model_name_or_path="nateraw/vit-base-beans", num_labels=dm.num_classes
    )
    
    trainer = pl.Trainer(accelerator="auto", devices="auto", max_epochs=5)
    trainer.fit(model, dm)
    

    More information in our documentation.

    Save HuggingFace Hub Compatible Checkpoints 💾

    Many users have requested the ability to save HF Hub-compatible models. Look no further, we offer manual support + saving an additional HF compatible checkpoint automatically during training.

    from lightning_transformers.task.nlp.text_classification import TextClassificationTransformer
    
    model = TextClassificationTransformer(pretrained_model_name_or_path="prajjwal1/bert-tiny")
    
    # saves a HF checkpoint to this path.
    model.save_hf_checkpoint("checkpoint")
    

    To save via training, just pass the HFSaveCheckpoint plugin within your training code:

    import pytorch_lightning as pl
    from lightning_transformers.plugins.checkpoint import HFSaveCheckpoint
    ...
    
    model = TextClassificationTransformer(pretrained_model_name_or_path="prajjwal1/bert-tiny")
    trainer = pl.Trainer(plugins=HFSaveCheckpoint(model=model))
    trainer.fit(model, dm)
    

    Big Model Inference 🤖

    As transformer models get larger, they require more compute to run. In Lightning Transformers, we've utilized HF Accelerate to allow users to run billion-parameter model inference.

    This will in turn allow people who do not have the GPU memory or compute to run these models, by leveraging CPU memory & compute and disk space!

    import torch
    from accelerate import init_empty_weights
    from transformers import AutoTokenizer
    from lightning_transformers.task.nlp.language_modeling import LanguageModelingTransformer
    
    # initializes an empty model for us to load the checkpoint into.
    with init_empty_weights():
        model = LanguageModelingTransformer(
            pretrained_model_name_or_path="EleutherAI/gpt-j-6B",
            tokenizer=AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B"),
        )
    
    # automatically selects the best devices (cpu/gpu) to load model layers based on available memory
    # even offloads to disk if necessary.
    model.load_checkpoint_and_dispatch("sharded-gpt-j-6B", device_map="auto", no_split_module_classes=["GPTJBlock"])
    
    output = model.generate("Hello, my name is", device=torch.device("cuda"))
    print(model.tokenizer.decode(output[0].tolist()))
    

    SparseML Support 🔍

    We now have native support for SparseML! SparseML provides GPU-class performance on CPUs through sparsification, pruning, and quantization.

    To enable SparseML, all you do is pass the callback to the Lightning Trainer with paths to your recipe!

    import pytorch_lightning as pl
    
    from lightning_transformers.callbacks import TransformerSparseMLCallback
    
    pl.Trainer(
        callbacks=TransformerSparseMLCallback(
            output_dir="/content/MODELS",
            recipe_path="/content/recipe.yaml"
        )
    )
    

    See our medium blog post for more details.

    Align with PyTorch Lightning ⚡

    Within this release we've simplified the API, removing complicated internal boilerplate and configuration that should exist outside this library. Keeping this library minimal means easier extensibility and easier contributions for everyone 🔥

    Thanks to all the contributors that helped out!

  • 0.2.0rc1(May 31, 2022)

  • 0.1.0(Apr 21, 2021)

    The first release for Lightning Transformers!

    Lightning Transformers offers a flexible interface for training and fine-tuning SOTA Transformer models using the PyTorch Lightning Trainer.

    • Train using HuggingFace Transformers models and datasets with Lightning custom Callbacks, Loggers, Accelerators and high performance scaling.
    • Seamless Memory and Speed Optimizations such as DeepSpeed ZeRO or FairScale Sharded Training with no code changes.
    • Powerful config composition backed by Hydra - Easily swap out models, optimizers, schedulers and many more configurations without touching the code.
    • Transformer Task Abstraction for Rapid Research & Experimentation - Built from the ground up to be task agnostic, the library supports creating transformer tasks across all modalities with little friction.

    Lightning Transformers tasks allow you to train models using HuggingFace Transformer models and datasets, use Hydra to hotswap models, optimizers or schedulers, and leverage all the advanced features that Lightning has to offer, including custom Callbacks, Loggers, Accelerators and high-performance scaling, with minimal changes.

    In this release, we introduce these Transformer Tasks to train and predict:

    • Causal Language Modeling
    • Multiple Choice
    • Question Answering
    • Summarization
    • Text Classification
    • Token Classification
    • Translation

    Each task supports various datasets, see our documentation for more information!
