We are excited to share Composer v0.5, a library of speed-up methods for efficient neural network training. This release features:
- Revamped checkpointing API based on community feedback
- New baselines: ResNet34-SSD, GPT-3, and Vision Transformers
- Additional improvements to our documentation
- Support for `bfloat16`
- Streaming dataset support
- Unified functional API for our algorithms
## Highlights
### Checkpointing API
Model checkpointing is now implemented as a `Callback`, so users can easily write and add their own checkpointing callbacks. The callback is automatically appended if a `save_folder` is provided to the `Trainer`:
```python
trainer = Trainer(
    model=model,
    algorithms=algorithms,
    save_folder="checkpoints",
    save_interval="1ep"
)
```
Alternatively, `CheckpointSaver` can be added directly as a callback:
```python
trainer = Trainer(..., callbacks=[
    CheckpointSaver(
        save_folder='checkpoints',
        name_format="ep{epoch}-ba{batch}/rank_{rank}",
        save_latest_format="latest/rank_{rank}",
        save_interval="1ep",
        weights_only=False,
    )
])
```
Subclass `CheckpointSaver` to add your own logic, such as saving the best model or saving at specific intervals. Thanks to @mansheej, @siriuslee, and other users for their feedback.
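As a rough sketch of that kind of custom logic, the callback below saves the model whenever a user-supplied metric improves. It builds on the base `Callback` hooks rather than reusing `CheckpointSaver`'s internal save machinery, and the `metric_fn` argument and save path are placeholders you would supply:

```python
# Sketch only: save the model whenever a user-supplied metric improves.
# `metric_fn` and the save path are placeholders, not part of Composer's API.
import os

import torch
from composer.core import Callback


class SaveBestModel(Callback):
    def __init__(self, metric_fn, save_folder="checkpoints"):
        self.metric_fn = metric_fn      # callable: state -> float (higher is better)
        self.save_folder = save_folder
        self.best = float("-inf")

    def epoch_end(self, state, logger):
        metric = self.metric_fn(state)  # assumption: extract your eval metric here
        if metric > self.best:
            self.best = metric
            os.makedirs(self.save_folder, exist_ok=True)
            torch.save(state.model.state_dict(),
                       os.path.join(self.save_folder, "best_model.pt"))
```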
### bfloat16
We've added experimental support for `bfloat16`, which can be provided via the `precision` argument to the `Trainer`:
```python
trainer = Trainer(
    ...,
    precision="bfloat16"
)
```
### Streaming datasets
We've added support for fast streaming datasets. For NLP datasets such as C4, we use the Hugging Face `datasets` backend and add dataset-specific shuffling, tokenization, and grouping on the fly. To support data-parallel training, we added dedicated sharding logic for efficiency. See `C4Datasets` for more details.
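For intuition, here is a minimal sketch of that shuffle-and-tokenize-on-the-fly pattern using the Hugging Face `datasets` streaming API directly; it is not Composer's `C4Datasets` implementation, and the tokenizer, buffer size, and sequence length are illustrative choices:

```python
# Sketch only: streaming C4 with on-the-fly shuffling and tokenization via the
# Hugging Face `datasets` backend. The tokenizer, buffer size, and max_length
# below are arbitrary placeholders.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Stream samples instead of downloading the full corpus up front.
c4 = load_dataset("c4", "en", split="train", streaming=True)
c4 = c4.shuffle(buffer_size=10_000, seed=17)  # approximate shuffle within a buffer

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

# Tokenization happens lazily as samples are pulled from the stream.
c4 = c4.map(tokenize)
```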
Vision streaming datasets are supported via a patched version of the `webdatasets` package, with added support for sharding data across workers for fast augmentations. See `composer.datasets.webdataset` for more details.
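As a rough illustration of this pattern (using the upstream `webdataset` package directly rather than `composer.datasets.webdataset`; the shard URLs and sample keys are placeholders):

```python
# Illustrative only: iterating a sharded vision dataset with the upstream
# webdataset package. The shard URLs and the "jpg"/"cls" keys are placeholders
# that depend on how the shards were written.
import webdataset as wds
from torch.utils.data import DataLoader

urls = "https://example.com/shards/train-{000000..000099}.tar"

dataset = (
    wds.WebDataset(urls)
    .shuffle(1000)            # shuffle within an in-memory buffer
    .decode("pil")            # decode images to PIL
    .to_tuple("jpg", "cls")   # yield (image, label) pairs
)

# With multiple workers, shards (not individual samples) are divided across
# workers, which keeps reads sequential while parallelizing augmentations.
loader = DataLoader(dataset, batch_size=None, num_workers=4)
```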
### Baseline GPT-3, ResNet34-SSD, and Vision Transformer benchmarks
Configurations for GPT-3-like models ranging from 125M to 760M parameters are now released, and use DeepSpeed ZeRO Stage 0 for memory-efficient training.
We've also added the Single Shot Detection (SSD) model (Liu et al., 2016) with a ResNet34 backbone, based on the MLPerf reference implementation.
Our first Vision Transformer benchmark is the ViT-S/16 model from Touvron et al., 2021, and is based on the `vit-pytorch` package.
See below for the full details:
## What's Changed
- Export Transforms in `composer.algorithms` by @ajaysaini725 in https://github.com/mosaicml/composer/pull/603
- Make batchnorm default for UNet by @dskhudia in https://github.com/mosaicml/composer/pull/535
- Fix no_op_model algorithm by @dskhudia in https://github.com/mosaicml/composer/pull/614
- Pin pre-1.0 packages by @bandish-shah in https://github.com/mosaicml/composer/pull/595
- Updated dark mode composer logo, and graph by @nqn in https://github.com/mosaicml/composer/pull/617
- Jenkins + Docker Improvements by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/621
- update README links by @hanlint in https://github.com/mosaicml/composer/pull/628
- Remove all old timing calls by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/594
- Remove state shorthand by @mvpatel2000 in https://github.com/mosaicml/composer/pull/629
- add bfloat16 support by @nikhilsardana in https://github.com/mosaicml/composer/pull/433
- v0.4.0 Hotfix: Docker documentation updates by @bandish-shah in https://github.com/mosaicml/composer/pull/631
- Fix wrong icons in the method cards by @hanlint in https://github.com/mosaicml/composer/pull/636
- fix autocast for pytorch < 1.10 by @nikhilsardana in https://github.com/mosaicml/composer/pull/639
- Add tutorial notebooks to the README by @moinnadeem in https://github.com/mosaicml/composer/pull/630
- Converted Stateless Schedulers to Classes by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/632
- Jenkinsfile Fixes Part 2 by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/627
- Add C4 Streaming dataset by @abhi-mosaic in https://github.com/mosaicml/composer/pull/489
- CONTRIBUTING.md additions by @kobindra in https://github.com/mosaicml/composer/pull/648
- Hide showing `object` as a base class; fix skipping documentation of `forward`; fixed docutils dependency by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/643
- Matthew/functional docstrings update by @growlix in https://github.com/mosaicml/composer/pull/622
- docstrings improvements for core modules by @dskhudia in https://github.com/mosaicml/composer/pull/598
- ssd-resnet34 on COCO map 0.23 by @florescl in https://github.com/mosaicml/composer/pull/646
- Fix broken "best practices" link by @growlix in https://github.com/mosaicml/composer/pull/649
- Update progressive resizing to work for semantic segmentation by @coryMosaicML in https://github.com/mosaicml/composer/pull/604
- Let C4 Dataset overwrite `num_workers` if set incorrectly by @abhi-mosaic in https://github.com/mosaicml/composer/pull/655
- Lazy imports for `pycocotools` by @abhi-mosaic in https://github.com/mosaicml/composer/pull/656
- W&B excludes final eval metrics when plotted as a fxn of epoch or trainer/global_step by @growlix in https://github.com/mosaicml/composer/pull/633
- Update GPT3-yamls for default 8xA100-40GB by @abhi-mosaic in https://github.com/mosaicml/composer/pull/663
- Set WandB default to log rank zero only by @abhi-mosaic in https://github.com/mosaicml/composer/pull/461
- Update schedulers guide by @hanlint in https://github.com/mosaicml/composer/pull/661
- [XS] Fix a TQDM deserialization bug by @jbloxham in https://github.com/mosaicml/composer/pull/665
- Add defaults to the docstrings for algorithms by @hanlint in https://github.com/mosaicml/composer/pull/662
- Fix ZeRO config by @jbloxham in https://github.com/mosaicml/composer/pull/667
- [XS] fix formatting for colout by @hanlint in https://github.com/mosaicml/composer/pull/666
- Composer.core docstring touch-up by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/657
- Add Uniform bounding box sampling option for CutOut and CutMix by @coryMosaicML in https://github.com/mosaicml/composer/pull/634
- Update README.md by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/678
- Fix bug in trainer test by @hanlint in https://github.com/mosaicml/composer/pull/651
- InMemoryLogger has get_timeseries() method by @growlix in https://github.com/mosaicml/composer/pull/644
- Batchwise resolution for SWA by @growlix in https://github.com/mosaicml/composer/pull/654
- Fixed the conda build script so it runs on jenkins by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/676
- Yahp version update to 0.1.0 by @Averylamp in https://github.com/mosaicml/composer/pull/674
- Streaming vision datasets by @knighton in https://github.com/mosaicml/composer/pull/284
- Fix DeepSpeed checkpointing by @jbloxham in https://github.com/mosaicml/composer/pull/686
- Vit by @A-Jacobson in https://github.com/mosaicml/composer/pull/243
- [S] cleanup tldr; standardize `__all__` by @hanlint in https://github.com/mosaicml/composer/pull/688
- Unify algorithms part 2: mixup, cutmix, label smoothing by @dblalock in https://github.com/mosaicml/composer/pull/658
- `composer.optim` docstrings by @jbloxham in https://github.com/mosaicml/composer/pull/653
- Fix DatasetHparams, WebDatasetHparams docstring by @growlix in https://github.com/mosaicml/composer/pull/697
- Models docstrings by @A-Jacobson in https://github.com/mosaicml/composer/pull/469
- docstrings improvements for composer.datasets by @dskhudia in https://github.com/mosaicml/composer/pull/694
- Updated contributing.md and the style guide by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/670
- Ability to retry ADE20k crop transform by @Landanjs in https://github.com/mosaicml/composer/pull/702
- Add mmsegmentation DeepLabv3(+) by @Landanjs in https://github.com/mosaicml/composer/pull/684
- Unify functional API part 3 by @dblalock in https://github.com/mosaicml/composer/pull/715
- Update example notebooks by @coryMosaicML in https://github.com/mosaicml/composer/pull/707
- [Checkpointing - PR1] Store the `rank_zero_seed` on state by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/680
- [Checkpointing - PR2] Added in new Checkpointing Events by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/690
- [Checkpointing - PR3] Clean up RNG and State serialization by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/692
- [Checkpointing - PR4] Refactored the `CheckpointLoader` into a `load_checkpoint` function by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/693
- Update {blurpool,factorize,ghostbn} method cards by @dblalock in https://github.com/mosaicml/composer/pull/711
- [Checkpointing - PR 5] Move the `CheckpointSaver` to a callback by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/687
- Update datasets docstrings by @growlix in https://github.com/mosaicml/composer/pull/709
- add notebooks and functional api by @hanlint in https://github.com/mosaicml/composer/pull/714
- Migrating from PTL notebook by @florescl in https://github.com/mosaicml/composer/pull/436
- Docs 0.4.1: Profiler section and tutorials by @bandish-shah in https://github.com/mosaicml/composer/pull/696
- Improve datasets docstrings by @knighton in https://github.com/mosaicml/composer/pull/695
- Update `C4Dataset` to repeat, handle `max_samples` safely by @abhi-mosaic in https://github.com/mosaicml/composer/pull/722
- Fix docs build by @ravi-mosaicml in https://github.com/mosaicml/composer/pull/773
- v0.5 Release by @hanlint in https://github.com/mosaicml/composer/pull/732
## New Contributors
- @nikhilsardana made their first contribution in https://github.com/mosaicml/composer/pull/433
- @knighton made their first contribution in https://github.com/mosaicml/composer/pull/284
Full Changelog: https://github.com/mosaicml/composer/compare/v0.4.0...v0.5.0