SparseML is a library for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models.

Overview

SparseML is a toolkit that includes APIs, CLIs, scripts, and libraries that apply state-of-the-art sparsification algorithms, such as pruning and quantization, to any neural network. General, recipe-driven approaches built around these algorithms simplify the creation of faster and smaller models for the ML performance community at large.

The GitHub repository contains integrations within the PyTorch, Keras, and TensorFlow V1 ecosystems, allowing for seamless model sparsification.

[SparseML Flow diagram]

Highlights

Integrations

Creating Sparse Models

Transfer Learning from Sparse Models

Tutorials

Coming soon!

Installation

This repository is tested on Python 3.6+ and Linux/Debian systems. It is recommended to install in a virtual environment to keep your system in order. The currently supported ML frameworks are: torch>=1.1.0,<=1.8.0; tensorflow>=1.8.0,<=2.0.0; tensorflow.keras>=2.2.0.

Install with pip using:

pip install sparseml

More information on installation such as optional dependencies and requirements can be found here.

Quick Tour

To enable flexibility, ease of use, and repeatability, sparsifying a model is done using a recipe. Recipes encode the instructions needed for modifying the model and/or training process as a list of modifiers. Example modifiers can be anything from setting the learning rate for the optimizer to gradual magnitude pruning. Recipes are written in YAML and stored either as YAML files or as markdown files using YAML front matter. The rest of the SparseML system parses the recipes into a native format for the desired framework and applies the modifications to the model and training pipeline.
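
For illustration, a minimal recipe might look like the following. This is a sketch with hypothetical values, using the EpochRangeModifier and GMPruningModifier along with the __ALL_PRUNABLE__ token referenced in the release notes below; consult the docs for the full set of modifiers and fields:

modifiers:
    - !EpochRangeModifier
        start_epoch: 0.0
        end_epoch: 10.0

    - !GMPruningModifier
        params: __ALL_PRUNABLE__
        init_sparsity: 0.05
        final_sparsity: 0.85
        start_epoch: 1.0
        end_epoch: 8.0
        update_frequency: 0.5

Read top to bottom, this sketch trains for ten epochs while gradual magnitude pruning ramps all prunable parameters from 5% to 85% sparsity between epochs 1 and 8.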

ScheduledModifierManager classes can be created from recipes in all supported ML frameworks. The manager classes handle overriding the training graphs to apply the modifiers as described in the desired recipe. Managers can apply recipes in one-shot or training-aware ways. One-shot is invoked by calling .apply(...) on the manager, while training-aware requires calls to initialize(...) (optional), modify(...), and finalize(...).

For the frameworks, this means only a few lines of code need to be added to begin supporting pruning, quantization, and other modifications to most training pipelines. For example, the following applies a recipe in a training-aware manner:

from sparseml.pytorch.optim import ScheduledModifierManager

model = Model()  # your model definition
optimizer = Optimizer()  # your optimizer definition
train_data = TrainData()  # your train data definition
batch_size = BATCH_SIZE  # training batch size
steps_per_epoch = len(train_data) // batch_size

# parse the recipe and wrap the optimizer so the recipe's modifiers
# are applied automatically on each optimizer step
manager = ScheduledModifierManager.from_yaml(PATH_TO_RECIPE)
optimizer = manager.modify(model, optimizer, steps_per_epoch)

# your normal PyTorch training code, using the wrapped optimizer

manager.finalize(model)  # clean up modifier hooks once training completes

Instead of training-aware, the following example code shows how to execute a recipe in a one-shot manner:

from sparseml.pytorch.optim import ScheduledModifierManager

model = Model()  # your model definition

manager = ScheduledModifierManager.from_yaml(PATH_TO_RECIPE)
manager.apply(model)  # apply the recipe's modifiers to the model in one shot
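
Once sparsified, models are commonly exported to ONNX for inference. Below is a minimal sketch using the model from the snippet above and assuming the ModuleExporter utility from sparseml.pytorch.utils (referenced elsewhere in this repository); consult the docs for the exact arguments:

import torch
from sparseml.pytorch.utils import ModuleExporter

# write model.onnx (and related artifacts) into the exported/ directory
exporter = ModuleExporter(model, output_dir="exported")
exporter.export_onnx(sample_batch=torch.randn(1, 3, 224, 224))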

More information on the codebase and contained processes can be found in the SparseML docs.

Resources

Learning More

Release History

Official builds are hosted on PyPI.

Additionally, more information can be found via GitHub Releases.

License

The project is licensed under the Apache License Version 2.0.

Community

Contribute

We appreciate contributions to the code, examples, integrations, and documentation as well as bug reports and feature requests! Learn how here.

Join

For user help or questions about SparseML, sign up or log in to the Deep Sparse Community Discourse Forum and/or Slack. We are growing the community member by member and are happy to see you there.

You can get the latest news, webinar and event invites, research papers, and other ML Performance tidbits by subscribing to the Neural Magic community.

For more general questions about Neural Magic, please fill out this form.

Cite

Find this project useful in your research or other communications? Please consider citing:

@InProceedings{
    pmlr-v119-kurtz20a, 
    title = {Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks}, 
    author = {Kurtz, Mark and Kopinsky, Justin and Gelashvili, Rati and Matveev, Alexander and Carr, John and Goin, Michael and Leiserson, William and Moore, Sage and Nell, Bill and Shavit, Nir and Alistarh, Dan}, 
    booktitle = {Proceedings of the 37th International Conference on Machine Learning}, 
    pages = {5533--5543}, 
    year = {2020}, 
    editor = {Hal Daumé III and Aarti Singh}, 
    volume = {119}, 
    series = {Proceedings of Machine Learning Research}, 
    address = {Virtual}, 
    month = {13--18 Jul}, 
    publisher = {PMLR}, 
    pdf = {http://proceedings.mlr.press/v119/kurtz20a/kurtz20a.pdf},
    url = {http://proceedings.mlr.press/v119/kurtz20a.html}, 
    abstract = {Optimizing convolutional neural networks for fast inference has recently become an extremely active area of research. One of the go-to solutions in this context is weight pruning, which aims to reduce computational and memory footprint by removing large subsets of the connections in a neural network. Surprisingly, much less attention has been given to exploiting sparsity in the activation maps, which tend to be naturally sparse in many settings thanks to the structure of rectified linear (ReLU) activation functions. In this paper, we present an in-depth analysis of methods for maximizing the sparsity of the activations in a trained neural network, and show that, when coupled with an efficient sparse-input convolution algorithm, we can leverage this sparsity for significant performance gains. To induce highly sparse activation maps without accuracy loss, we introduce a new regularization technique, coupled with a new threshold-based sparsification method based on a parameterized activation function called Forced-Activation-Threshold Rectified Linear Unit (FATReLU). We examine the impact of our methods on popular image classification models, showing that most architectures can adapt to significantly sparser activation maps without any accuracy loss. Our second contribution is showing that these compression gains can be translated into inference speedups: we provide a new algorithm to enable fast convolution operations over networks with sparse activations, and show that it can enable significant speedups for end-to-end inference on a range of popular models on the large-scale ImageNet image classification task on modern Intel CPUs, with little or no retraining cost.} 
}
@misc{
    singh2020woodfisher,
    title={WoodFisher: Efficient Second-Order Approximation for Neural Network Compression}, 
    author={Sidak Pal Singh and Dan Alistarh},
    year={2020},
    eprint={2004.14340},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
Comments
  • [question] Why is my yolov5s model 80MB after pruning?

    I followed this tutorial: https://github.com/neuralmagic/sparseml/blob/main/integrations/ultralytics-yolov5/tutorials/sparsifying_yolov5_using_recipes.md and got a model larger than 80MB in size.

    bug 
    opened by mx2013713828 12
  • LayerThinningModifier

    Reference:

    Hey @vjsrinivas, we do have this enabled for models like ResNet-50 to automatically thin the network. Specifically, we have a LayerThinningModifier that can be used.

    I've attached an example implementation of that for ResNet-50. Let us know if you need any support or run into any issues (the dependency graph generation can be tricky). Generally, we've seen 40% filter pruning being at the upper limit for ResNet-50 before it starts degrading in accuracy.

    resnet50-structured.yaml.zip

    Originally posted by @markurtz in https://github.com/neuralmagic/sparseml/issues/489#issuecomment-1071199309


    Since the other issue is more about TensorRT, I don't want to pollute it with structured-pruning-related discussion. I ran the YAML for ResNet50 with LayerThinningModifier, but I get the following error:

    Traceback (most recent call last):
      File "classfication_channel.py", line 340, in <module>
        prune_train(model, input_size)
      File "classfication_channel.py", line 103, in prune_train
        model, train_loader, criterion, device, train=True, optimizer=optimizer
      File "classfication_channel.py", line 42, in run_model_one_epoch
        loss.backward()
      File "/home/vijay/anaconda3/envs/radar/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/vijay/anaconda3/envs/radar/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
        allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    RuntimeError: Function CudnnConvolutionBackward returned an invalid gradient at index 1 - got [2048, 332, 1, 1] but expected shape compatible with [2048, 512, 1, 1]
    

    It seems like the zeroed weights are removed but the layer objects themselves are still expecting the old weight and bias shapes. I changed the YAML file to run LayerThinningModifier at the end of training, and it successfully trained and removed the weights. I had to recreate the network when loading the pruned weights though.

    opened by vjsrinivas 10
  • ModuleNotFoundError: No module named 'sparseml.pytorch.utils.quantization'

    Describe the bug: When running train.py for the yolov5 integration in sparseml, I get an error that there is no module named sparseml (I reported it in another issue but have seen no answer yet), so I manually copied sparseml from src to the yolo module, but now I get another error:

    sparseml/integrations/ultralytics-yolov5/yolov5/models/export.py", line 21,

    from sparseml.pytorch.utils.quantization import skip_onnx_input_quantize
    ModuleNotFoundError: No module named 'sparseml.pytorch.utils.quantization'

    Expected behavior: I expect the import to work! But when I check the address that export.py tries to import from, I find that it is right: there is no quantization module in sparseml.pytorch.utils.

    Environment:

    1. OS : Ubuntu 18.04
    2. Python version :3.7

    To Reproduce: git clone sparseml, pip install sparseml, go to the yolov5 integration and run the setup bash script, copy sparseml from src to the yolo root (the tutorial doesn't say that, but if we don't copy it, the sparseml module can't be recognized; I reported this issue too but received no answer), go to the yolo root, then run:

    python3.7 train.py --data voc.yaml --cfg ../models/yolov5l.yaml --weights zoo:cv/detection/yolov5-l/pytorch/ultralytics/coco/pruned_quant-aggressive_95?recipe_type=transfer --hyp data/hyp.finetune.yaml --recipe ../recipes/yolov5.transfer_learn_pruned_quantized.md

    Errors:

    Traceback (most recent call last):
      File "train.py", line 23, in <module>
        import test  # import test.py to get mAP after each epoch
      File "/home/fteam/Desktop/naser/R.Zamani/sparseml/integrations/ultralytics-yolov5/yolov5/test.py", line 12, in <module>
        from models.export import load_checkpoint
      File "/home/fteam/Desktop/naser/R.Zamani/sparseml/integrations/ultralytics-yolov5/yolov5/models/export.py", line 21, in <module>
        from sparseml.pytorch.utils.quantization import skip_onnx_input_quantize
    ModuleNotFoundError: No module named 'sparseml.pytorch.utils.quantization'


    bug 
    opened by RasoulZamani 10
  • "Error(s) in loading state_dict for Model:" when I try to export a model edited with a recipe

    Hi, I have trained a yolov5s model for several epochs, and I followed the tutorial to apply a recipe to it. At the export step, I have this error:

    RuntimeError: Error(s) in loading state_dict for Model: Missing key(s) in state_dict: "model.0.conv.conv.quant.activation_post_process.scale", "model.0.conv.conv.quant.activation_post_process.zero_point", ...

    Unexpected key(s) in state_dict: "model.0.conv.conv.weight", "model.0.conv.conv.bias", "model.1.conv.weight", "model.1.conv.bias", ...
    

    Can someone help me?

    opened by antimo22 10
  • Structured pruning for YOLOv5

    I have been using sparseml for pruning YOLOv5 recently, and I can see a big improvement in inference time; however, the model size stays the same. I have realised that this happens due to unstructured pruning, which only fills in zeros rather than removing the weights.
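
    To illustrate the point above, a standalone sketch (not SparseML code): unstructured pruning zeroes entries of a dense tensor, so the stored size is unchanged.

    import torch

    w = torch.randn(64, 64)
    pruned = w.clone()
    pruned[pruned.abs() < 0.5] = 0.0  # unstructured pruning: zero out small weights
    # the zeros are still stored densely, so the byte size is identical
    assert pruned.numel() * pruned.element_size() == w.numel() * w.element_size()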

    I was wondering whether it is possible to implement structured pruning for YOLOv5

    enhancement 
    opened by hawrot 9
  • Runtime Error, no match for existing parameter

    While trying to train with a pruning and quantization recipe on my yolov5s model, the following error is raised:

    in validate_all_params_found name_or_regex, found_param_names, name_or_regex_patterns RuntimeError: All supplied parameter names or regex patterns not found. No match for model.9.m.0.cv2.conv.weight in found parameters ['model.7.conv.weight', 'model.8.cv2.conv.weight', 'model.13.m.0.cv2.conv.weight', 'model.17.cv2.conv.weight', 'model.18.conv.weight', 'model.20.cv2.conv.weight', 'model.20.cv3.conv.weight', 'model.20.m.0.cv2.conv.weight', 'model.21.conv.weight', 'model.23.cv2.conv.weight', 'model.23.cv3.conv.weight', 'model.23.m.0.cv2.conv.weight']. Supplied ['model.23.m.0.cv2.conv.weight', 'model.21.conv.weight', 'model.23.cv3.conv.weight', 'model.23.cv2.conv.weight', 'model.20.m.0.cv2.conv.weight', 'model.18.conv.weight', 'model.9.m.0.cv2.conv.weight', 'model.7.conv.weight', 'model.20.cv3.conv.weight', 'model.20.cv2.conv.weight', 'model.8.cv2.conv.weight', 'model.13.m.0.cv2.conv.weight', 'model.17.cv2.conv.weight']

    I've highlighted the strange fact: the parameter the function could not find belongs to the list of available ones. The problem occurs on both CPU and GPU.

    Command sent: python3 train.py --device cpu --img 640 --batch 16 --epochs 1 --data '../data.yaml' --cfg ./models/yolov5${VERSION}.yaml --weights $PROJECT/yolov5${VERSION}${MONTH}${DAY}/weights/best.pt --project $PROJECT --name yolov5${VERSION}${MONTH}${DAY}_pruned_quantized --hyp data/hyps/hyp.scratch-med.yaml --recipe ../sparseml/integrations/ultralytics-yolov5/recipes/yolov5${VERSION}.pruned_quantized.md --cache

    The work is running on a cluster provided by the university, which is a Linux-based modular environment. To make YOLO work, we loaded the gcc/9.3 compiler, python/3.7.10, and CUDA toolkit 11.3. The sparseml version is 0.11.1; all necessary requirements are installed.

    bug 
    opened by MaxFeli 8
  • How to properly load manager state dict with modifiers?

    I am running a training of a model with a custom recipe involving several learning_rate modifiers and MFACPruningModifier. I checkpoint the model using the save_model function from sparseml.pytorch.utils:

    save_model(
        path="path-to-save",  # checkpoint destination
        model=model,
        optimizer=optimizer,
        epoch=epoch,
        use_zipfile_serialization_if_available=True,
        include_modifiers=True,
    )
    

    But when attempting to restart from that checkpoint, having previously defined manager and modified it in the training script:

    manager = ScheduledModifierManager.from_yaml(args.sparseml_recipe)
    # wrap optimizer    
    optimizer = manager.modify(model, optimizer, steps_per_epoch=len(loader_train), epoch=start_epoch)
    # load state
    manager.load_state_dict(state_dict['manager'])
    

    I encounter the following error due to incompatibility of keys:

    IndexError: Found extra keys: {'MFACPruningModifier-bc0da17c953429b213f11f560b1f58e9'} and missing keys: {'MFACPruningModifier-0eeb649660da3fc081520285feb4aa0b'}

    There is a suffix, a hash generated in some way. What should I do to make the hashes in the created manager and the checkpoint equal?

    Version sparseml-nightly 0.12.0.20220405

    bug 
    opened by Godofnothing 8
  • Error: No module named 'sparseml'

    Hi, I'm using the sparseml yolov5 integration. As you said in the tutorial for transfer learning, I run this command:

    python train.py --data voc.yaml --cfg ../models/yolov5s.yaml --weights zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned_quant-aggressive_94?recipe_type=transfer --hyp data/hyp.finetune.yaml --recipe ../recipes/yolov5.transfer_learn_pruned_quantized.md

    (I run this command in the train.py folder, i.e., sparseml/integrations/ultralytics-yolov5/yolov5)

    but I get this error:

    .../models/export.py", line 20, in <module>
        from sparseml.pytorch.utils import ModuleExporter
    ModuleNotFoundError: No module named 'sparseml'

    It seems that it can't import from the sparseml module. Is this a bug, or am I doing something wrong? Thanks!

    opened by RasoulZamani 8
  • Split vision.py into smaller scripts

    Refactor: vision.py into 4 scripts

    • train.py for training specific tasks
    • lr_analysis.py for LR sensitivity analysis
    • pr_analysis.py for kernel sparsity (pruning) analysis
    • export.py for exporting a model to ONNX and storing inputs/outputs

    Added:

    • Quality fixes for function arguments and * imports
    • Scripts tested locally with example commands from the old vision.py
    • vision.py modified to point users to the new scripts
    • Updated README
    • Updated example commands in recipes
    • NmArgumentParser class for parsing dataclasses into arguments
    • Tests to show example usage of the parser
    • Support for both hyphenated and non-hyphenated arguments

    The scripts were tested locally using the following commands on the imagenette dataset:

    python integrations/pytorch/train.py \
        --recipe-path integrations/pytorch/recipes/resnet50-pruned.md \
        --arch-key resnet50 --dataset imagenette --dataset-path ~/datasets/ \
        --train-batch-size 256 --test-batch-size 1024

    python integrations/pytorch/export.py \
        --arch-key resnet50 --dataset imagenette --dataset-path ~/datasets/

    python integrations/pytorch/pr_sensitivity.py \
        --approximate --arch-key resnet50 --dataset imagenette \
        --dataset-path ~/datasets/

    python integrations/pytorch/lr_analysis.py \
        --arch-key mobilenet --dataset imagenette \
        --dataset-path ~/datasets/ --batch-size 2
    
    opened by rahul-tuli 8
  • Let modifiers implement epoch adjustment when composing recipes

    This change simplifies the manager's recipe combination and fixes a bug where the current code at the manager level did not account for all epoch-related attributes from the quantization modifiers.

    opened by natuan 7
  • Jetson boards support

    Hello. Thank you for your valuable solutions. I am interested to know whether the optimized yolov5 models can be deployed on Jetson boards. Are they compatible with the Jetson CPU architecture?

    enhancement 
    opened by alikaz3mi 7
  • Implement multiple-choice pipeline

    This PR implements a multiple-choice pipeline in HF's transformers library with SparseML integration. It enables the commonsense-reasoning experiments proposed in the "Sparsity May Cry" benchmark.

    Supported datasets/tasks

    1. SWAG: https://arxiv.org/abs/1808.05326
    2. Commonsense_QA: https://arxiv.org/abs/1811.00937
    3. RACE: https://arxiv.org/abs/1704.04683
    4. Winogrande: https://arxiv.org/abs/1907.10641

    Data preprocessing pipeline for each dataset follows the proposed format from the corresponding paper.

    Verification: The implementation is tested by reproducing results from the RoBERTa paper (https://arxiv.org/abs/1907.11692) and the fairseq repository (https://github.com/facebookresearch/fairseq/tree/main/examples/roberta).

    opened by eldarkurtic 0
  • Log overall sparsity of the model

    Issues like https://github.com/neuralmagic/sparseml/issues/1282 are very hard to detect when global_sparsity: True, as the final model will have the desired target sparsity but the sparsity scheduler might have followed the wrong interpolation function. This PR adds logging of the overall sparsity of the model, which makes this issue easily detectable.
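
    For reference, overall sparsity can be computed in a few lines of PyTorch (an illustrative sketch, not the PR's actual logging code):

    import torch

    def overall_sparsity(model: torch.nn.Module) -> float:
        # fraction of zero-valued entries across all parameters
        total = sum(p.numel() for p in model.parameters())
        zeros = sum((p == 0).sum().item() for p in model.parameters())
        return zeros / max(total, 1)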

    opened by eldarkurtic 1
  • Some fixes for torchvision training flow

    The fixes here are needed to run the training flow without an explicit checkpoint, or with a checkpoint that has no recipe or optimizer.

    Qualification: tested (a) training command from torchvision readme, (b) training (limited) & validating efficientnet_v2-s from zoomodels.

    opened by natuan 0
  • Saving all hooks during quantization block fusing

    Currently, only the following hooks are transferred during the quantization fusing process:

    1. the pre forward hooks for the first layer
    2. the post forward hooks for the last layer

    Notably, the following hooks are lost:

    1. The post forward hooks in the first layer
    2. The pre forward hooks in the last layer
    3. Any hooks from any layer between first and last layer

    This PR changes it so that all pre/post hooks from every layer in the fused block are transferred to the new fused module.
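
    In PyTorch terms, the intended behavior is roughly the following (an illustrative sketch, not the PR's actual code):

    import torch.nn as nn

    def transfer_all_hooks(layers, fused: nn.Module) -> None:
        # re-register every pre/post forward hook from each original
        # layer in the block onto the new fused module
        for layer in layers:
            for hook in layer._forward_pre_hooks.values():
                fused.register_forward_pre_hook(hook)
            for hook in layer._forward_hooks.values():
                fused.register_forward_hook(hook)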

    Testing plan

    Existing unit tests; this was also tested as part of the #1272 PR.

    opened by corey-nm 0
  • Attempting to fix onnx tests

    Background

    This is attempting to fix a bug in Python 3.9 where the test_one_shot_ks_perf_sensitivity test is failing.

    Two notes:

    1. I could not reproduce this failure locally.
    2. These tests pass in the Python 3.7/3.8/3.10 Jenkins pipelines.

    So the only failure is in the Jenkins pipeline on Python 3.9.

    My current hypothesis is that the results returned by model analysis can have multiple values where x["index"] is the same.

    This could make the sorted call order the actual results inconsistently.

    Another piece of information supporting this hypothesis is the error messages of the 2 failures:

    failure 1:

    expected_layers = [{'averages': OrderedDict([(0.0, 3.500000000000001e-05), (0.4, 3.2000000000000005e-05), (0.8, 3.300000000000001e-05), ...'baseline_average': 7.089999999999999e-05, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
    actual_layers = [{'averages': OrderedDict([(0.0, 4.599999999999999e-05), (0.4, 4.499999999999999e-05), (0.8, 4.699999999999999e-05), (...0152823)]), 'baseline_average': 0.0137507, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
    is_perf = True
    

    failure 2:

    expected_layers = [{'averages': OrderedDict([(0.0, 4.9999999999999996e-05), (0.4, 3.0000000000000004e-05), (0.8, 4.599999999999999e-05),...9999e-05)]), 'baseline_average': 4.07e-05, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
    actual_layers = [{'averages': OrderedDict([(0.0, 3.4000000000000007e-05), (0.4, 3.2000000000000005e-05), (0.8, 3.500000000000001e-05),... 'baseline_average': 0.015620499999999997, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
    is_perf = True
    

    If you swap the actual_layers values from both of these failures, then it seems like both would pass.

    Fix

    If my hypothesis is correct, then adding a tie-breaker to the sorting should fix this bug. I changed the sorting key to sort based on:

    1. id
    2. if the ids are equal, then index
    3. if id and index are equal, then name

    This will break the ties, so if this doesn't solve the issue then I don't know what the problem is.
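
    The fix amounts to a composite sort key along these lines (a sketch; the field names follow the hypothesis above, not necessarily the repo's exact code):

    # sample analysis entries with duplicate indices, as hypothesized above
    results = [
        {"id": "conv1", "index": 0, "name": "a"},
        {"id": "conv1", "index": 0, "name": "b"},
        {"id": "conv2", "index": 1, "name": "c"},
    ]
    # sort with tie-breakers: id first, then index, then name
    ordered = sorted(results, key=lambda x: (x["id"], x["index"], x["name"]))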

    opened by corey-nm 1
Releases (v1.3.0)
  • v1.3.0(Dec 21, 2022)

    New Features:

    • NLP multi-label training and eval support added.
    • SQuAD v2.0 support provided.
    • Recipe template APIs introduced, enabling easier creation of recipes for custom models with standard sparsification pathways.
    • EfficientNetV2 model architectures implemented.
    • Sample inputs and outputs exportable for YOLOv5, transformers, and image classification integrations.

    Changes:

    • PyTorch 1.12 and Python 3.10 now supported.
    • YOLOv5 pipelines upgraded to the latest version from Ultralytics.
    • Transformers pipelines upgraded to latest version from Hugging Face.
    • PyTorch image classification pathway upgraded using torchvision standards.
    • Recipe arguments now support list types.

    Resolved Issues:

    • Improper URLs for ONNX export documentation fixed to point to the proper documentation links.
    • The latest transformers version hosted by Neural Magic now installs automatically; previously it would pin to older versions and not receive updates.

    Known Issues:

    • None
  • v1.2.0(Oct 28, 2022)

    New Features:

    • Document classification training and export pipelines added for transformers integration.

    Changes:

    • Refactor of transformers training and export integration code now enables more code reuse across use cases.
    • List of supported quantized nodes expanded to enable more complex quantization patterns for ResNet-50 and MobileBERT enabling better performance for similar models.
    • Transformers integration expanded to enable saving and reloading of optimizer state from trained checkpoints.
    • Deployment folder added for image classification integration which will export to deployment.
    • Gradient accumulation support added for image classification integration.
    • Minimum Python version changed to 3.7, as 3.6 has reached EOL.

    Resolved Issues:

    • Quantized checkpoints for image classification models now instantiate correctly, no longer leading to random initialization of weights rather than restoring them.
    • TrainableParamsModifier for PyTorch now enables and disables params properly so weights are frozen while training.
    • Quantized embeddings no longer cause crashes while training with distributed data parallel.
    • Improper EfficientNet definitions fixed that would lead to accuracy issues due to convolutional strides being duplicated.
    • Protobuf version for ONNX 1.12 compatibility pinned to prevent install failures on some systems.

    Known Issues:

    • None
  • v1.1.1(Sep 14, 2022)

  • v1.1.0(Aug 25, 2022)

    New Features:

    • YOLACT Segmentation native training integration made for SparseML.
    • OBSPruning modifier added (https://arxiv.org/abs/2203.07259).
    • QAT now supported for MobileBERT.
    • Custom module support provided for QAT to enable quantization of layers such as GELU.

    Changes:

    • Updates made across the repository for new SparseZoo Python APIs.
    • Non-string keys are now supported in recipes for layer and module names.
    • Native support added for DDP training with pruning in PyTorch pathways.
    • YOLOV5p6 models default to their native activations instead of overwriting to Hardswish.
    • Transformers eval pathways changed to turn off AMP (FP16) to give more stable results.
    • TensorBoard logger added to transformers integration.
    • Python setuptools set as required at 59.5 to avoid installation issues with other packages.
    • DDP now works for quantized training of embedding layers where tensors were being placed on incorrect devices and causing training crashes.

    Resolved Issues:

    • ConstantPruningModifier propagated None in place of the start_epoch value when start_epoch > 0. It now propagates the proper value.
    • Quantization of BERT models was dropping accuracy improperly by quantizing the identity branches.
    • SparseZoo stubs were not loading model weights for image classification pathways when using DDP training.

    Known Issues:

    • None
  • v1.0.1(Jul 13, 2022)

  • v1.0.0(Jul 1, 2022)

    New Features:

    • One-shot and recipe arguments support added for transformers, yolov5, and torchvision.
    • Dockerfiles and new build processes created for Docker.
    • CLI formats and inclusion standardized on install of SparseML for transformers, yolov5, and torchvision.
    • N:M pruning mask creator deployed for use in PyTorch pruning modifiers.
    • Masked_language_modeling training CLI added for transformers.
    • Documentation additions made across all standard integrations and pathways.
    • GitHub action tests running for end-to-end testing of integrations.

    Changes:

    • Click as a root dependency added as the new preferred route for CLI invocation and arg management.
    • Provider parameter added for ONNXRuntime InferenceSessions.
    • Moved onnxruntime to an optional install extra; onnxruntime is no longer a root dependency and will only be imported when using specific pathways.
    • QAT export pipelines improved with better support for QATMatMul and custom operators.

    Resolved Issues:

    • Incorrect commands and models updated for older docs for transformers, yolov5, and torchvision.
    • YOLOv5 issues addressed with data files, configs, and datasets not being easily accessible with the new install pathway. They are now included in the sparseml src folder for yolov5.
    • An extra batch no longer runs for the PyTorch ModuleRunner.
    • None sparsity parameter was being improperly propagated for sparsity in the PyTorch ConstantPruningModifier.
    • PyPI dependency conflicts no longer occur with the latest ONNX and Protobuf upgrades.
    • When GPUs were not available, yolov5 pathways were not working.
    • Transformers export was not working properly when neither --do_train nor --do_eval arguments were passed in.
    • Non-string keys now allowed within recipes.
    • Numerous fixes applied for pruning modifiers including improper masks casting, improper initialization, and improper arguments passed through for MFAC.
    • YOLOv5 export formatting error addressed.
    • Missing or incorrect data corrected for logging and recording statements.
    • PyTorch DistillationModifier for transformers was ignoring both "self" distillation and "disable" distillation values; instead, normal distillation would be used.
    • FP16 not deactivating on QAT start for torchvision.

    Known Issues:

    • PyTorch > 1.9 quantized ONNX export is broken; waiting on PyTorch resolution and testing.
  • v0.12.2(Jun 2, 2022)

  • v0.12.1(May 5, 2022)

    This is a patch release for 0.12.0 that contains the following changes:

    • Disabling distillation modifiers via --distillation_teacher disable no longer crashes Hugging Face Transformers integrations.
    • Numeric stability is provided for distillation modifiers by using log_softmax instead of softmax.
    • Accuracy and performance issues were addressed for quantized graphs in image classification and NLP.
    • When using mixed precision for a quantized recipe with image classification, crashes no longer occur.
  • v0.12.0(Apr 22, 2022)

    New Features:

    • SparseML recipe stages support: recipes can be chained together to enable easier prototyping with compound sparsification.
    • SparseML image classification CLIs implemented to enable easy commands for training models like ResNet-50: sparseml.image_classification.train --help
    • FFCV support provided for PyTorch image classification pipelines.
    • Masked language modeling CLI added for Hugging Face transformers integration: sparseml.transformers.masked_language_modeling --help
    • DistilBERT support provided for Hugging Face transformers integration.

    Changes:

    • Modifiers logging upgraded to standardize logging across SparseML and integrations with hyperparameter stores like Weights and Biases.
    • Hugging Face Transformers integration updated to the latest state from the main upstream branch.
    • Ultralytics YOLOv5 Integration updated to the latest state from the main upstream branch.
    • Quantization-aware training graphs updated to enable better recovery and to provide optional support for deployment environments like TensorRT.

    Resolved Issues:

    • Multiple minor issues addressed in the MFAC pruning modifier that were preventing proper functioning in recipes and leading to exceptions.
    • Distillation loss for transformers integration was not calculated correctly when inputs were multidimensional.
    • Minor fixes made across modifiers and transformers integration.

    Known Issues:

    • None
  • v0.11.1(Mar 23, 2022)

    This is a patch release for 0.11.0 that contains the following changes:

    • Addressed removal of the phased, score_type, and global_sparsity flags for the PyTorch GMPruningModifier; rather than crashing, exceptions with deprecation notices are thrown only if those flags are turned on for instances of those modifiers.
    • Crashes no longer occur when using sparseml.transformers training pipelines, and distillation modifiers now work without the FP16 training flag turned on.
  • v0.11.0(Mar 11, 2022)

    New Features:

    • Hugging Face NLP masked language modeling CLI and support implemented for training and export.
    • PyTorch Image classification CLIs deployed.
    • WoodFisher/M-FAC pruning algorithm, AC/DC pruning algorithm, and structured pruning algorithm support added with modifiers for PyTorch.
    • Reduced precision support provided for quantization in PyTorch (< INT8).

    Changes:

    • Refactored pruning and quantization algorithms from the sparseml.pytorch.optim package to the sparseml.pytorch.sparsification package.

    Resolved Issues:

    • None

    Known Issues:

    • None
  • v0.10.1(Feb 10, 2022)

  • v0.10.0(Feb 3, 2022)

    New Features:

    • Hugging Face Transformers native integration and CLIs implemented for installation to train transformer models.
    • Cyclic LR support added to LearningRateFunctionModifier in PyTorch.
    • ViT (vision transformer) examples added with the rwightman/timm integration.

    Changes:

    • Quantization implementation for BERT models improved (shorter schedules and better recovery).
    • PyTorch image classification script saves based on top 1 accuracy now instead of loss.
    • Integration rwightman/timm updated for ease-of-use with setup_integration.sh to set up the environment properly.

    Resolved Issues:

    • Github actions now triggering for external forks.

    Known Issues:

    • Conversion of quantized Hugging Face BERT models from PyTorch to ONNX is currently dropping accuracy, ranging from 1-25% depending on the task and dataset. A hotfix is being pursued; users can fall back to version 0.9.0 to prevent the issue.
    • Export for masked language modeling with Hugging Face BERT models from PyTorch is currently exporting incorrectly due to a configuration issue. A hotfix is being pursued; users can fall back to version 0.9.0 to prevent the issue.
  • v0.9.0(Dec 1, 2021)

    New Features:

    • dbolya/yolact integration added with recipes, tutorial, and performant models for the YOLACT segmentation model.
    • Automatic recipe creation API for pruning recipes added, create_pruning_recipe, along with base class implementations for future expansion of RecipeEditor and RecipeBuilder.
    • Structured pruning now supported for channels and filters with StructuredPruningModifier and LayerThinningModifier.
    • PyTorch QAT pipelines: added support for automatic fusing of Conv-ReLU blocks, FPN layers, and Convs with shared weights.
    • Analyzer implementations for evaluating a model's performance and loss sensitivity to pruning and other algorithms added for ONNX framework.
    • Up-to-date version check implemented for SparseML.

    Changes:

    • Automatically unwrap PyTorch distributed modules so recipes do not need to be changed for distributed pipelines.
    • BERT recipes updated to use the distillation modifier.
    • References to num_sockets for the DeepSparse engine were removed following the deprecated support for DeepSparse 0.9.
    • Changed the block pruning flow to use FourBlockMaskCreator for block sparsity, which does not impose any constraint that the pruned channel dimensions be divisible by the block size.
    • API docs recompiled.

    Resolved Issues:

    • Improper variable names corrected that were causing crashes for specific flows in the WoodFisher pruning algorithm.

    Known Issues:

    • None
  • v0.8.0(Oct 26, 2021)

    New Features:

    • ONNX benchmarking APIs added.
    • QAT and export support added for torch.nn.Embedding layers.
    • PyTorch distillation modifier implemented.
    • Arithmetic and equation support for recipes added.
    • Sparsification oracle available now with initial support for automatic recipe creation.

    Changes:

    • Torchvision integration and training pipeline reworked to simplify and streamline the codebase.
    • Migration of PyTorch modifiers to base classes to be shared across all frameworks.

    Resolved Issues:

    • None

    Known Issues:

    • None
  • v0.7.0(Sep 13, 2021)

    New Features:

    • Support added for
      • PyTorch 1.9.0.
      • Python 3.9.
      • ONNX versions 1.8 - 1.10.
    • PyTorch QATWrapper class to support quantization of custom modules through recipes added.
    • PyTorch image classification sparse transfer learning recipe and tutorial created.
    • Generic benchmarking API provided that can be overwritten for specific framework implementations.
    • M-FAC (WoodFisher) pruning implemented along with related documentation and tutorials for one-shot and training-aware usage: https://arxiv.org/abs/2004.14340

    Changes:

    • Performance sensitivity analysis tests updated to respect new information coming from a change in the DeepSparse analysis API.

    Resolved Issues:

    • Repeated apply calls no longer occur for PyTorch pruning masks.
    • Neural Magic dependencies no longer require only matching major.minor versions (allow any bug version).
    • Support added for getting nightly versions if installed for framework info and Neural Magic package versions.

    Known Issues:

    • Hugging Face transformers integrations with num_epochs override from recipes is not currently working. The workaround is to set the num_epochs argument to the maximum number of epochs in the recipe.
  • v0.6.0(Jul 30, 2021)

    New Features:

    Changes:

    • README updated for Hugging Face transformers integration based on the new implementation.
    • ONNX export in PyTorch now supports dictionary inputs.
    • Quantized graph export optimizations for YOLOv5.
    • PyTorch image classification integration updated to use the new manager.modify(...) APIs and to save recipes to the runs folder.
    • DeepSparse YOLO links updated to point at the new example location.
    • kwargs support added for ONNX export in PyTorch to enable dynamic_axes and named inputs.

    Resolved Issues:

    • torch 1.8 quantization export no longer folds incorrectly.
    • ONNX toposort issue addressed for nodes with more than two outputs.
    • Unused initializers removed in quantized ONNX graphs.

    Known Issues:

    • None
  • v0.5.1(Jun 30, 2021)

  • v0.5.0(Jun 28, 2021)

    New Features:

    • research folder added to root directory intended for research contributions.
    • First research contributions added for information retrieval.
    • Tutorial for sparsifying BERT on the SQuAD dataset created.
    • LayerPruningModifier and LearningRateFunctionModifier implementations added for PyTorch.

    Changes:

    • Hugging Face transformers integration reworked to match new integration standards.
    • CIFAR data augmentations updated for PyTorch datasets.
    • Pruning algorithms refactored to use a pruning scorer object for better extensibility with new pruning methods.

    Resolved Issues:

    • If the source URL is down, tests no longer fail for VOC dataset.
    • Because the DeepSparse API now includes more information for kernel sparsity performance analysis, previously failing tests have been updated to correctly check and return the updated info.
    • Models with more than 1 input can now complete the PyTorch ONNX export process.
    • Edge cases and better defaults improved with the WoodFisher/M-FAC algorithm for better recovery.
    • Deprecated torch.nonzero API calls in the pruning modifiers updated to .nonzero(as_tuple=False).

    Known Issues:

    • None
  • v0.4.0(Jun 4, 2021)

    New Features:

    • M-FAC/Woodfisher pruning algorithm alpha implemented.
    • Movement pruning algorithm alpha implemented.
    • Distillation code added for GLUE dataset in Hugging Face/transformers integration.
    • BERT quantization pipeline enabled for training and export to ONNX.

    Changes:

    • Readme redesigned for better clarity on the repository's purpose.
    • All examples, notebooks, and scripts moved under the integrations directory.
    • Integrations for ultralytics/yolov3, ultralytics/yolov5, pytorch, keras, tensorflow reworked to match new integrations standards for better ease of use.

    Resolved Issues:

    • rwightman/timm integration bugs dealing with checkpoint paths and loading models addressed.
    • tensorflow-gpu for tensorflow v1 now recognized correctly.
    • Neural Magic dependencies upgrade to intended bug versions instead of minor versions.

    Known Issues:

    • Movement pruning is currently only working with FP16 training on GPUs; FP32 is diverging to NaN.
  • v0.3.1(May 14, 2021)

    This is a patch release for 0.3.0 that contains the following changes:

    • DeepSparse APIs now properly referencing VNNI check
    • Block sparse masks now applied for pruning modifiers
    • Some tests marked as flaky to make tests more consistent
    • Docs updated for new Discourse and Slack links
    • Modifier code refactored to better support Automatic Mixed Precision Training (AMP) in PyTorch
    • Emulated_step added to manager for inconsistent steps_per_epoch in PyTorch
    • Serialization of block sparse-enabled pruning modifiers no longer fails on reload
  • v0.3.0(Apr 30, 2021)

    New Features:

    • YOLO integration with Ultralytics deployed including DeepSparse examples for benchmarking and running detection over videos.
    • Framework and Sparsification Info APIs now available for all supported ML frameworks.
    • Properties added to the ScheduledManager class to allow for lookup of contained modifiers such as pruning and quantization.
    • ALL_PRUNABLE token added for pruning modifiers.
    • PyTorch global magnitude pruning support implemented.
    • QAT support added for BERT.

    Changes:

    • Version now loaded from the version.py file; the default build on branches is now nightly.
    • Additional unit tests added in for Keras integration.
    • PyTorch max supported version updated to 1.7.
    • Improved performance for parsing and fixing QAT ONNX graphs from PyTorch.

    Resolved Issues:

    • Docs typos and broken links addressed.
    • Pickling models with PyTorch pruning hooks works as expected.
    • Incorrect loss scaling for DDP in PyTorch vision.py script addressed.

    Known Issues:

    • None
  • v0.2.0(Mar 31, 2021)

    New Features:

    • Keras sparsification beta supporting pruning and examples with Keras Applications available.
    • Training and sparsification integrated with the rwightman/pytorch-image-models repository.
    • Training, sparsification, and deployment integrated with the ultralytics/yolov5 repository.
    • Integrations with the SparseZoo to run PyTorch and Keras code implemented with recipes directly from the zoo.
    • PyTorch sparse-quantized transfer learning notebook available.
    • Keras pruned ResNet models implemented.
    • Groups of modifiers enabled in SparseML recipes.

    Changes:

    • Examples directory renamed to integrations.

    Resolved Issues:

    • GroupedPruningMaskCreator now able to save to PyTorch state dicts.
    • Quantization-aware training compatibility issues addressed with PyTorch.
    • Docs and readme corrections made for minor issues and broken links.
    • Makefile no longer deletes files for docs compilation and cleaning.

    Known Issues:

    • None
  • v0.1.1(Mar 1, 2021)

    This is a patch release for 0.1.0 that contains the following changes:

    • Docs updates: tagline, overview, update to use sparsification for verbiage, fix broken links for recipes
    • Flaky decorator added to some sparsity tests so that if they fail due to random chance they will immediately retest
    • Modifier groups enabled for recipes
    • DeepSparse nightly build dependencies now match on major.minor and not full version
    • Serialization support for MaskedLayer in Keras added
    • Support implemented for loading recipes from SparseZoo to common scripts and APIs
    • Examples directory renamed to integrations to clarify function
    • Rwightman integration added
    • Ultralytics integration added
    • PyTorch sparse-quantized transfer learning notebook added
  • v0.1.0(Feb 4, 2021)

    Welcome to our initial release on GitHub! Older release notes can be found here.

    New Features:

    • Keras Alpha for optimizing models using pruning added.
    • PyTorch 1.7 is supported.
    • PyTorch distributed supported for built-in training flows.
    • Torchvision models added to PyTorch ModelRegistry class.
    • Makefile flows and utilities implemented for GitHub repo structure.

    Changes:

    • Software packaging updated to reflect new GitHub distribution channel, from file naming conventions to license enforcement removal.
    • Migration made to use the SparseZoo package for loading pre-trained models and recipes.
    • Migration made to use DeepSparse Engine for analyzing and benchmarking ONNX models.
    • ONNX and ONNXRuntime dependency versions updated to include latest.

    Resolved Issues:

    • Infinite recursion resolved for the PyTorch ScheduledOptimizer in nested optimizer flows.
    • tf2onnx folding nodes now working with Sparsify.

    Known Issues:

    • TensorFlow pre-trained models are not pushed currently in the SparseZoo and will fail to load.
    • TensorFlow V1 is no longer being built for newer operating systems such as Ubuntu 20.04. Therefore, SparseML with TensorFlow V1 is unsupported on these operating systems as well.