BigDL: Distributed Deep Learning Framework for Apache Spark

Overview



What is BigDL?

BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can write their deep learning applications as standard Spark programs, which can directly run on top of existing Spark or Hadoop clusters. To make it easy to build Spark and BigDL applications, a high-level Analytics Zoo is provided for end-to-end analytics + AI pipelines.

  • Rich deep learning support. Modeled after Torch, BigDL provides comprehensive support for deep learning, including numeric computing (via Tensor) and high level neural networks; in addition, users can load pre-trained Caffe or Torch models into Spark programs using BigDL.

  • Extremely high performance. To achieve high performance, BigDL uses Intel MKL / Intel MKL-DNN and multi-threaded programming in each Spark task. Consequently, it is orders of magnitude faster than out-of-box open source Caffe, Torch or TensorFlow on a single-node Xeon (i.e., comparable with mainstream GPU). With adoption of Intel DL Boost, BigDL improves inference latency and throughput significantly.

  • Efficient scale-out. BigDL can efficiently scale out to perform data analytics at "Big Data scale", by leveraging Apache Spark (a lightning fast distributed data processing framework), as well as efficient implementations of synchronous SGD and all-reduce communications on Spark.
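
To make the last bullet concrete, here is a minimal single-process sketch of synchronous data-parallel SGD in plain Python. It is not BigDL's implementation: the toy linear model, the data partitions, and the `local_gradient`, `all_reduce_mean` and `sync_sgd` helpers are all assumptions made for illustration. Each simulated worker computes a gradient on its own partition, the gradients are averaged (the role all-reduce plays), and every worker applies the same update.

```python
# Toy synchronous SGD: fit y = w * x with squared error, data split across workers.
# All names, data and hyperparameters are illustrative; none of this is BigDL API.

def local_gradient(w, partition):
    """Mean gradient of 0.5 * (w*x - y)^2 over one worker's partition."""
    return sum((w * x - y) * x for x, y in partition) / len(partition)


def all_reduce_mean(values):
    """Stand-in for all-reduce: every worker ends up with the same average."""
    return sum(values) / len(values)


def sync_sgd(partitions, w=0.0, lr=0.05, steps=200):
    for _ in range(steps):
        # In a real system each gradient is computed in parallel on its worker.
        grads = [local_gradient(w, p) for p in partitions]
        # Every worker applies the identical averaged update.
        w -= lr * all_reduce_mean(grads)
    return w


# Two "workers", each holding part of a dataset generated by y = 3x.
partitions = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w_fit = sync_sgd(partitions)
print(round(w_fit, 3))
```

Because all workers wait for the averaged gradient before updating, their copies of `w` never diverge, which is what distinguishes synchronous from asynchronous SGD.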

Why BigDL?

You may want to write your deep learning programs using BigDL if:

  • You want to analyze a large amount of data on the same Big Data (Hadoop/Spark) cluster where the data are stored (in, say, HDFS, HBase, Hive, etc.).

  • You want to add deep learning functionalities (either training or prediction) to your Big Data (Spark) programs and/or workflow.

  • You want to leverage existing Hadoop/Spark clusters to run your deep learning applications, which can then be dynamically shared with other workloads (e.g., ETL, data warehouse, feature engineering, classical machine learning, graph analytics, etc.).

How to use BigDL?

Citing BigDL

If you've found BigDL useful for your project, you can cite the paper as follows:

@inproceedings{SOCC2019_BIGDL,
  title={BigDL: A Distributed Deep Learning Framework for Big Data},
  author={Dai, Jason (Jinquan) and Wang, Yiheng and Qiu, Xin and Ding, Ding and Zhang, Yao and Wang, Yanzhang and Jia, Xianyan and Zhang, Li (Cherry) and Wan, Yan and Li, Zhichao and Wang, Jiao and Huang, Shengsheng and Wu, Zhongyuan and Wang, Yang and Yang, Yuhao and She, Bowen and Shi, Dongjie and Lu, Qi and Huang, Kai and Song, Guoqiong},
  booktitle={Proceedings of the ACM Symposium on Cloud Computing},
  publisher={Association for Computing Machinery},
  pages={50--60},
  year={2019},
  series={SoCC'19},
  doi={10.1145/3357223.3362707},
  url={https://arxiv.org/pdf/1804.05839.pdf}
}
Comments
  • How to save a BigDL model in the following example ? is there any api doc ?

    https://github.com/mrafayaleem/transfer-learning-bigdl/blob/master/transfer-learning-bigdl.ipynb

    It was not saved as xx.model when I ran antbeeModel.save("/root/Desktop/model.model")

    user issue 
    opened by 704572066 49
  • Example test on yarn

    1. Change code to add a deploymode option. Reference: https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/src/bigdl/dllib/models/inception/inception.py, https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/src/bigdl/dllib/models/lenet/lenet5.py
    2. Add test to python/dllib/src/bigdl/dllib/examples/run-example-tests-yarn-integration.sh
    3. Run jenkins http://10.112.231.51:18888/view/ZOO-PR/job/ZOO-PR-Python-integration-test/

    TODO: Move run-example-tests-yarn-integration.sh to python/dllib/examples/. (xin)

    • [ ] dllib examples
    • [ ] orca examples

    dllib examples, use init_nncontext

    | Module | Example | Added | Client Mode | Cluster Mode |
    | ----------- | ----------- | ----------- | ----------- | ----------- |
    | autograd | custom.py | Y | Succeed | Succeed |
    | autograd | customloss.py | Y | Succeed | Succeed |
    | nnframes | imageInference | Y | Succeed | Succeed |
    | nnframes | imageTransferLearning | Y | Succeed | Succeed |

    orca examples, use init_orca_context

    | Module | Example | Added | Client Mode | Cluster Mode |
    | - | - | - | - | - |
    | automl | autoestimator/autoestimator_pytorch.py | Y | Succeed | Succeed |
    | automl | autoxgboost/AutoXGBoostClassifier.p (https://github.com/intel-analytics/analytics-zoo/issues/5049) | Y | Succeed | Succeed |
    | automl | autoxgboost/AutoXGBoostRegressor.py (https://github.com/intel-analytics/analytics-zoo/issues/5049) | Y | Succeed | Succeed |
    | data | spark_pandas.py | Y | Succeed | Succeed |
    | bigdl | learn/bigdl/attention/transformer.py | Y | Succeed | Failed |
    | bigdl | learn/bigdl/imageInference/imageInference.py | Y | Succeed | Failed |
    | horovod | learn/horovod/pytorch_estimator.py | Y | Succeed | Succeed |
    | horovod | simple_horovod_pytorch.py | Y | Succeed | |
    | mxnet | learn/mxnet/lenet_mnist.py | Y | Succeed | |
    | openvino | learn/openvino/predict.py | Not Added | | |
    | pytorch | learn/pytorch/cifar10/cifar10.py | Y | Succeed | Failed |
    | pytorch | learn/pytorch/fashion_mnist/fashion_mnist.py | Y | Succeed | Failed |
    | pytorch | learn/pytorch/super_resolution/super_resolution.py | Y | Succeed | Failed |
    | tf | learn/tf/basic_text_classification/basic_text_classification.py | Y | Succeed | Failed |
    | tf | learn/tf/image_segmentation/image_segmentation.py | Y | Succeed | Failed |
    | tf | learn/tf/inception/inception.py | Y | Succeed | Failed |
    | tf | learn/tf/transfer_learning/transfer_learning.py | Y | Failed | Failed |
    | tf2 | learn/tf2/resnet/resnet-50-imagenet.py | Y | Failed | Failed |
    | tf2 | learn/tf2/yolov3/yoloV3.py | Y | Succeed | Succeed |
    | ray_on_spark | ray_on_spark/parameter_server/async_parameter_server.py | Y | Succeed | Failed |
    | ray_on_spark | ray_on_spark/parameter_server/sync_parameter_server.py | Y | Succeed | Succeed |
    | ray_on_spark | ray_on_spark/rl_pong/rl_pong.py | Y | Succeed | Succeed |
    | ray_on_spark | ray_on_spark/rllib/multiagent_two_trainers.py | Y | Succeed | Succeed |
    | tfpark | tfpark/estimator/estimator_dataset.py | Y | Succeed | Failed |
    | tfpark | tfpark/estimator/estimator_inception.py | Y | Succeed | Failed |
    | tfpark | tfpark/estimator/pre-made-estimator.py | Not Added | | |
    | tfpark | tfpark/gan/gan_train_and_evaluate.py | Y | Failed | |
    | tfpark | tfpark/keras/keras_dataset.py | Y | Succeed | Failed |
    | tfpark | tfpark/keras/keras_ndarray.py | Y | Succeed | Failed |
    | tfpark | tfpark/tf_optimizer/evaluate.py | Y | Succeed | Failed |
    | tfpark | tfpark/tf_optimizer/train.py | Y | Succeed | Failed |
    | torchmodel | torchmodel/train/imagenet/main.py | Y | Succeed | Failed |
    | torchmodel | torchmodel/train/mnist/main.py | Y | Succeed | Failed |
    | torchmodel | torchmodel/train/resnet_finetune/resnet_finetune.py | Y | Succeed | Failed |

    opened by qiuxin2012 29
  • the consistency of preprocessing between NNFrame transform and customized one

    I did transfer learning for image classification and deployed the saved model in Cluster Serving. I then found that the prediction results from the PySpark pipeline transform and from Cluster Serving HTTP API inference are different. It turns out that the model inputs (i.e., the preprocessing outputs) are different. The NNFrames preprocessing is a ChainedPreprocessing, and my customized one seems to be the same. The original issue: https://github.com/intel-analytics/BigDL/issues/3764

    transfer learning preprocessing code:

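    import os  # needed for os.path.basename in the UDF below
    import os.path as osp

    from pyspark import SparkConf
    from bigdl.dllib.nncontext import init_nncontext
    from bigdl.dllib.nn.criterion import *
    from bigdl.dllib.nn.layer import *
    from bigdl.dllib.nnframes import *
    from bigdl.dllib.feature.image import *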
    
    
    def build_transforms(params):
        from bigdl.dllib.feature.common import ChainedPreprocessing
        transformer = ChainedPreprocessing(
            [RowToImageFeature(), ImageResize(256, 256), ImageCenterCrop(224, 224),
             ImageChannelNormalize(123.0, 117.0, 104.0), ImageMatToTensor(), ImageFeatureToTensor()])    
        return transformer
    
    def build_classifier():
        ...
    
    def train(task_path, dataset_path, params):
        from pyspark.ml import Pipeline
        from pyspark.ml.evaluation import MulticlassClassificationEvaluator
        from pyspark.sql.functions import udf
        from pyspark.sql.types import DoubleType, StringType
    
        from bigdl.dllib.nnframes import NNImageReader
        from bigdl.dllib.utils.common import redire_spark_logs
    
        spark_conf = SparkConf().set("spark.driver.memory", "10g") \
            .set("spark.driver.cores", 4)
    
        sc = init_nncontext(spark_conf, cluster_mode="local")
        # redire_spark_logs("float", osp.join(task_path, 'out.log'))
    
        getFileName = udf(lambda row: os.path.basename(row[0]), StringType())
        getLabel = udf(lambda row: 1.0 if 'ants' in row[0] else 2.0, DoubleType())
    
        trainingDF = NNImageReader.readImages(osp.join(dataset_path, 'train/*'), sc, resizeH=300, resizeW=300, image_codec=1)
        trainingDF = trainingDF.withColumn('filename', getFileName('image')).withColumn('label', getLabel('image'))
    
        validationDF = NNImageReader.readImages(osp.join(dataset_path, 'val/*'), sc, resizeH=300, resizeW=300, image_codec=1)
        validationDF = validationDF.withColumn('filename', getFileName('image')).withColumn('label', getLabel('image'))
    
    
        transformer = build_transforms(params)
        preTrainedNNModel = NNModel(Model.loadModel(osp.join(dataset_path,'analytics-zoo_resnet-50_imagenet_0.1.0.model')), transformer) \
            .setFeaturesCol("image") \
            .setPredictionCol("embedding")
    
        classifier = build_classifier()
        pipeline = Pipeline(stages=[preTrainedNNModel, classifier])
    
        antbeeModel = pipeline.fit(trainingDF)
        predictionDF = antbeeModel.transform(validationDF).cache()
    

    customized preprocessing code:

    import base64
    import cv2
    import numpy as np
    from urllib import request
    import json
    import matplotlib.pyplot as plt
    import pylab
    # import torch
    def resize_img(img, target_size):
        img = cv2.resize(img, (target_size, target_size))
        return img
    
    
    def center_crop_img(im, target_size):
        # OpenCV images are laid out as (height, width, channels).
        h, w = im.shape[0], im.shape[1]
        th, tw = target_size, target_size
        assert (h >= target_size) and (w >= target_size), \
                "image width({}) and height({}) should be at least the crop size({})".format(w, h, target_size)
        y1 = int(round((h - th) / 2.))
        x1 = int(round((w - tw) / 2.))
        im = im[y1:y1+th, x1:x1+tw]
        return im
    
    def normalize_image(x, mean=(0., 0., 0.), std=(1.0, 1.0, 1.0)):
        '''Normalization.
    
        Args:
            x: input image.
            mean: mean value of the input image.
            std: standard deviation value of the input image.
    
        Returns:
            Normalized image.
        '''
    
        x = np.asarray(x, dtype=np.float32)
        if len(x.shape) == 4:
            for dim in range(3):
                x[:, :, :, dim] = (x[:, :, :, dim] - mean[dim]) / std[dim]
        if len(x.shape) == 3:
            for dim in range(3):
                x[:, :, dim] = (x[:, :, dim] - mean[dim]) / std[dim]
        return x
    
    def preprocess(img):
        img = img[..., ::-1]
        image_resize = resize_img(img, 256)
        ccrop_img = center_crop_img(image_resize, 224)
        n_img = normalize_image(ccrop_img, [123.0, 117.0, 104.0])
        return n_img
    
    if __name__ == '__main__':
        bee = cv2.imread("/root/Desktop/ws/datasets/D0002/val/bees/586041248_3032e277a9.jpg")
        t_bee = preprocess(bee)
    

    result of NNFrame preprocessing:

    -49.0	-48.0	-48.0	...	-12.0	-12.0	-11.0	
    -50.0	-49.0	-49.0	...	-12.0	-12.0	-11.0	
    -51.0	-50.0	-51.0	...	-12.0	-12.0	-10.0
    

    result of customized preprocessing:

    [-56., -38., -49.],
    [-55., -37., -48.],
    [-54., -36., -47.],
    ...,
    
    • Image link: https://user-images.githubusercontent.com/23404868/148499487-d0b723c9-df53-4213-8779-65f1e6b8c5f3.jpg
    • model link: https://github.com/704572066/online-ai-platform/raw/master/model/20220106.model
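
The two dumps above appear to be printed in different layouts: the first looks like one channel plane (rows of a 2D matrix), while the second lists per-pixel triples. Before comparing values, it may help to bring both into the same layout. The sketch below assumes the customized output is height x width x channel (HWC) and the NNFrames output is channel x height x width (CHW); that assumption is not confirmed in the issue, and `hwc_to_chw` is a hypothetical helper written with nested lists so it runs without any dependencies.

```python
def hwc_to_chw(image):
    """Convert a height x width x channel nested list to channel x height x width."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[r][c][ch] for c in range(width)] for r in range(height)]
        for ch in range(channels)
    ]


# A 2x2 image with 3 channels; each pixel is [c0, c1, c2].
hwc = [
    [[-56.0, -38.0, -49.0], [-55.0, -37.0, -48.0]],
    [[-54.0, -36.0, -47.0], [-53.0, -35.0, -46.0]],
]
chw = hwc_to_chw(hwc)
print(chw[0])  # the first channel as a 2D plane
```

With NumPy arrays the same conversion is `np.transpose(img, (2, 0, 1))`. Channel order (RGB versus BGR) is a second thing worth checking, since the customized `preprocess` reverses the last axis before normalizing.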
    user issue 
    opened by 704572066 23
  • Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.

    Hello, I'm using a Docker container built from this BigDL image. When I try to collect the predictions using collect(), this error occurs: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.

    this is the code:

    def retrain(self, batch_size):
        minibatch = random.sample(self.experience_replay, batch_size)
        for state, action, reward, next_state in minibatch:
            state = np.asmatrix(state)
            next_state = np.asmatrix(next_state)
            print('state type', state)
            print('next state type', next_state)
            target = self.q_network.predict(state)
            p = target.collect()
            tt = self.target_network.predict(next_state)
            t = tt.collect()
            p[0][action] = reward + self.gamma * np.amax(t)
            self.q_network.fit(state, p, verbose=0)
        self.dqn_update_time -= 1
        print("***********", self.dqn_update_time, "************ ")
        if self.dqn_update_time == 0:
            self.dqn_update_time = 100  # dqn_time
            self.alighn_target_model()
            print('model updated')
    

    this is the error:

    /tmp/ipykernel_1032/2958540146.py in retrain(self, batch_size)
         71             print('next state type',next_state)
         72             target = self.q_network.predict(state)
    ---> 73             p= target.collect()
         74 
         75             tt = self.target_network.predict(next_state)
    
    /opt/work/spark-3.1.2/python/lib/pyspark.zip/pyspark/rdd.py in collect(self)
        947         """
        948         with SCCallSiteSync(self.context) as css:
    --> 949             sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
        950         return list(_load_from_socket(sock_info, self._jrdd_deserializer))
        951 
    
    /usr/local/envs/bigdl/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
       1303         answer = self.gateway_client.send_command(command)
       1304         return_value = get_return_value(
    -> 1305             answer, self.gateway_client, self.target_id, self.name)
       1306 
       1307         for temp_arg in temp_args:
    
    /opt/work/spark-3.1.2/python/lib/pyspark.zip/pyspark/sql/utils.py in deco(*a, **kw)
        109     def deco(*a, **kw):
        110         try:
    --> 111             return f(*a, **kw)
        112         except py4j.protocol.Py4JJavaError as e:
        113             converted = convert_exception(e.java_exception)
    
    /usr/local/envs/bigdl/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
        326                 raise Py4JJavaError(
        327                     "An error occurred while calling {0}{1}{2}.\n".
    --> 328                     format(target_id, ".", name), value)
        329             else:
        330                 raise Py4JError(
    
    Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
    

    Could anyone explain why this error occurred and how to fix it?
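
Independently of the Py4J failure, the target update inside the loop can be checked without Spark. The sketch below is plain Python and assumes that each `predict` call yields one row of Q-values per state; `q_target` is a hypothetical helper, not part of BigDL, and it does not explain or fix the error itself.

```python
def q_target(q_values, next_q_values, action, reward, gamma):
    """Return a copy of q_values where the taken action's entry is replaced by
    reward + gamma * max(next_q_values), mirroring the update in the loop."""
    target = list(q_values)
    target[action] = reward + gamma * max(next_q_values)
    return target


updated = q_target([0.1, 0.5, 0.2], [1.0, 2.0, 0.5], action=1, reward=1.0, gamma=0.9)
print(updated)
```

Testing this step in isolation makes it easier to tell whether a problem comes from the update logic or from collecting the distributed prediction results.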

    user issue 
    opened by fatenlouati 22
  • Provide quantization for nano

    Description

    Propose to integrate quantization methods into nano to reduce the model size and accelerate inference. Neural Compressor provides a set of methods to quantize a model to simplify the usage.
    Discussion on the details is as below in comments.

    Related tasks

    • Intel Neural Compressor (INC)
      • Post-training Quantization
        • [ ] Pytorch quantization API.
          • PR: #3602
        • [ ] Keras quantization API.
          • Issue: #3651
          • PR: #3856
        • [ ] Examples
      • Quantization-Aware Training
        • [ ] TBD
    • OpenVINO
      • [ ] OpenVINO Support
      • NNCF
        • [ ] TBD
      • POT
        • [ ] TBD
    opened by zhentaocc 20
  • How many epochs to be run in Bigdl.sh

    I have 64 cores, 1 node, and 32g of driver memory allocated. I am running the training, and it has already passed 12 epochs. I wanted to check how many epochs and how much time bigdl.sh will run in such a configuration.

    opened by jaymahesh 20
  • Orca: Align the data analysis method of dataloader and dataframe

    Description

    When the model has only one input, which is a list or tuple of tensors, we should not unpack it into args.

    Basic assumption: there are only three possible types for features: torch.Tensor, list/tuple, and dict.

    Types of features and labels:

    | | features | labels |
    | ----- | ---- | ---- |
    | dataframe/XShards | a list or tuple of tensors, or a dict (XShards only) | a single tensor, tuple, or list |
    | raydataset | a single tensor or a list of tensors | a single tensor or a list of tensors |
    | dataloader | a single tensor, or a list of the user's inputs (all elements except the last one, which is the label) | the last element, which is the label |

    When will features be a single tensor?

    1. The dataloader yields features consisting of only one tensor.
    2. Only one feature_column is specified by the user (df, xshard and rayDataset).

    When will features be a list or tuple?

    1. The dataloader yields features consisting of more than one tensor, or an object that is not a tensor.
    2. More than one feature_column is specified by the user (df, xshard and rayDataset).

    When will features be a dict? Only when the input is an XShards of dictionaries.

    1. Why the change?

    https://github.com/intel-analytics/BigDL/issues/5762 In some cases, the model does take x as a list of two tensors as input:

        def forward(self, x, bboxes=None):
            x = x[:]  # avoid pass by reference
            x = self.s1(x)
    

    but our torchrunner will unpack this into two separate arguments: https://github.com/intel-analytics/BigDL/blob/affe54803c320afd4fc0631dc3fa02f8be1cfcdc/python/orca/src/bigdl/orca/learn/pytorch/training_operator.py#L279

    2. User API changes

    none

    3. Summary of the change

    ~~before: output = self.model(*features)~~ ~~after: output = self.model(*features) if not isSingleListInput else self.model(features)~~

    If data is a PyTorch dataloader or creator, reload_dataloader_creator will combine all elements except the labels into a list; if the features consist of only one tensor, it remains unchanged:

    
    def make_dataloader_list_wrapper(func):
        import torch
        def make_feature_list(batch):
            if func is not None:
                batch = func(batch)
            *features, target = batch
            if len(features) == 1 and torch.is_tensor(features[0]):
                features = features[0]
            return features, target
    
        return make_feature_list
    

    and will parse features here:

            features, target = batch
            # Compute output.
            with self.timers.record("fwd"):
                if torch.is_tensor(features):
                    output = self.model(features)
                elif isinstance(features, (tuple, list)):
                    output = self.model(*features)
    

    This ensures consistency between *features and the user's input.

    The current df, xshard and raydataset logic is already correct, so we leave it unchanged.
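
The two snippets above can be exercised together without PyTorch. The sketch below uses a stand-in `FakeTensor` class and an `is_tensor` function (both hypothetical, replacing `torch.Tensor` and `torch.is_tensor`), plus a toy `describe` model, to show how the wrapping rule and the dispatch interact. The `TypeError` fallback is an addition for the sketch and is not in the original snippet.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without PyTorch."""
    def __init__(self, name):
        self.name = name


def is_tensor(obj):
    # Stand-in for torch.is_tensor.
    return isinstance(obj, FakeTensor)


def make_feature_list(batch):
    # Everything except the last element is treated as features.
    *features, target = batch
    if len(features) == 1 and is_tensor(features[0]):
        features = features[0]  # a lone tensor stays a tensor
    return features, target


def call_model(model, features):
    # A single tensor is passed as-is; a list or tuple is unpacked into arguments.
    if is_tensor(features):
        return model(features)
    if isinstance(features, (tuple, list)):
        return model(*features)
    raise TypeError("unsupported features type: %r" % type(features))


def describe(*args):
    """Toy model: reports how many positional arguments it received."""
    return len(args)


a, b, y = FakeTensor("a"), FakeTensor("b"), FakeTensor("y")

single, _ = make_feature_list((a, y))
multi, _ = make_feature_list((a, b, y))
nested, _ = make_feature_list(([a, b], y))  # one feature that is itself a list

print(call_model(describe, single))  # 1 argument
print(call_model(describe, multi))   # 2 arguments
print(call_model(describe, nested))  # 1 argument: the inner list is kept whole
```

The third case is the one the PR targets: a model whose single input is a list of tensors receives that list intact instead of having it unpacked.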

    4. How to test?

    • [ ] N/A
    • [x] Unit test
    • [ ] Application test
    • [ ] Document test
    • [ ] ...
    orca 
    opened by leonardozcm 19
  • Spark 3 with Scala 2.12

    Hi devs!

    Not an issue but a question (feel free to add the label):

    Given that Spark 3 was released a few months ago, do you have any plans to update to Spark 3, which also requires moving to Scala 2.12? I'm also wondering whether this Spark 3 deployment script can work with the current version, which is still on Scala 2.11.

    Thanks in advance.

    user issue 
    opened by LorenzBuehmann 18
  • Add Back PR Template

    Description

    This PR adds a pull request template.

    Motivation and Context

    This is a good practice to add a structured short description of a submitted PR. This way the reviewer and future developers can better understand the changes and the context of the changes.

    API Usage or Code Design.

    The proposed PR template is presented as in this description.

    Related Link or Issue

    N/A

    How was this PR tested?

    N/A

    • [ ] jenkins passed (please provide link):

    • [ ] documentation build passed (please provide link):

    License & Dependency

    N/A

    opened by yangw1234 17
  • Replace DL_ENGINE_TYPE env variable with property bigdl.engineType for now, and also make it default to mklblas

    What changes were proposed in this pull request?

    Replace DL_ENGINE_TYPE env variable with property bigdl.engineType for now, and also make it default to mklblas. This makes it work with the only value that's supported out of the box.

    How was this patch tested?

    Existing tests.

    Related links or issues (optional)

    This is a small subset of https://github.com/intel-analytics/BigDL/issues/788

    opened by srowen 17
  • add backwardGraphPruning feature

    What changes were proposed in this pull request?

    Add a backwardGraph pruning feature for [[StaticGraph]]; that is, remove from [[backwardGraph]] all pre-processing nodes whose element is an [[Operation]] or which depend only on [[Operation]]-based nodes during backward.

    I think this feature will be useful when building a graph with numerous operations for data pre-processing. Furthermore, it makes it possible to treat modules with parameters as operations (such as nn.Abs), as long as they depend entirely on operation nodes during backward.

    We are also working on data pre-processing operations, mainly for data mining. :)

    How was this patch tested?

    unit test

    opened by sperlingxx 16
  • Chronos: add forecaster alg choose guide and some cleaning for how to guide

    Description

    1. Why the change?

    This is a "remake" of https://github.com/intel-analytics/BigDL/pull/5364, since the previous PR had too many conflicts.

    2. User API changes

    nothing

    3. Summary of the change

    • choose forecaster guide: https://bigdl-junweid.readthedocs.io/en/forecaster-alg-choose/doc/Chronos/Howto/how_to_choose_forecasting_alg.html
    • renewed how to guide overview: https://bigdl-junweid.readthedocs.io/en/forecaster-alg-choose/doc/Chronos/Howto/index.html

    4. How to test?

    • [ ] Document test
    document Chronos 
    opened by TheaperDeng 0
  • Nano: fix keras onnx model output shape

    Description

    Fix the output shape of keras onnx model

    1. Why the change?

    tf.py_function(func, ..., Tout=tf.float32) converts the output of func to a single Tensor, which means that [t] and [[t]] (where t is a tf.Tensor) are both converted to t.

    Besides, if the original model returns a single tensor t, then after tracing, the onnxruntime model returns [t].

    These two rules destroy the output shape when the original output is a single tensor, a list containing a single tensor, a list of lists containing a single tensor, and so on.

    Before this PR, we simply checked whether the output is a list with only one element and, if so, returned its first element. We use this method to

    However, this method cannot distinguish whether the original model returns t or [t] (the onnxruntime model returns [t] in both cases), and it does not handle the effect of tf.py_function.

    This PR fixes both: it first saves the correct output shape, then converts the final output to that shape.
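
The approach in the last sentence can be sketched in plain Python: record the nesting of the original model's output once, then rebuild that nesting around whatever flat sequence the converted model returns. This is not the PR's code; `record_structure`, `flatten` and `restore` are hypothetical helpers, and strings stand in for tensors.

```python
def record_structure(output):
    """Describe nesting: None for a leaf, or a list of child structures."""
    if isinstance(output, (list, tuple)):
        return [record_structure(item) for item in output]
    return None


def flatten(output):
    """Collect the leaves in order, ignoring nesting."""
    if isinstance(output, (list, tuple)):
        leaves = []
        for item in output:
            leaves.extend(flatten(item))
        return leaves
    return [output]


def restore(structure, leaves):
    """Rebuild the recorded nesting from a flat sequence (or iterator) of leaves."""
    if not hasattr(leaves, "__next__"):
        leaves = iter(leaves)
    if structure is None:
        return next(leaves)
    return [restore(child, leaves) for child in structure]


# Strings stand in for tensors; each case is one of the shapes discussed above.
for original in ("t", ["t"], [["t"]], ["t1", ["t2"]]):
    structure = record_structure(original)
    runtime_output = flatten(original)  # e.g. the flat list a runtime might hand back
    assert restore(structure, runtime_output) == original
```

In this sketch tuples are restored as lists; a fuller version would also record the container type.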

    2. User API changes

    N/A

    3. Summary of the change

    N/A

    4. How to test?

    • [ ] Unit test
    Nano 
    opened by MeouSker77 0
  • chronos

    Traceback (most recent call last):
      File "autolstm_nyc_taxi.py", line 21, in <module>
        from bigdl.chronos.data import get_public_dataset
    ImportError: cannot import name 'get_public_dataset' from 'bigdl.chronos.data'

    Chronos 
    opened by YUKUN-XIAO 12
  • Nano: fix documentation typos in tensorflow inference and quantization

    Description

    The quantization API in the TensorFlow inference engine has no "calib_dataset" keyword argument; it simply uses "x" instead.

    Nano 
    opened by HensonMa 0
  • [Nano] Add a generalized how-to guide for accelerate PyTorch cv data process pipeline

    Description

    Add a generalized how-to guide for accelerating the PyTorch cv data processing pipeline

    1. Why the change?

    The cv data processing pipeline acceleration is exactly the same for PyTorch and PyTorch Lightning applications. There is no need to add separate how-to guides to the PyTorch/PyTorch Lightning Training sections.

    2. Summary of the change

    • Add a Nano how-to guide section "Preprocessing"

    • Add how to guide “How to accelerate a computer vision data processing pipeline” for PyTorch

    • Restyled quote blocks for better note/warning/related reading box styles

    3. How to test?

    • [x] Document test: https://yuwentestdocs.readthedocs.io/en/nano-pytorch-cv-pipeline/doc/Nano/Howto/index.html
    • [x] Github Notebook preview: https://github.com/Oscilloscope98/BigDL/blob/nano-pytorch-cv-pipeline/python/nano/tutorial/notebook/preprocessing/pytorch/accelerate_pytorch_cv_data_pipeline.ipynb
    • [ ] Notebook test locally (conda create an empty environment with python=3.7)
    document Nano 
    opened by Oscilloscope98 0
Releases(v2.1.0)
  • v2.1.0(Sep 28, 2022)

    Highlights

    Note: BigDL v2.1.0 has been updated to include functional and security updates. Users should update to the latest version.

    • Orca
      • Improve user experience and API consistency for Orca Estimators.
      • Support directly saving and loading the TensorFlow model format in Orca TensorFlow2 Estimator.
      • Provide more examples (e.g. PyTorch brain image segmentation, XShards tutorials for distributed Python data processing), etc.
      • Support customized metrics in Orca PyTorch Estimator.
    • Nano
      • New inference optimization pipelines, with more optimization methods and a new InferenceOptimizer
      • More training optimization methods (bf16, channel last)
      • Add TorchNano support for PyTorch model customized training loop
      • Auto-scale learning rate for multi-instance training
      • Built-in AutoML support through hyperparameter optimization
      • Support a wide range of versions of PyTorch (1.9-1.12) and TensorFlow (2.7-2.9)
    • DLlib
      • Add LightGBM support
      • Improve Keras-style model summary API
      • Add Python support for loading HDFS files
    • Chronos
      • Add new Autoformer (https://arxiv.org/abs/2106.13008) Forecaster and pipeline that are optimized on CPU
      • Tensorflow 2 support for LSTM, Seq2Seq, TCN and MTNet Forecasters
      • Add light-weight (does not rely on Spark/Ray Tune) auto tuning
      • Better support for distributed workflows (Spark DataFrame and distributed pandas processing)
      • More installation options are now supported to make the installation lighter
    • Friesian
      • Integration of DeepRec (https://github.com/alibaba/DeepRec) with Friesian.
      • Add more reference examples, e.g. multi-task recommendation, TFRS (https://www.tensorflow.org/recommenders) list-wise ranking, LightGBM training, etc.
      • Add a reference example for offline distributed similarity search (using FAISS)
      • More operations in FeatureTable (e.g. string embeddings with BERT, etc.).
    • PPML
      • Upgrade BigDL PPML on Gramine.
      • Improve the attestation and key managing process
      • More Big Data frameworks on BigDL PPML (including spark, flink, hive, hdfs, etc.)
      • Add PPMLContext API for encryption IO and KMS, supports different file formats, encryption algorithms and KMS services
      • Support PSI, Pytorch NN, Keras NN, FGBoost (federated XGBoost) in VFL scenario, linear regression & logistic regression for VFL
  • v2.0.0(Mar 9, 2022)

  • v0.12.2(Apr 21, 2021)

  • v0.10.0(Nov 5, 2019)

    Highlights

    • Continue RNN optimization. We support both LSTM and GRU integration with MKL-DNN, which achieves ~3x performance speedup

    • ONNX support. We support loading third party framework models via ONNX

    • Richer data preprocessing support and segmentation inference pipeline support

    Details

    • [New Feature] Full MaskRCNN model support with data processing
    • [New Feature] Support variable-size Resize
    • [New Feature] Support batch input for region proposal
    • [New Feature] Support samples of different size in one minibatch
    • [New Feature] MAP validation method implementation
    • [New Feature] ROILabel enhancement to support both object detection and segmentation
    • [New Feature] Grey image support for segmentation
    • [New Feature] Add TopBlocks support for Feature Pyramid Networks (FPN)
    • [New Feature] GRU integration with MKL-DNN support
    • [New Feature] MaskHead support for MaskRCNN
    • [New Feature] BoxHead support for MaskRCNN
    • [New Feature] RegionalProposal support for MaskRCNN
    • [New Feature] Shape operation support for ONNX
    • [New Feature] Gemm operation support for ONNX
    • [New Feature] Gather operation support for ONNX
    • [New Feature] AveragePool operation support for ONNX
    • [New Feature] BatchNormalization operation support for ONNX
    • [New Feature] Concat operation support for ONNX
    • [New Feature] Conv operation support for ONNX
    • [New Feature] MaxPool operation support for ONNX
    • [New Feature] Reshape operation support for ONNX
    • [New Feature] Relu operation support for ONNX
    • [New Feature] SoftMax operation support for ONNX
    • [New Feature] Sum operation support for ONNX
    • [New Feature] Squeeze operation support for ONNX
    • [New Feature] Const operation support for ONNX
    • [New Feature] ONNX model loader implementation
    • [New Feature] RoiAlign layer support
    • [Enhancement] Align batch normalization layer between mklblas and mkl-dnn
    • [Enhancement] Python API enhancement to support nested list input
    • [Enhancement] Multi-model training/inference support with MKL-DNN
    • [Enhancement] BatchNormalization fusion with Scale
    • [Enhancement] SoftMax companion object support no argument initialization
    • [Enhancement] Python support for training with MKL-DNN
    • [Enhancement] Docs enhancement
    • [Bug Fix] Fix model version comparison
    • [Bug Fix] Fix graph backward bug for ParallelTable
    • [Bug Fix] Fix memory leak for training with MKL-DNN
    • [Bug Fix] Fix performance degradation caused by denormal values during training
    • [Bug Fix] Fix SoftMax segmentation fault under MKL-DNN
    • [Bug Fix] Fix TimeDistributedCriterion Python API inconsistency with Scala
  • v0.9.0(Jul 22, 2019)

    Highlights

    • Continued VNNI acceleration support: we add optimizations for more CNN models, including object detection models, and enhance model scales generation support for VNNI.

    • Attention-based model support: we add a Transformer implementation for both language models and translation models.

    • RNN optimization: we support LSTM integration with MKL-DNN, which achieves a ~3x performance speedup.

    Details

    • [New Feature] Add attention layer support
    • [New Feature] Add FeedForwardNetwork layer support
    • [New Feature] Add ExpandSize layer support
    • [New Feature] Add TableOperation layer to support table calculation with different input sizes
    • [New Feature] Add LayerNormalization layer support
    • [New Feature] Add Transformer support for both language and translation models
    • [New Feature] Add beam search support in Transformer model
    • [New Feature] Add layer-wise adaptive rate scaling optim method
    • [New Feature] Add LSTM integration with MKL-DNN support
    • [New Feature] Add dilated convolution integration with MKL-DNN support
    • [New Feature] Add parameter process for LarsSGD optim method
    • [New Feature] Support Affinity binding option with mkl-dnn
    • [Enhancement] Document enhancement for configuration and build
    • [Enhancement] Reflection enhancement to get default values for constructor parameters
    • [Enhancement] Use one AllReduceParameter for multi-optim method training
    • [Enhancement] CAddTable layer enhancement to support input expansion along specific dimension
    • [Enhancement] Resnet-50 preprocessing pipeline enhancement to replace RandomCropper with CenterCropper
    • [Enhancement] Calculate model scales for arbitrary mask
    • [Enhancement] Enable global average pooling
    • [Enhancement] Check input shape and underlying MKL-DNN layout consistency
    • [Enhancement] Threadpool enhancement to throw proper exception at executor runtime
    • [Enhancement] Support mkl-dnn format conversion from ntc to tnc
    • [Bug Fix] Fix backward graph generation topology ordering issue
    • [Bug Fix] Fix MemoryData hash code calculation
    • [Bug Fix] Fix log output for BCECriterion
    • [Bug Fix] Fix setting mask for container quantization
    • [Bug Fix] Fix validation accuracy issue when multi-executor running with the same worker
    • [Bug Fix] Fix INT8 layer fusion between convolution with multi-group masks and BatchNormalization
    • [Bug Fix] Fix JoinTable scales generation issue
    • [Bug Fix] Fix CMul forward issue with special input format
    • [Bug Fix] Fix weights change issue after model fusion
    • [Bug Fix] Fix SpatialConvolution primitives initialization issue
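The layer-wise adaptive rate scaling entry above refers to a technique that scales each layer's learning rate by the ratio of its weight norm to its gradient norm. The following is a minimal sketch of the commonly published LARS rule in plain Python. It is not BigDL's LarsSGD implementation; the function name, the `trust` coefficient, and the default values are assumptions chosen for illustration.

```python
import math

def lars_local_lr(weights, grads, global_lr=0.1, trust=0.001, weight_decay=0.0):
    """Per-layer learning rate under the generic LARS rule (hypothetical helper).

    local_lr = global_lr * trust * ||w|| / (||g|| + weight_decay * ||w||)
    """
    w_norm = math.sqrt(sum(w * w for w in weights))
    g_norm = math.sqrt(sum(g * g for g in grads))
    denom = g_norm + weight_decay * w_norm
    if w_norm == 0.0 or denom == 0.0:
        # Fall back to the global rate when either norm vanishes.
        return global_lr
    return global_lr * trust * w_norm / denom

# One layer with weight norm 5 and gradient norm 1.
layer_lr = lars_local_lr([3.0, 4.0], [0.6, 0.8])
```

Layers whose weights are large relative to their gradients get a proportionally larger step, which is what makes the rule useful for large-batch training.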
  • v0.8.0(Mar 28, 2019)

    Highlights

    • Add MKL-DNN Int8 support, especially for VNNI acceleration. Low-precision inference significantly improves both latency and throughput
    • Add support for running MKL-BLAS models under MKL-DNN. We leverage MKL-DNN to speed up both training and inference for MKL-BLAS models
    • Add Spark 2.4 support. Our examples and APIs are fully compatible with Spark 2.4; we released the binary for Spark 2.4 together with those for other Spark versions

    Details

    • [New Feature] Add MKL-DNN Int8 support, especially for VNNI support
    • [New Feature] Add support for running MKL-BLAS models under MKL-DNN
    • [New Feature] Add Spark 2.4 support
    • [New Feature] Add auto fusion to speed up model inference
    • [New Feature] Memory reorder support for low precision inference
    • [New Feature] Add bytes support for DNN Tensor
    • [New Feature] Add SAME padding in MKL-DNN layers
    • [New Feature] Add combined (and/or) triggers for training completion
    • [Enhancement] Inception-V1 python training support enhancement
    • [Enhancement] Distributed Optimizer enhancement to support customized optimizer
    • [Enhancement] Add compute output shape for DNN supported layers
    • [Enhancement] New MKL-DNN computing thread pool
    • [Enhancement] Add MKL-DNN support for Predictor
    • [Enhancement] Documentation enhancement for Sparse Tensor, MKL-DNN support, etc
    • [Enhancement] Add ceil mode for AvgPooling and MaxPooling layers
    • [Enhancement] Add binary classification support for DLClassifierModel
    • [Enhancement] Improvement to support conversion between NHWC and NCHW for memory reorder
    • [Bug Fix] Fix SoftMax layer with narrowed input
    • [Bug Fix] TensorFlow loader to support checking all data types
    • [Bug Fix] Fix Add operation bug to support double type when loading TensorFlow graph
    • [Bug Fix] Fix one-step weight update missing issue in validation during training
    • [Bug Fix] Fix scala compiler security issue in 2.10 & 2.11
    • [Bug Fix] Fix model broadcast cache UUID issue
    • [Bug Fix] Fix predictor issue for batch size == 1
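The ceil mode entry for AvgPooling and MaxPooling above affects how the output size is rounded when the pooling window does not tile the input evenly. The sketch below shows the usual one-dimensional output-size formula under both rounding modes. It is a generic illustration, not BigDL code; the function name is hypothetical, and real implementations may apply an extra correction when the last window would start inside the padding.

```python
def pool_output_size(in_size, kernel, stride, pad=0, ceil_mode=False):
    """Output length of a 1-D pooling window (hypothetical helper)."""
    span = in_size + 2 * pad - kernel
    if ceil_mode:
        # Integer ceiling division: keeps a final, partially covered window.
        steps = -(-span // stride)
    else:
        # Floor division: drops a final window that does not fit completely.
        steps = span // stride
    return steps + 1

floor_size = pool_output_size(7, kernel=2, stride=2)                 # 3
ceil_size = pool_output_size(7, kernel=2, stride=2, ceil_mode=True)  # 4
```

The two modes agree whenever `in_size + 2 * pad - kernel` is a multiple of the stride.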
  • v0.7.0(Oct 13, 2018)

    Highlights

    • MKL-DNN support enhancement, which includes training optimization, more models training support and model serialization support
    • A new distributed optimizer for models powered by MKL-DNN. This optimizer can overlap training and communication during distributed training, which leads to better scalability across multiple nodes

    Details

    • [New Feature] A new optim method ParallelAdam which leverages the multi-thread capacity
    • [New Feature] Add a new validation method, HitRate, which is widely used in recommendation
    • [New Feature] Add a new validation method, NDCG, which is widely used in recommendation
    • [New Feature] Support communication priority when synchronizing parameters in distributed training
    • [New Feature] Support ModelBroadcast customization
    • [New Feature] Add a new distributed optimizer for models powered by MKL-DNN. This optimizer can overlap training and communication during distributed training, which leads to better scalability across multiple nodes
    • [API Change] Add batch size into the Python model.predict API
    • [Enhancement] Add MKL-DNN training example for LeNet
    • [Enhancement] Improve the training performance by getting rid of narrowing gradients and zero gradients for model powered by MKL-DNN
    • [Enhancement] Add training example for VGG-16 based on MKL-DNN
    • [Enhancement] Support nested table in Graph output
    • [Enhancement] Enhancement on thread pool to make it compatible with MKL-DNN engine
    • [Enhancement] MKL-DNN model serialization support
    • [Enhancement] Add VGG-16 validation example
    • [Bug Fix] Fix JoinTable throwing exception during backward if batch size is changed
    • [Bug Fix] Change Reshape to InferReShape in ReshapeLoadTF
    • [Bug Fix] Fix splitBatch issue in Predictor, where the model has multiple Graph and each Graph outputs a table
    • [Bug Fix] Fix MKL-DNN inference performance issue by not copying weights at inference
    • [Bug Fix] Fix the issue that the training will crash if there are unlabeled data
    • [Bug Fix] Fix the issue that the input is grey image while the model needs 3 channels input
    • [Bug Fix] Correct the style check job to make both input and output file format to UTF-8 format
    • [Bug Fix] Load the relevant library only if MKL-DNN engine specified
    • [Bug Fix] Shade org.tensorflow.framework to avoid conflict
    • [Bug Fix] Fix dlframes not packaged in pip issue
    • [Bug Fix] Fix LocalPredictor cannot be serialized because of nested logger variable
    • [Bug Fix] Clear Recurrent preTopology's output during cloneCells
    • [Bug Fix] Fix MM layer producing different outputs for the same input when run multiple times
    • [Bug Fix] Fix distributed predictor sending the model twice when doing mapPartition
    • [Document] Update the Kubernetes programming guide for Spark 2.3
    • [Document] Add documentation for wrapping a preprocessor and a model in one graph, and add its Python API
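The HitRate and NDCG validation methods listed above are ranking metrics for recommendation. The following sketch shows the single-positive-item formulation that is common in recommendation evaluation: one known positive item is ranked among candidate items by score. It is an illustration only; the function names are hypothetical and BigDL's own classes may treat ties and inputs differently.

```python
import math

def rank_of_positive(scores, positive_index):
    """1-based rank of the positive item when sorted by descending score."""
    target = scores[positive_index]
    return 1 + sum(1 for s in scores if s > target)

def hit_rate_at_k(scores, positive_index, k):
    """1.0 if the positive item appears in the top k, otherwise 0.0."""
    return 1.0 if rank_of_positive(scores, positive_index) <= k else 0.0

def ndcg_at_k(scores, positive_index, k):
    """Discounted gain of the single positive item; the ideal DCG is 1."""
    rank = rank_of_positive(scores, positive_index)
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

scores = [0.1, 0.9, 0.5, 0.3]   # model scores for four candidate items
hit = hit_rate_at_k(scores, positive_index=2, k=2)
ndcg = ndcg_at_k(scores, positive_index=2, k=2)
```

HitRate only records whether the positive item made the cut, while NDCG also rewards placing it closer to the top of the list.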
  • v0.6.0(Jun 29, 2018)

    Highlights

    • We integrate MKL-DNN as an alternative execution engine for CNN models. MKL-DNN provides better training/inference performance and lower memory consumption. On some CNN models, we observed a 2x throughput improvement in our experiments.
    • Support using different optimization methods to optimize different parts of the model. This is necessary when training some models.
    • Spark 2.3 support. We have tested our code and examples on Spark 2.3. We release the binary for Spark 2.3, and Spark 1.5 is no longer supported.

    Details

    • [New Feature] MKL-DNN integration. We integrate MKL-DNN as an alternative execution engine for CNN models. It speeds up layers such as AvgPooling, MaxPooling, CAddTable, LRN, JoinTable, Linear, ReLU, SpatialConvolution, SpatialBatchnormalization, Softmax. MKL-DNN provides better training/inference performance and lower memory consumption.
    • [New Feature] Layer fusion. Support layer fusion on conv + relu, batchnorm + relu, conv + batchnorm and conv + sum (some of the fusions can only be applied in inference). Layer fusion provides better performance, especially on inference. Currently layer fusion is only available for MKL-DNN related layers.
    • [New Feature] Multiple optimization method support in optimizer. Support using different optimization methods to optimize different parts of the model.
    • [New Feature] Add a new optimization method Ftrl, which is often used in recommendation model training.
    • [New Feature] Add a new example: Training Resnet50 on ImageNet dataset.
    • [New Feature] Add new OpenCV based image preprocessing transformer ChannelScaledNormalizer.
    • [New Feature] Add new OpenCV based image preprocessing transformer RandomAlterAspect.
    • [New Feature] Add new OpenCV based image preprocessing transformer RandomCropper.
    • [New Feature] Add new OpenCV based image preprocessing transformer RandomResize.
    • [New Feature] Support loading Tensorflow Max operation.
    • [New Feature] Allow user to specify input port when loading Tensorflow model. If the input operation accepts multiple tensors as input, user can specify which to feed data to instead of feeding all tensors.
    • [New Feature] Support loading Tensorflow Gather operation.
    • [New Feature] Add random split for ImageFrame
    • [New Feature] Add setLabel and getURI API into ImageFrame
    • [API Change] Add batch size into the Python model.predict API.
    • [API Change] Add generateBackward into load Tensorflow model API, which allows user to choose whether to generate the backward path when loading a Tensorflow model.
    • [API Change] Add feature() and label() to the Sample.
    • [API Change] Deprecate the DLClassifier/DLEstimator in org.apache.spark.ml. Prefer using DLClassifier/DLEstimator under com.intel.analytics.bigdl.dlframes.
    • [Enhancement] Refine StridedSlice. Support begin/end/shrinkAxis mask just like Tensorflow.
    • [Enhancement] Add layer sync to SpatialBatchNormalization. SpatialBatchNormalization can calculate mean/std on a larger batch size. A model with a SpatialBatchNormalization layer can converge to a better accuracy even if the local batch size is small.
    • [Enhancement] Code refactor in DistriOptimizer for advanced parameter operations, e.g. global gradient clipping.
    • [Enhancement] Add more models into the LoadModel example.
    • [Enhancement] Share Const values when broadcasting the model. The Const value will not be changed, so we can share it when using multiple models for inference on the same node, which reduces memory usage.
    • [Enhancement] Refine the getTime and time counting implementation.
    • [Enhancement] Support group serializer so that layers of the same hierarchy could share the same serializer.
    • [Enhancement] Dockerfile use Python 2.7.
    • [Bug Fix] Fix memory leak problem when using quantized model in predictor.
    • [Bug Fix] Fix PY4J Java gateway not compatible in Spark local mode for Spark 2.3.
    • [Bug Fix] Fix a bug in python inception example.
    • [Bug Fix] Fix a bug when run Tensorflow model using loop.
    • [Bug Fix] Fix a bug in the Squeeze layer.
    • [Bug Fix] Fix python API for random split.
    • [Bug Fix] Using parameters() instead of getParameterTable() to get weight and bias in serialization.
    • [Document] Fix incorrectness in Quantized model document.
    • [Document] Fix incorrect instructions for generating Sequence files for the ImageNet 2012 dataset in the document.
    • [Document] Move bigdl-core build document into a separate page and refine the format.
    • [Document] Fix incorrect command in Tensorflow load and transfer learning examples.
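The conv + batchnorm layer fusion listed above relies on the fact that, at inference time, batch normalization is an affine transform that can be folded into the preceding layer's weight and bias. The sketch below shows that folding for a single scalar channel. It is a generic illustration of the arithmetic, not BigDL's MKL-DNN fusion code; the function name and the default `eps` are assumptions.

```python
import math

def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time batch norm into a preceding affine layer.

    Returns (fused_weight, fused_bias) such that
    fused_weight * x + fused_bias == BN(weight * x + bias).
    """
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta

# Compare the fused layer with the two-step computation on one input.
w, b = 0.5, 0.1
gamma, beta, mean, var = 2.0, -1.0, 0.3, 4.0
x = 2.0
unfused = gamma * ((w * x + b) - mean) / math.sqrt(var + 1e-5) + beta
fused_w, fused_b = fold_batchnorm(w, b, gamma, beta, mean, var)
fused = fused_w * x + fused_b
```

Because the fused layer does one multiply-add instead of two passes over the data, this kind of fusion mainly helps inference, where the batch norm statistics are fixed.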
  • v0.5.0(Mar 30, 2018)

    Highlights

    • Bring in a Keras-like API (Scala and Python). Users can easily run their Keras code (training and inference) on Apache Spark through BigDL. For more details, see this link.
    • Support loading Tensorflow dynamic models (e.g. LSTM, RNN) in BigDL and support more Tensorflow operations; see this page.
    • Support combining data preprocessing and neural network layers in the same model (to make model deployment easy)
    • Speedup various modules in BigDL (BCECriterion, rmsprop, LeakyRelu, etc.)
    • Add DataFrame-based image reader and transformer

    New Features

    • Tensor can be converted to OpenCVMat
    • Bring in a new Keras-like API for scala and python
    • Support loading Tensorflow dynamic models (e.g. LSTM, RNN)
    • Support loading more Tensorflow operations (InvertPermutation, ConcatOffset, Exit, NextIteration, Enter, RefEnter, LoopCond, ControlTrigger, TensorArrayV3, TensorArrayGradV3, TensorArrayGatherV3, TensorArrayScatterV3, TensorArrayConcatV3, TensorArraySplitV3, TensorArrayReadV3, TensorArrayWriteV3, TensorArraySizeV3, StackPopV2, StackPop, StackPushV2, StackPush, StackV2, Stack)
    • ResizeBilinear supports NCHW
    • ImageFrame supports loading Hadoop sequence files
    • ImageFrame supports gray images
    • Add Kv2Tensor Operation(Scala)
    • Add PGCriterion to compute the negative policy gradient given action distribution, sampled action and reward
    • Support gradually increasing the learning rate in LearningrateScheduler
    • Add FixExpand and add more options to AspectScale for image preprocessing
    • Add RowTransformer(Scala)
    • Support adding preprocessors to Graph, which allows users to combine preprocessing and a trainable model into one model
    • Resnet on cifar-10 example supports loading images from HDFS
    • Add CategoricalColHashBucket operation(Scala)
    • Predictor support Table as output
    • Add BucketizedCol operation(Scala)
    • Support using DenseTensor and SparseTensor together to create Sample
    • Add CrossProduct Layer (Scala)
    • Provide an option to allow users to bypass the exception in transformer
    • DenseToSparse layer supports disabling backward propagation
    • Add CategoricalColVocaList Operation(Scala)
    • Support ImageFrame in the Python optimizer
    • Support getting the executor number and executor cores in Python
    • Add IndicatorCol Operation(Scala)
    • Add TensorOp, which is an operation with Tensor[T]-formatted input and output, and provides shortcuts to build Operations for tensor transformation by closures. (Scala)
    • Provide a docker file to make it easy to set up a testing environment for BigDL
    • Add CrossCol Operation(Scala)
    • Add MkString Operation(Scala)
    • Add a prediction service interface for concurrent calls and accept bytes input
    • Add SparseTensor.cast & SparseTensor.applyFun
    • Add DataFrame-based image reader and transformer
    • Support loading Tensorflow model files saved by the tf.saved_model API
    • SparseMiniBatch supporting multiple TensorDataTypes

    Enhancement

    • ImageFrame support serialization
    • A default implementation of zeroGradParameter is added to AbstractModule
    • Improve the style of the document website
    • Models in different threads share weights in model training
    • Speed up leaky relu
    • Speed up Rmsprop
    • Speed up BCECriterion
    • Support Calling Java Function in Python Executor and ModelBroadcast in Python
    • Add detail instructions to run-on-ec2
    • Optimize padding mechanism
    • Fix maven compiling warnings
    • Check duplicate layers in the container
    • Refine the document that introduces how to automatically deploy BigDL on a Dataproc cluster
    • Refactor adding extra jars/python packages for Python users. Users now only need to set the environment variables BIGDL_JARS and BIGDL_PACKAGES
    • Implement appendColumn and avoid the error caused by API mismatch between different Spark versions
    • Add python inception training on ImageNet example
    • Update "can't find locality partition for partition ..." to warning message

    API change

    • Move DataFrame-based API to dlframe package
    • Refine the Container hierarchy. The add method(used in Sequential, Concat…) is moved to a subclass DynamicContainer
    • Refine the serialization code hierarchy
    • Dynamic Graph is now an internal class that is only used to run Tensorflow models
    • Operation is not allowed to be used outside Graph
    • The getParamter method is now final and private[bigdl]; it should only be used in model training
    • Remove the updateParameter method, which was only used in internal tests
    • Some Tensorflow related operations are marked as internal; they should only be used when running Tensorflow models

    Bug Fix

    • Fix Sparse sample batch bug. It should add another dimension instead of concatenating the original tensor
    • Fix some activations or layers not working in TimeDistributed and RnnCell
    • Fix a bug in SparseTensor resize method
    • Fix a bug when converting SparseTensor to DenseTensor
    • Fix a bug in SpatialFullConvolution
    • Fix a bug in Cosine equal method
    • Fix optimization state being corrupted when optimizer.optimize() is called multiple times
    • Fix a bug in Recurrent forward after invoking reset
    • Fix a bug in inplace leakyrelu
    • Fix a bug when saving/loading bi-rnn layers
    • Fix getParameters() in a submodule creating new storage when parameters have been shared by the parent module
    • Fix some incompatible syntax between python 2.7 and 3.6
    • Fix loss of stop gradient information when saving/loading a graph
    • Fix a bug in SReLU
    • Fix a bug in DLModel
    • Fix sparse tensor dot product bug
    • Fix Maxout serialization issue
    • Fix some serialization issues in some customized Faster R-CNN models
    • Fix and refine some example document instructions
    • Fix a bug in export_tf_checkpoint.py script
    • Fix a bug in setting up the Python package
    • Fix picklers initialization issues
    • Fix some race condition issues in Spark 1.6 when broadcasting the model
    • Fix wrong return type of Model.load in Python
    • Fix a bug when using pyspark-with-bigdl.sh to run jobs on Yarn
    • Fix calling size and stride on an empty tensor not throwing a null exception
  • v0.4.0(Jan 4, 2018)

    Highlights

    • Support all Keras layers and Keras 1.2.2 model loading. See keras-support for details
    • Python 3.6 support
    • OpenCV support, and add a dozen image transformers based on OpenCV
    • More layers/operations

    New Features

    • Models & Layers & Operations & Loss function
      • Add layers for Keras: Cropping2D, Cropping3D, UpSampling1D, UpSampling2D, UpSampling3D, masking, Maxout, HighWay, GaussianDropout, GaussianNoise, CAveTable, VolumetricAveragePooling, HardSigmoid, SReLU, LocallyConnected1D, LocallyConnected2D, SpatialSeparableConvolution, ActivityRegularization, SpatialDropout1D, SpatialDropout2D, SpatialDropout3D
      • Add Criterion for keras: PoissonCriterion, KullbackLeiblerDivergenceCriterion, MeanAbsolutePercentageCriterion, MeanSquaredLogarithmicCriterion, CosineProximityCriterion
      • Support NHWC for LRN and BatchNormalization
      • Add LookupTableSparse (lookup table for multivalue)
      • Add activation argument for recurrent layers
      • Add MultiRNNCell
      • Add SpatialSeparableConvolution
      • Add MSRA filler
      • Support SAME padding in 3d conv and allow users to configure padding size in convlstm and convlstm3d
      • TF operations: SegmentSum, conv3d related operations, Dilation2D, Dilation2DBackpropFilter, Dilation2DBackpropInput, Digamma, Erf, Erfc, Lgamma, TanhGrad, depthwise, Rint, All, Any, Range, Exp, Expm1, Round, FloorDiv, TruncateDiv, Mod, FloorMod, TruncateMod, InTopK, Round, Maximum, Minimum, BatchMatMul, Sqrt, SqrtGrad, Square, RsqrtGrad, AvgPool, AvgPoolGrad, BiasAddV1, SigmoidGrad, Relu6, Relu6Grad, Elu, EluGrad, Softplus, SoftplusGrad, LogSoftmax, Softsign, SoftsignGrad, Abs, LessEqual, GreaterEqual, ApproximateEqual, Log, LogGrad, Log1p, Log1pGrad, SquaredDifference, Div, Ceil, Inv, InvGrad, IsFinite, IsInf, IsNan, Sign, TopK (see details at tensorflow_ops_list)
      • Add object detection related layers: PriorBox, NormalizeScale, Proposal, DetectionOutputSSD, DetectionOutputFrcnn, Anchor
    • Transformer
      • Add image Transformer based on OpenCV: Resize, Brightness, ChannelOrder, Contrast, Saturation, Hue, ChannelNormalize, PixelNormalize, RandomCrop, CenterCrop, FixedCrop, DetectionCrop, Expand, Filler, ColorJitter, RandomSampler, MatToFloats, AspectScale, RandomAspectScale, BytesToMat
      • Add Transformer: RandomTransformer, RoiProject, RoiHFlip, RoiResize, RoiNormalize
    • API change
      • Add predictImage function in LocalPredictor
      • Add partition number option for ImageFrame read
      • Add an API to get node from graph model with given name
      • Support List of JTensors for label in Python API
      • Expose local optimizer and predictor in Python API
    • Install & Deploy
    • Model Save/Load
      • Support large models (parameters exceeding 2.1G) for both Java and protobuf formats
      • Support keras model loading
    • Training
      • Allow user to set new train data or new criterion for optimizer reusing
      • Support gradient clipping (constant clip and clip by L2-norm)
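The two gradient clipping modes listed above (constant clip and clip by L2-norm) behave differently: the first clamps every gradient element into a fixed range, while the second rescales the whole gradient vector so that its norm does not exceed a threshold. A minimal plain-Python sketch of both follows. The function names are hypothetical and this is not BigDL's optimizer API.

```python
import math

def clip_constant(grads, min_value, max_value):
    """Clamp each gradient element into [min_value, max_value]."""
    return [min(max(g, min_value), max_value) for g in grads]

def clip_by_l2_norm(grads, max_norm):
    """Rescale the gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)          # already within the limit
    scale = max_norm / norm
    return [g * scale for g in grads]

clamped = clip_constant([-3.0, 0.5, 4.0], -1.0, 1.0)
rescaled = clip_by_l2_norm([3.0, 4.0], 1.0)
```

Constant clipping can change the direction of the gradient, whereas norm clipping preserves the direction and only shortens the step.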

    Enhancement

    • Speed up BatchNormalization.
    • Speed up MSECriterion
    • Speed up Adam
    • Speed up static graph execution
    • Support reading TFRecord files from HDFS
    • Support reading raw binary files from HDFS
    • Check input size in concat layer
    • Add proper exception handling for CaffeLoader&Persister
    • Add serialization support for multiple tensor numeric
    • Add an Activity wrapper for Python to simplify the returning value
    • Override joda-time in hadoop-aws to reduce compile time
    • LocalOptimizer: use a ModelBroadcast-like method to clone the module
    • Time counting for paralleltable's forward/backward
    • Use shade to package jar-with-dependencies to manage some package conflict
    • Support loading bigdl_conf_file in multiple python zip files

    Bug Fix

    • Fix getModel failure in DistriOptimizer when model parameters exceed 2.1G
    • Fix core number being 0 when there is only one core in the system
    • Fix SparseJoinTable throwing an exception if the input’s nElement changed
    • Fix some issues found when saving a BigDL model to a Tensorflow format file
    • Fix return object type error of DLClassifier.transform in Python
    • Fix graph generateBackward being lost in serialization
    • Fix resizing a tensor to an empty tensor not working properly
    • Fix Adapter layer not supporting different batch sizes at runtime
    • Fix Adapter layer not being directly serializable
    • Fix calling the wrong function when setting user-defined MKL threads
    • Fix SmoothL1Criterion and SoftmaxWithCriterion not handling the input’s offset
    • Fix L1Regularization throwing NullPointerException while broadcasting the model
    • Fix CMul layer crash under certain configurations
  • v0.3.0(Nov 8, 2017)

    Highlights

    • New protobuf-based model storage format
    • Support model quantization
    • Support sparse tensor and model
    • Easier and broader Tensorflow model load support
    • More layers/operations
    • Apache Spark 2.2 support

    New Features

    • Models & Layers & Operations & Loss function
      • Support convlstm3D model
      • Support Variational Auto Encoder
      • Support Unet
      • Support PTB model
      • Add SpatialWithinChannelLRN layer
      • Add 3D-deconv layer
      • Add BifurcateSplitTable layer
      • Add KLD criterion
      • Add Gaussian layer
      • Add Sampler layer
      • Add RNN decoder layer
      • Support NHWC data format in 2D-conv, 2D-pooling layers
      • Support same/valid padding type in 2D-conv and 2D-pooling layers
      • Support dynamic execution flow in Graph
      • Graph node can pass nested tensors
      • Layer/Operation can support different input and output numeric tensor
      • Start to support operations in BigDL, add following operations: LogicalNot, LogicalOr, LogicalAnd, 1D Max Pooling, Squeeze, Prod, Sum, Reshape, Identity, ReLU, Equals, Greater, Less, Switch, Merge, Floor, L2Loss, RandomUniform, Rank, MatMul, SoftMax, Conv2d, Add, Assert, Onehot, Assign, Cast, ExpandDims, MaxPool, Realdiv, BiasAdd, Pad, Tile, StridedSlice, Transpose, Negative, AssignGrad, BiasAddGrad, Deconv2D, Conv2DBackFilter, CrossEntropy, MaxPoolGrad, NoOp, RandomUniform, ReluGrad, Select, Sum, Pow, BroadcastGradientArgs, Control Dependency
      • Start to support sparse layers in BigDL, add following sparse layers: SparseLinear, SparseJoinTable, DenseToSparse
    • Tensor
      • Support sparse tensor
      • Support scalar (0-D tensor)
      • Tensor support more numeric type: boolean, short, int, long, string, char, bytestring
      • Tensor no longer displays its full content in toString when there are too many elements
    • API change
      • Expose evaluate API to python
      • Add a predictClass API to model to simplify the code when users want to use the model for classification
      • Change model.test to model.evaluate in Python
      • Refine Recurrent, BiRecurrent and RnnCell API
      • Sample.features from ndarray to JTensor/List[JTensor]
      • Sample.label from ndarray to JTensor
    • Install & Deploy
      • Support Apache Spark 2.2
      • Add script to run BigDL on Google DataProc platform
      • Refine run-example.sh scripts to run bigdl examples on AWS with built-in Spark
      • Pip install will now auto install spark-2.2
      • Add a docker file
    • Model Save/Load
      • New protobuf-based model persistence format to provide a better user experience when saving/loading BigDL models
      • Support loading more operations from Tensorflow
      • Support reading tensor content from Tensorflow checkpoints
      • Support loading a subset of a Tensorflow graph
      • Support loading Tensorflow preprocessing graphs (read/parse tfrecord data, image decoders and queues)
      • Automatically convert data in a Tensorflow queue to an RDD and feed it to model training in BigDL
      • Support loading deconv layers from Caffe and Tensorflow
      • Support save/load SpatialCrossLRN torch module
    • Training
      • Allow user to modify the optimization algorithm status when resuming the training in Python
      • Allow user to specify optimization algorithms, learning rate and learning rate decay when using BigDL in a Spark ML pipeline
      • Allow user to stop gradient on some layers in backpropagation
      • Allow user to freeze layer parameters in training
      • Add ML pipeline python API, user can use BigDL with ML pipeline in python code

    Enhancement

    1. Support model quantization. User can speed up model inference by quantizing the model
    2. Display bigdl model in Tensorboard
    3. User can easily convert a sequential model to a graph model by invoking the newly added toGraph method
    4. Remove unnecessary contiguous check in 3D conv
    5. Support global average pooling
    6. Support regularizer in 3D convolution layer
    7. Add regularizer for convlstmpeephole3d
    8. Throw more meaningful messages in layers and criterions
    9. Migrate GRU/LSTM/RNN/LSTM-Peephole definition from sequence to graph
    10. Switch to pytest for python unit tests
    11. Speed up tanh layer
    12. Speed up sigmoid layer
    13. Speed up recurrent layer
    14. Support batch normalization in recurrent
    15. Speed up Python ndarray to Scala tensor conversion
    16. Improve gradient sync performance in distributed training
    17. Speedup tensor dot operation with mkl dot
    18. Speedup copy operation in recurrent container
    19. Speedup logsoftmax
    20. Move classes.lst and img_class.lst to the model example folder, so users can find them more easily.
    21. Ensure spark.speculation is set to false to get a better performance in training
    22. Easier to turn on performance data in distributed training log
    23. Optimize memory usage when broadcasting the model
    24. Support mllib vector as feature for BigDL
    25. Support create multiple tensors Sample in python
    26. Support resizing in BytesToBGRImg

    Bug Fix

    1. Fix TemporalConv layer cannot return parameter table
    2. Fix some bugs when loading dilated group convolution from caffe
    3. Fix some bugs when loading caffe v1 layers
    4. Fix a bug in TimeDistributed layer
    5. Fix incorrect execution time reported in recurrent layers
    6. Fix inplace layer clear state bug
    7. Fix incorrect training data sample count under some input
    8. Remove label check in BytesToGreyImg
    9. Fix a bug in concat table when it contains no layer
    10. Fix a bug in maptable
    11. Fix some typos in document
    12. Use newInstance method to obtain FileSystem
  • v0.2.0(Jul 24, 2017)

    New feature

    • A new BigDL documentation website is online at https://bigdl-project.github.io/, which replaces the original BigDL wiki
    • Added New Models & Layers
      • TreeLSTM and examples for sentiment analytics
      • convLSTM layer
      • 1D convolution layer
      • Mean Absolute Error (MAE) metrics
      • TimeDistributed Layer
      • VolumetricConvolution(3D convolution)
      • VolumetricMaxPooling
      • RoiPooling layer
      • DiceCoefficient loss
      • bi-recurrent layers
    • API change
      • Allow user to set regularization per layer
      • Allow user to set learning rate per layer
      • Add predictClass API for python
      • Add DLEstimator for Spark ML pipeline
      • Add Functional API for model definition
      • Add MovieLens dataset API
      • Add 4d normalize support
      • Add evaluator API to simplify model test
    • Install & Deploy
      • Allow user to install BigDL from pip
      • Support win64 platform
      • A new script to auto pack/distribute python dependency on yarn cluster mode
    • Model Save/Load
      • Allow user to save BigDL model as Caffe model file
      • Allow user to load/save some Tensorflow models (covering Tensorflow slim APIs)
      • Support save/load model file from/to s3/hdfs
    • Optimization
      • Add plateau learning rate schedule
      • Allow user to adjust optimization process based on loss and score
      • Add Exponential learning rate decay
      • Add natural exp decay learning rate schedule
      • Add multistep learning rate policy
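The multistep, exponential, and natural exp decay schedules listed above each map an iteration count to a learning rate. The sketch below shows one common formulation of each in plain Python. The function names, parameter names, and exact formulas are assumptions based on widely used conventions; BigDL's own schedule classes may parameterize them differently.

```python
import math

def multistep_lr(base_lr, iteration, milestones, gamma=0.1):
    """Multiply the rate by gamma once for every milestone already reached."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** passed

def exponential_lr(base_lr, iteration, decay_step, decay_rate):
    """Smooth exponential decay: the rate shrinks by decay_rate every decay_step."""
    return base_lr * decay_rate ** (iteration / decay_step)

def natural_exp_lr(base_lr, iteration, decay_step, gamma):
    """Staircase decay by exp(-gamma) after each full decay_step (assumed form)."""
    return base_lr * math.exp(-gamma * (iteration // decay_step))

# Learning rate at iteration 25 under each schedule.
rates = (
    multistep_lr(0.1, 25, milestones=[10, 20]),
    exponential_lr(0.1, 25, decay_step=10, decay_rate=0.5),
    natural_exp_lr(0.1, 25, decay_step=10, gamma=0.1),
)
```

Multistep schedules suit training recipes with hand-picked drop points, while the two exponential variants decay continuously or in regular steps without needing milestones.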

    Enhancement

    1. Optimization method API refactor
    2. Allow users to load a Caffe model without pre-defining a BigDL model
    3. Optimize recurrent layer performance
    4. Refine the ML pipeline-related APIs and add more examples
    5. Optimize JoinTable layer performance
    6. Allow users to use the nio blockmanager on Spark 1.5
    7. Refine the layer parameter initialization algorithm API
    8. Refine the Sample class to reduce memory usage when caching train/test datasets in tensor format
    9. Refine the MiniBatch API to support padding and multiple tensors
    10. Remove bigdl.sh. BigDL now sets MKL behavior through the MKL Java API, and users can control this via Java properties
    11. Allow users to exclude Spark logs from the redirected log file
    12. Allow users to create a SpatialConvolution layer without bias
    13. Refine the validation metrics API
    14. Refine SmoothL1Criterion and reduce tensor storage usage
    15. Use reflection to handle differences between Spark 2 platforms, so users need not recompile BigDL for each Spark 2 platform
    16. Optimize FlattenTable performance
    17. Use maven package instead of script to copy dist artifacts together

    Bug Fix

    1. Fix some errors in the text classifier document
    2. Fix a bug when calling JoinTable after clearState()
    3. Fix a bug in the Concat layer when the dimension concatenated along is larger than 2
    4. Fix a bug in the MapTable layer
    5. Fix an issue where some multi-thread errors were not caught
    6. Fix a Maven artifact dependency issue
    7. Fix an issue where the model save method did not close the stream
    8. Fix a bug in BCECriterion
    9. Fix an issue where ConcatTable did not clear the gradInput buffer in some cases
    10. Fix an issue where SpatialDilatedConvolution did not clear the gradInput content
  • v0.1.1 (Jun 16, 2017)

    Release Notes

    • API Change
    1. Use bigdl as the top-level package name for all BigDL Python modules
    2. Allow users to change the model in the optimizer
    3. Allow users to define a model with the Python API
    4. Allow users to invoke BigDL Scala code from Python in third-party projects
    5. Allow users to use the BigDL random generator in Python
    6. Allow users to use the forward/backward methods in Python
    7. Add the BiRnn layer to Python
    8. Remove the unused CriterionTable layer
    • Enhancement
    1. Load libjmkl.so in the class-loading phase
    2. Support Python 3.5
    3. Initialize the gradient buffer at the start of backward to reduce memory usage
    4. Automatically pack Python dependencies in YARN cluster mode
    • Bug Fix
    1. Fix the optimizer continuing instead of failing after reaching the maximum number of retries
    2. Fix the LookupTable Python API throwing a noSuchMethod error
    3. Fix an addmv bug for 1x1 matrices
    4. Fix an error in the LeNet Python example
    5. Fix an encoding issue when loading text files in Python
    6. Fix a HardTanh performance issue
    7. Fix uneven data distribution in the VGG example when the input partition is too large
    8. Fix a bug in SpatialDilatedConvolution
    9. Fix a bug in the BCECriterion loss function
    10. Fix a bug in the Add layer
    11. Fix a runtime error when running BigDL on PySpark 1.5