🎛 Distributed machine learning made simple.

Overview

🎛 lazycluster

Use your preferred distributed ML framework like a lazy engineer.


lazycluster is a Python library intended to liberate data scientists and machine learning engineers by abstracting away cluster management and configuration so that they can focus on their actual tasks. In particular, it emphasizes easy and convenient cluster setup with Python for various distributed machine learning frameworks.

Highlights

  • High-Level API for starting clusters:
    • DASK
    • Hyperopt
    • More lazyclusters (e.g. Ray, PyTorch, Tensorflow, Horovod, Spark) to come ...
  • Lower-level API for:
    • Managing Runtimes or RuntimeGroups to:
      • A-/synchronously execute RuntimeTasks by leveraging the power of ssh
      • Expose services (e.g. a DB) from or to a Runtime or in a whole RuntimeGroup
  • Command line interface (CLI)
    • List all available Runtimes
    • Add a Runtime configuration
    • Delete a Runtime configuration

API layer

Concept Definition: Runtime

A Runtime is the logical representation of a remote host. Typically, the host is another server or a virtual machine / container on another server. This Python class provides several methods for utilizing remote resources, such as port exposure from / to a Runtime and the execution of RuntimeTasks. A Runtime has a working directory. Usually, a RuntimeTask is executed relative to this directory if no other path is explicitly given. The working directory can be set manually during initialization. Otherwise, a temporary directory is created that might eventually be removed.

Concept Definition: RuntimeGroup

A RuntimeGroup is the representation of logically related Runtimes and provides convenient methods for managing them. Most methods are wrappers around their counterparts in the Runtime class. Typical usage examples are exposing a port (i.e. a service such as a DB) in the RuntimeGroup, transferring files, or executing a RuntimeTask on the Runtimes. Additionally, all concrete RuntimeCluster implementations (e.g. the HyperoptCluster) rely on RuntimeGroups.

Concept Definition: Manager

The manager refers to the host where you are actually using the lazycluster library, since all desired lazycluster entities are managed from here. Caution: It is not to be confused with the RuntimeManager class.

Concept Definition: RuntimeTask

A RuntimeTask is a composition of multiple elementary task steps, namely send file, get file, run command (shell), and run function (Python). A RuntimeTask can be executed on a remote host either by handing it over to a Runtime object or standalone by handing over a fabric Connection object to the execute method of the RuntimeTask. In both cases, all individual task steps are executed sequentially. Moreover, a RuntimeTask object captures the output (stdout/stderr) of the remote execution in its execution log. An example of a RuntimeTask could be to send a CSV file to a Runtime, execute a Python function that transforms the CSV file, and finally get the file back.
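
To make the step model more concrete, here is a minimal, self-contained sketch of a task that chains steps, runs them sequentially, and collects one log entry per step. `MiniTask` is a hypothetical stand-in written for illustration only; it runs everything locally and is not how lazycluster implements RuntimeTask.

```python
import subprocess
from typing import Any, Callable, List


class MiniTask:
    """Hypothetical local analogue of a multi-step task (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._steps: List[Callable[[], str]] = []
        self.execution_log: List[str] = []     # one entry per executed step
        self.function_returns: List[Any] = []  # return values of function steps

    def run_command(self, command: str) -> "MiniTask":
        def step() -> str:
            done = subprocess.run(command, shell=True, capture_output=True, text=True)
            return done.stdout + done.stderr
        self._steps.append(step)
        return self  # returning self enables method chaining

    def run_function(self, func: Callable[..., Any], **kwargs: Any) -> "MiniTask":
        def step() -> str:
            self.function_returns.append(func(**kwargs))
            return ""  # a function step produces no shell output here
        self._steps.append(step)
        return self

    def execute(self) -> None:
        # Steps run strictly sequentially, in the order they were added.
        for step in self._steps:
            self.execution_log.append(step())


def hello(name: str) -> str:
    return "Hello " + name + "!"


task = MiniTask("demo").run_command("echo Hello World").run_function(hello, name="World")
task.execute()
print(task.execution_log[0].strip())
print(task.function_returns[0])
```

Returning `self` from each step method is what allows the fluent, chained composition style that the usage examples in this README rely on.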



Getting started

Installation

pip install lazycluster
# Most up-to-date development version
pip install --upgrade git+https://github.com/ml-tooling/lazycluster.git@develop

Prerequisites

For lazycluster usage on the manager:

  • Unix based OS

  • Python >= 3.6

  • ssh client (e.g. openssh-client)

  • Passwordless ssh access to the Runtime hosts (recommended)

    Configure passwordless ssh access:
    • Create a key pair on the manager as described here or use an existing one
    • Install lazycluster on the manager
    • Create the ssh configuration for each host to be used as Runtime by using the lazycluster CLI command lazycluster add-runtime as described here and do not forget to specify the --id_file argument.
    • Finally, enable the passwordless ssh access by copying the public key to each Runtime as described here

Runtime host requirements:

  • Unix based OS
  • Python >= 3.6
  • ssh server (e.g. openssh-server)

Note:

Passwordless ssh needs to be set up for the hosts to be used as Runtimes for the most convenient user experience. Otherwise, you need to pass the connection details to Runtime.__init__ via connection_kwargs. These parameters will be passed on to the fabric.Connection.

Usage example high-level API

Start a Dask cluster.

from lazycluster import RuntimeManager
from lazycluster.cluster.dask_cluster import DaskCluster

# Automatically generate a group based on the ssh configuration
runtime_manager = RuntimeManager()
runtime_group = runtime_manager.create_group()

# Start the Dask cluster instances using the RuntimeGroup
dask_cluster = DaskCluster(runtime_group)
dask_cluster.start()

# => Now, you can start using the running Dask cluster

# Get Dask client to interact with the cluster
# Note: This will give you a dask.distributed.Client which is not
#       a lazycluster cluster but a Dask one instead
client = dask_cluster.get_client()

Usage example lower-level API

Execute a Python function on a remote host and access the return data.

from lazycluster import RuntimeTask, Runtime

# Define a Python function which will be executed remotely
def hello(name:str):
    return 'Hello ' + name + '!'

# Compose a `RuntimeTask`
task = RuntimeTask('my-first_task').run_command('echo Hello World!') \
                                   .run_function(hello, name='World')

# Actually execute it remotely in a `Runtime`
task = Runtime('host-1').execute_task(task, execute_async=False)

# The stdout from the executing `Runtime` can be accessed
# via the execution log of the `RuntimeTask`
task.print_log()

# Print the return of the `hello()` call
generator = task.function_returns
print(next(generator))

Support

The lazycluster project is maintained by Jan Kalkan. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

Type Channel
🚨 Bug Reports
🎁 Feature Requests
👩‍💻 Usage Questions
🗯 General Discussion

Features

Use the Command Line Interface (CLI) to manage local ssh configuration to enable Runtime usage


For a full list of CLI commands please use lazycluster --help. For the help of a specific command please use lazycluster COMMAND --help.

List all available runtimes incl. additional information like cpu, memory, etc.

Moreover, inactive hosts will be shown as well. Inactive means that the host could not be reached via ssh and instantiated as a valid Runtime.

# Will print a short list of active / inactive Runtimes
lazycluster list-runtimes

List Runtimes

# Will print a list of active / inactive Runtimes incl. additional host information
# Note: This is slower than omitting the -l option
lazycluster list-runtimes -l

List Runtimes in long format

Add host to ssh config

The host is named localhost for user root accessible on localhost port 22 using the private key file found under ~/.ssh/id_rsa.

Note: The add command will only add the ssh configuration on the manager. For a complete guide on how to set up passwordless ssh, check the prerequisites section.

lazycluster add-runtime localhost root@localhost:22 --id_file ~/.ssh/id_rsa

Runtime Added

Delete the ssh config of Runtime

Note: Corresponding remote ikernel will be deleted too if present.

lazycluster delete-runtime host-1

Runtime Deleted

Create Runtimes & RuntimeGroups

A Runtime has a working directory. Usually, a RuntimeTask is executed relative to this directory if no other path is explicitly given. The working directory can be set manually during initialization. Otherwise, a temporary directory is created that might eventually be removed.

from lazycluster import Runtime, RuntimeGroup

rt_1 = Runtime('host-1')
rt_2 = Runtime('host-2', working_dir='/workspace')

# In this case you get a group where both Runtimes have different working directories.
# The working directory on host-1 will be a temp one and gets removed eventually.
runtime_group = RuntimeGroup([rt_1, rt_2])

# Here, the group internally creates Runtimes for both hosts and sets its working directory.
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2'], working_dir='/workspace')

Moreover, you can set environment variables for the Runtimes. These variables can then be accessed when executing a Python function or a shell command on the Runtime. By default, the working directory is set as an env variable, and the class constant Runtime.WORKING_DIR_ENV_VAR_NAME gives you the name of the variable. The working directory remains accessible even if you manually update env_variables.

# Directly set the env vars per Runtimes
rt = Runtime('host-1')
rt.env_variables = {'foo': 'bar'}

# Or use the convenience method to set the env vars
# for all Runtimes in a RuntimeGroup
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2'])
runtime_group.set_env_variables({'foo': 'bar'})
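
A function that is meant to run on a Runtime can read such variables through the standard `os.environ` mapping. The following sketch simulates this locally; `read_setting` is a hypothetical helper name, and the variable is set by hand here instead of by a Runtime.

```python
import os


def read_setting(name: str, default: str = '') -> str:
    """Look up an environment variable, as a remotely executed function could."""
    return os.environ.get(name, default)


# Simulate the variable having been set for the executing process.
os.environ['foo'] = 'bar'

print(read_setting('foo'))
print(read_setting('missing', default='fallback'))
```

Using `os.environ.get` with a default avoids a KeyError if the function is ever executed in an environment where the variable was not set.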

Use the RuntimeManager to create a RuntimeGroup based on the manager's ssh config

The RuntimeManager can automatically detect all available Runtimes based on the manager's local ssh config and create the required RuntimeGroup for you.

from lazycluster import RuntimeManager, RuntimeGroup

runtime_group = RuntimeManager().create_group()

Start a Dask cluster for scalable analytics

This is the simplest way to use Dask in a cluster based on a RuntimeGroup created by the RuntimeManager. The RuntimeManager can automatically detect all available Runtimes based on the manager's ssh config and create the required RuntimeGroup for you. This RuntimeGroup is then handed over to DaskCluster during initialization.

The DASK scheduler instance gets started on the manager. Additionally, multiple DASK worker processes get started in the RuntimeGroup, i.e. in the Runtimes. The default number of workers is equal to the number of Runtimes in the RuntimeGroup.

Prerequisite: Please make sure that you have Dask installed on the manager. This can be done using pip install -q "dask[complete]".

from lazycluster import RuntimeManager
from lazycluster.cluster.dask_cluster import DaskCluster

# 1st: Create a RuntimeGroup, e.g. by letting the RuntimeManager detect
#      available hosts (i.e. Runtimes) and create the group for you.
runtime_group = RuntimeManager().create_group()

# 2nd: Create the DaskCluster instance with the RuntimeGroup.
cluster = DaskCluster(runtime_group)

# 3rd: Let the DaskCluster instantiate all entities on Runtimes
#      of the RuntimeGroup using default values. For custom
#      configuration check the DaskCluster API documentation.
cluster.start()

# => Now, all cluster entities should be started and you can simply use
#    it as documented in the Dask documentation.

Test the cluster setup

# Define test functions to be executed in parallel via DASK
def square(x):
    return x ** 2

def neg(x):
    return -x

# Get a DASK client instance
client = cluster.get_client()

# Execute the computation
A = client.map(square, range(10))
B = client.map(neg, A)
total = client.submit(sum, B)
res = total.result()

print('Result: ' + str(res))
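
As a sanity check, the same computation can be reproduced locally without a cluster. This plain-Python sketch mirrors the map and submit chain above with the built-in `map`, so the value it yields is the one the Dask run should report as well.

```python
def square(x):
    return x ** 2


def neg(x):
    return -x


# Same pipeline as above, evaluated locally with built-ins.
squares = map(square, range(10))
negated = map(neg, squares)
expected = sum(negated)

print('Expected result: ' + str(expected))  # -285
```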

Use different strategies for launching the master and the worker instances.

Use different strategies for launching the master and the worker instances by providing custom implementations of lazycluster.cluster.MasterLauncher and lazycluster.cluster.WorkerLauncher. The default implementations are lazycluster.cluster.dask_cluster.LocalMasterLauncher and lazycluster.cluster.dask_cluster.RoundRobinLauncher.

cluster = DaskCluster(RuntimeManager().create_group(),
                      MyMasterLauncherImpl(),
                      MyWorkerLauncherImpl())
cluster.start()
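
The default worker launcher is named RoundRobinLauncher. The general idea of round-robin placement can be sketched in plain Python as follows. `assign_round_robin` is a hypothetical helper written for illustration; it does not reflect how lazycluster implements its launcher.

```python
from itertools import cycle
from typing import Dict, List


def assign_round_robin(hosts: List[str], worker_count: int) -> Dict[str, List[int]]:
    """Distribute worker indices over hosts in round-robin order."""
    if not hosts:
        raise ValueError('at least one host is required')
    assignment: Dict[str, List[int]] = {host: [] for host in hosts}
    # cycle() repeats the host list endlessly; zip() stops after worker_count items.
    for worker_id, host in zip(range(worker_count), cycle(hosts)):
        assignment[host].append(worker_id)
    return assignment


plan = assign_round_robin(['host-1', 'host-2', 'host-3'], worker_count=5)
print(plan)
```

With five workers and three hosts, the first two hosts receive two workers each and the third receives one, so the load differs by at most one worker per host.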

Distributed hyperparameter tuning with Hyperopt

This is the simplest way to use Hyperopt in a cluster based on a RuntimeGroup created by the RuntimeManager. The RuntimeManager can automatically detect all available Runtimes based on the manager's ssh config and create the required RuntimeGroup for you. This RuntimeGroup is then handed over to HyperoptCluster during initialization.

A MongoDB instance gets started on the manager. Additionally, multiple hyperopt worker processes get started in the RuntimeGroup, i.e. on the contained Runtimes. The default number of workers is equal to the number of Runtimes in the RuntimeGroup.

Prerequisites:

  • MongoDB server must be installed on the manager.
    • Note: When using the ml-workspace as the master then you can use the provided install script for MongoDB which can be found under /resources/tools.
  • Hyperopt must be installed on all Runtimes where hyperopt workers will be started
    • Note: When using the ml-workspace as hosts for the Runtimes then hyperopt is already pre-installed.

Launch a cluster

For detailed documentation of customization options and default values, check out the API docs.

from lazycluster import RuntimeManager
from lazycluster.cluster.hyperopt_cluster import HyperoptCluster

# 1st: Create a RuntimeGroup, e.g. by letting the RuntimeManager detect
#      available hosts (i.e. Runtimes) and create the group for you.
runtime_group = RuntimeManager().create_group()

# 2nd: Create the HyperoptCluster instance with the RuntimeGroup.
cluster = HyperoptCluster(runtime_group)

# 3rd: Let the HyperoptCluster instantiate all entities on Runtimes
#      of the RuntimeGroup using default values. For custom
#      configuration check the HyperoptCluster API documentation.
cluster.start()

# => Now, all cluster entities should be started and you can simply use
#    it as documented in the hyperopt documentation. We recommend calling
#    cluster.cleanup() once you are done.

Test the cluster setup using the simple example to minimize the sin function.

Note: The call to fmin is also done on the manager. The objective_function gets sent to the hyperopt workers by fmin via MongoDB. So there is no need to trigger the execution of fmin or the objective_function on the individual Runtimes. See hyperopt docs for detailed explanation.

import math
from hyperopt import fmin, tpe, hp
from hyperopt.mongoexp import MongoTrials

# You can retrieve the actual url required by MongoTrials from the cluster instance
trials = MongoTrials(cluster.mongo_trial_url, exp_key='exp1')
objective_function = math.sin
best = fmin(objective_function, hp.uniform('x', -2, 2), trials=trials, algo=tpe.suggest, max_evals=10)
# Ensures that MongoDB gets stopped and other resources are released
cluster.cleanup()

Now, we will conceptually demonstrate how to use lazycluster with hyperopt to optimize the hyperparameters of a fasttext model. Note that this is not meant to be a fasttext demo, so the actual usage of fasttext is not optimized; read the fasttext docs for that purpose. The example just highlights how to get fasttext up and running in a distributed setting using lazycluster.

from lazycluster import RuntimeManager
from lazycluster.cluster.hyperopt_cluster import HyperoptCluster
import os

# 1st: Create a RuntimeGroup, e.g. by letting the RuntimeManager detect
#      available hosts (i.e. Runtimes) and create the group with a persistent
#      working directory for you.
runtime_group = RuntimeManager().create_group(working_dir='~/hyperopt')

# 2nd: Send the training and test datasets to all Runtimes
path_to_datasets = '/path_on_manager'
train_file_name = 'train.csv'
train_path = os.path.join(path_to_datasets, train_file_name)
test_file_name = 'test.csv'
test_path = os.path.join(path_to_datasets, test_file_name)

# By default, the files will be sent asynchronously to the Runtime's working directory
runtime_group.send_file(train_path)
runtime_group.send_file(test_path)

# 3rd: Create the HyperoptCluster instance with the RuntimeGroup.
cluster = HyperoptCluster(runtime_group)

# 4th: Let the HyperoptCluster instantiate all entities on
# Runtimes of the RuntimeGroup using default values.
# For custom  configuration check the HyperoptCluster API documentation.
cluster.start()

# 5th: Ensure that the processes for sending the files terminated already,
#      since we sent the files async in 2nd step.
runtime_group.join()

# => Now, all cluster entities are started, datasets transferred, and you
#    can simply use the cluster as documented in the hyperopt documentation.

# 6th: Define the objective function to be minimized by Hyperopt in order to find the
#      best hyperparameter combination.
def train(params):

    import fasttext
    import os

    train_path = os.path.join(os.environ['WORKING_DIR'], params['train_set_file_name'])
    test_path = os.path.join(os.environ['WORKING_DIR'], params['test_set_file_name'])

    model = fasttext.train_supervised(
        input = train_path,
        lr = float(params['learning_rate']),
        dim = int(params['vector_dim']),
        ws = int(params['window_size']),
        epoch = int(params['epochs']),
        minCount = int(params['min_count']),
        neg = int(params['negativ_sampling']),
        t = float(params['sampling']),
        wordNgrams = 1, # word ngrams other than 1 crash
        bucket = int(params['bucket']),
        pretrainedVectors = str(params['pretrained_vectors']),
        lrUpdateRate = int(params['lr_update_rate']),
        thread = int(params['threads']),
        verbose = 2
    )

    number_of_classes, precision, recall = model.test(test_path)

    f1 = 2 * ((precision * recall) / (precision + recall))

    # Return value must be negative because hyperopt's fmin tries to minimize the objective
    # function. You can think of it as minimizing an artificial loss function.
    return -1 * f1

from hyperopt import fmin, tpe, hp
from hyperopt.mongoexp import MongoTrials

# 7th: Define the search space for the parameters to be optimized. Check further functions
#      of Hyperopt's hp module that might suit your specific requirement. This should just
#      give you an idea and not show how to best use fasttext.
search_space = {
    'min_count': hp.quniform('min_count', 2, 20, 1),
    'window_size': hp.quniform('window_size', 4, 15, 1),
    'vector_dim': hp.quniform('vector_dim', 100, 300, 1),
    'learning_rate': 0.4,
    'lr_update_rate': 100,
    'negativ_sampling': hp.quniform('negativ_sampling', 5, 20, 1),
    'sampling': hp.uniform('sampling', 0, 10**-3),
    'bucket': 2000000,
    'epochs': hp.quniform('epochs', 3, 30, 1),
    'pretrained_vectors': '',
    'threads': 8,
    'train_set_file_name': train_file_name,
    'test_set_file_name': test_file_name
}

# 8th: Actually, execute the hyperparameter optimization. Use the mongo_trial_url
#      property of your HyperoptCluster instance to get the url in the format
#      required by MongoTrials.
trials = MongoTrials(cluster.mongo_trial_url, exp_key='exp1')
best = fmin(train, search_space, algo=tpe.suggest, max_evals=500, trials=trials)
print(best)
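
The objective above divides by precision + recall, which is zero when both metrics are zero. A small helper can guard that case. `negative_f1` is a hypothetical name used for illustration, and returning 0.0 as the worst score is an assumption of this sketch, not something the original example does.

```python
def negative_f1(precision: float, recall: float) -> float:
    """Return -F1 so that a minimizer effectively maximizes F1."""
    if precision + recall == 0:
        # Worst possible score; avoids a ZeroDivisionError inside the worker.
        return 0.0
    f1 = 2 * (precision * recall) / (precision + recall)
    return -f1


print(negative_f1(0.8, 0.6))
print(negative_f1(0.0, 0.0))
```

Inside an objective function such as `train`, the last lines could then be replaced by a call to this helper with the precision and recall returned by the model test.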

Debugging

In general, you should read the Logging, exception handling and debugging section first so that you are aware of the general options lazycluster offers for debugging.
The first step is to successfully launch a Hyperopt cluster by using the corresponding lazycluster class. If you experience problems up to this point, analyze the exceptions, which should guide you to a solution. If an error is not self-explanatory, please consider providing meaningful feedback here so that it can be improved soon. Common problems before the cluster is started are:

  • MongoDB or hyperopt are not installed, i.e. the prerequisites are not yet fulfilled. => Ensure that the prerequisites are fulfilled. Consider using ml-workspace to get rid of dependency problems.
  • MongoDB is already running (under the same dbpath). This might especially happen if you started a cluster before and the cleanup did not happen correctly. Usually, the cleanup should happen atexit, but depending on your execution environment it sometimes does not work.
    • To prevent this problem, explicitly call the cleanup() method of the HyperoptCluster instance.
    • To solve the problem if MongoDB is still running, type lsof -i | grep mongod into a terminal. Then use the kill pid command with the process ID you got from the previous command.

Once the Hyperopt cluster is running, you can start using it. Note that the following is mainly about finding Hyperopt-related issues, since lazycluster has basically done its job already. Typically, this means you have a bug in the objective function that you try to minimize with Hyperopt.
First, you could use the print_log() method of your HyperoptCluster instance to check the execution log. If you can't find any error there, check the execution log files or redirect the execution log from files to stdout of the manager by setting debug=True in the start methods of the HyperoptCluster class.
Alternatively, you can ssh into one of your Runtimes and manually start a hyperopt-worker process. You can find the respective shell command in the hyperopt docs. Moreover, you can get the necessary url for the --mongo argument by accessing the Python property mongo_url of your HyperoptCluster instance once it is running. The newly started worker will then poll a job from the master (i.e. MongoDB) and start its execution. Now you should see the error in the terminal once it occurs.

We found two common bug types related to the objective function. First, make sure that the hyperparameters you are passing to your model have the correct data types. Sounds trivial, right? :)
Next, you typically use training and test datasets on your Runtimes inside your objective function, so getting the file paths right may be a bit tricky at first. The objective function gets communicated to the hyperopt worker processes by fmin() via MongoDB. Consequently, the objective function gets executed as is on the Runtimes, and the paths must exist on the Runtimes. The Runtime's working directory as documented in the API docs is of interest here. The path of this directory is available on the Runtimes via an environment variable. We therefore recommend that you manually set a working directory on your Runtimes and place the training and test dataset files relative to it. This can also be done on RuntimeGroup level. Now, you can create a path to the files inside your objective_function with os.path.join(os.environ['WORKING_DIR'], 'relative_file_path'). Note: The advantage of manually setting a working directory is that it does not get removed at the end. Consequently, you do not need to move the files each time you start the execution. This hint can save you quite a lot of time, especially when you need to restart the execution multiple times while debugging.
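
Both pitfalls can be handled with two small helpers at the top of an objective function. The sketch below is self-contained and runs locally; `resolve_data_path` and `cast_params` are hypothetical names, and the variable name 'WORKING_DIR' is taken from the fasttext example in this README.

```python
import os
from typing import Any, Dict


def resolve_data_path(file_name: str) -> str:
    """Build a path relative to the working directory announced via env var."""
    # Falls back to the current directory if the variable is not set.
    working_dir = os.environ.get('WORKING_DIR', os.getcwd())
    return os.path.join(working_dir, file_name)


def cast_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """hp.quniform yields floats such as 5.0; cast integer-valued ones to int."""
    integer_keys = {'epochs', 'vector_dim', 'window_size', 'min_count'}
    return {key: int(value) if key in integer_keys else value
            for key, value in params.items()}


# Simulate the variable that would be present on a Runtime.
os.environ['WORKING_DIR'] = '/workspace'

print(resolve_data_path('train.csv'))
print(cast_params({'epochs': 5.0, 'learning_rate': 0.4}))
```

Casting in one place keeps the model call free of repeated `int(...)` and `float(...)` conversions and makes a wrong data type easier to spot.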


Use different strategies for launching the master and the worker instances.

Use different strategies for launching the master and the worker instances by providing custom implementations of lazycluster.cluster.MasterLauncher and lazycluster.cluster.WorkerLauncher. The default implementations are lazycluster.cluster.hyperopt_cluster.LocalMongoLauncher and lazycluster.cluster.hyperopt_cluster.RoundRobinLauncher.

cluster = HyperoptCluster(RuntimeManager().create_group(),
                          MyMasterLauncherImpl(),
                          MyWorkerLauncherImpl())
cluster.start()

Expose services


Expose a service from a Runtime

A DB is running on a remote host on port runtime_port and the DB is only accessible from the remote host. But you also want to access the service from the manager on port local_port. Then you can use this method to expose the service which is running on the remote host to the manager.

from lazycluster import Runtime

# Create a Runtime
runtime = Runtime('host-1')

# Make the port 50000 from the Runtime accessible on localhost
runtime.expose_port_from_runtime(50000)

# Make the local port 40000 accessible on the Runtime
runtime.expose_port_to_runtime(40000)

Expose a service to a Runtime

A DB is running on the manager on port local_port and the DB is only accessible from the manager. But you also want to access the service on the remote Runtime on port runtime_port. Then you can use this method to expose the service which is running on the manager to the remote host.

from lazycluster import Runtime

# Create a Runtime
runtime = Runtime('host-1')

# Make the port 50000 from the Runtime accessible on localhost
runtime.expose_port_from_runtime(50000)

# Make the local port 40000 accessible on the Runtime
runtime.expose_port_to_runtime(40000)

Service exposure

Now, we extend the previous example by using a RuntimeGroup instead of just a single Runtime. This means we want to expose a service which is running on the manager to a group of Runtimes.

from lazycluster import RuntimeGroup

# Create a RuntimeGroup
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2', 'host-3'])

# Make the local port 50000 accessible on all Runtimes in the RuntimeGroup.
runtime_group.expose_port_to_runtimes(50000)

# Note: The port can also be exposed to a subset of the Runtimes by using the
# method parameter exclude_hosts.
runtime_group.expose_port_to_runtimes(50000, exclude_hosts='host-3')

Expose a service from a Runtime to the other Runtimes in the RuntimeGroup

Assume you have a service running on Runtime host-1. Now, you can expose the service to the remaining Runtimes in the RuntimeGroup.

from lazycluster import RuntimeGroup

# Create a RuntimeGroup
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2', 'host-3'])

# Make the port 40000 which is running on host-1 accessible on all other Runtimes in the RuntimeGroup
runtime_group.expose_port_from_runtime_to_group('host-1', 40000)

File Transfer

A RuntimeTask is capable of sending a file from the manager to a Runtime or vice versa. Moreover, the Runtime class as well as the RuntimeGroup provide convenient methods for this purpose that internally create the RuntimeTasks for you.

In the following example, the file.csv will be transferred to the Runtime's working directory. Another path on the Runtime can be specified by supplying a remote_path as argument. See Runtime docs for further details on the working directory.

from lazycluster import RuntimeTask, Runtime

task = RuntimeTask('file-transfer')
task.send_file('local_path/file.csv')

runtime = Runtime('host-1')
runtime.execute_task(task, execute_async=False)

The explicit creation of a RuntimeTask is only necessary if you intend to add further steps to the RuntimeTask instead of just transferring a file. For example, you want to send a file, execute a Python function, and transfer the file back. If not, you can use the file transfer methods of the Runtime or RuntimeGroup. In the case of sending a file to a RuntimeGroup you should send the files asynchronously. Otherwise, each file will be transferred sequentially. Do not forget to call join(), if you need the files to be transferred before proceeding.

from lazycluster import RuntimeTask, Runtime, RuntimeGroup, RuntimeManager

# Send a file to a single Runtime
runtime = Runtime('host-1')
runtime.send_file('local_path/file.csv', execute_async=False)

# Send a file to a whole RuntimeGroup
group = RuntimeManager().create_group()
group.send_file('local_path/file.csv', execute_async=True)
group.join()
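
The pattern of starting all transfers asynchronously and then joining can be illustrated locally. The sketch below copies one file into several target directories using a thread pool; the directories merely stand in for Runtimes, and `send_to_all` is a hypothetical helper that is not part of lazycluster.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor, wait
from pathlib import Path
from typing import List


def send_to_all(source: Path, targets: List[Path]) -> List[Path]:
    """Copy source into every target directory concurrently, then wait for all."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = [pool.submit(shutil.copy, source, target) for target in targets]
        wait(futures)  # plays the role of join(): block until all copies finish
    return [Path(future.result()) for future in futures]


# Create a throwaway source file and two target directories.
base = Path(tempfile.mkdtemp())
source = base / 'file.csv'
source.write_text('a;b\n1;2\n')
targets = [base / name for name in ('host-1', 'host-2')]
for target in targets:
    target.mkdir()

copied = send_to_all(source, targets)
print([str(path.relative_to(base)) for path in copied])
```

Only after the wait returns is it safe to assume that every target holds the file, which is why the join step matters before any code that reads the transferred data.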

The usage of get_file is similar and documented here.

Simple preprocessing example

Read a local CSV file (on the manager) and lowercase chunks in parallel using RuntimeTasks and a RuntimeGroup.

from typing import List
import pandas as pd
from lazycluster import RuntimeTask, RuntimeManager

# Define the function to be executed remotely
def preprocess(docs: List[str]):
    return [str(doc).lower() for doc in docs]

file_path = '/path/to/file.csv'

runtime_group = RuntimeManager().create_group()

tasks = []

# Distribute chunks of the csv and start the preprocessing in parallel in the RuntimeGroup
for df_chunk in pd.read_csv(file_path, sep=';', chunksize=500):

    task = RuntimeTask().run_function(preprocess, docs=df_chunk['text'].tolist())

    tasks.append(runtime_group.execute_task(task))

# Wait until all executions are done
runtime_group.join()

# Get the return data and print it
index = 0
for chunk in runtime_group.function_returns:
    print('Chunk: ' + str(index))
    index += 1
    print(chunk)
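
To try the chunk-and-process pattern without any remote hosts, the following sketch runs the same `preprocess` function over chunks of an in-memory list using a local thread pool. The `chunked` helper and the sample documents are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List


def preprocess(docs: List[str]) -> List[str]:
    return [str(doc).lower() for doc in docs]


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


docs = ['Alpha', 'BETA', 'Gamma', 'DELTA', 'Epsilon']

# pool.map preserves the order of the chunks in its results.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(preprocess, chunked(docs, 2)))

for index, chunk in enumerate(results):
    print('Chunk: ' + str(index))
    print(chunk)
```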

Logging, exception handling and debugging

lazycluster aims to abstract away the complexity implied by using multiple distributed Runtimes and provides an intuitive high-level API for this purpose. The lazycluster manager orchestrates the individual components of the distributed setup. A common use case could be to use lazycluster in order to launch a distributed hyperopt cluster. In this case, the lazycluster manager starts a MongoDB instance, starts the hyperopt worker processes on multiple Runtimes, and ensures the required communication via ssh between these instances. Each individual component could potentially fail, including the 3rd party ones such as hyperopt workers. Since lazycluster is a generic library and debugging a distributed system is an intrinsically non-trivial task, we tried to emphasize logging and good exception handling practices so that you can stay lazy.

Standard Python log

We use the standard Python logging module in order to log everything of interest that happens on the manager.

By default, we recommend setting the basicConfig log level to logging.INFO. This way, you will get relevant status updates, for example about the progress of launching a cluster. Of course, you can adjust the log level to logging.DEBUG or anything you like.

We like to use the following basic configuration when using lazycluster in a Jupyter notebook:

import logging

logging.basicConfig(format='[%(levelname)s] %(message)s', level=logging.INFO)

Note: Some 3rd party libraries produce a lot of INFO messages, which are usually not of interest to the user. This is particularly true for Paramiko. We base most ssh handling on Fabric, which is based on Paramiko. We decided to set the log level for these libraries to logging.ERROR by default. This happens in the __init__.py module of the lazycluster package and is set once when importing the first module or class from lazycluster. If you want to change the log level of 3rd party libs, you can set it the following way:

import logging
from lazycluster import Environment

# Affects logs of all libraries that were initially set to logging.ERROR
Environment.set_third_party_log_level(logging.INFO)

# Of course, you can set the log level manually for each library / module
logging.getLogger('paramiko').setLevel(logging.DEBUG)
logging.getLogger('lazycluster').setLevel(logging.INFO)

See set_third_party_log_level() of the Environment class for a full list of affected libraries.
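
The effect of lowering a third-party logger to ERROR can be checked with the standard logging module alone. This sketch uses the logger name 'paramiko' from the snippet above and a made-up logger name 'my_app' for comparison; it does not import lazycluster.

```python
import logging

logging.basicConfig(format='[%(levelname)s] %(message)s')
# Set the root level explicitly in case logging was configured earlier.
logging.getLogger().setLevel(logging.INFO)

# Quieten a chatty third-party logger while the root logger stays at INFO.
noisy = logging.getLogger('paramiko')
noisy.setLevel(logging.ERROR)

app = logging.getLogger('my_app')  # inherits its level from the root logger

print(noisy.isEnabledFor(logging.INFO))
print(app.isEnabledFor(logging.INFO))
```

A logger with an explicit level ignores the root level, while loggers without one inherit it, which is why per-library overrides and a global basicConfig can coexist.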

Execution log

The execution log aims to provide a central access point to output produced on the Runtimes.

This type of log mainly contains the stdout/stderr produced when executing a RuntimeTask on a Runtime. If you are new to lazycluster or have never used the lower-level API directly, you might think the execution log is not relevant for you. But it is :) The concrete cluster implementations (e.g. DaskCluster or HyperoptCluster) are also built on top of the lower-level API. You can think of it as the kind of log you can use to understand what actually happened on your Runtimes. You can access the execution log in 3 different ways.

The 1st option is to access the execution log files. The stdout/stderr generated on the Runtimes is streamed to log files. By default, the respective directory is ./lazycluster/execution_log on the manager. The log directory contains a subfolder for each Runtime (i.e. host) that executed at least one RuntimeTask. Inside a Runtime folder you will find one log file per executed RuntimeTask. Each log file name is generated by concatenating the name of the RuntimeTask and the current timestamp. You can configure the path where the log directory gets created by adjusting the lazycluster main directory. See Environment for this purpose. Moreover, the respective file path can be accessed programmatically via RuntimeTask.execution_log_file_path. This property gets updated each time the RuntimeTask is executed.
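The directory layout described above can be inspected with a few lines of plain Python. The following is only a minimal sketch using the standard library: the helper name list_execution_logs is hypothetical and not part of lazycluster, and it assumes the default log directory and that every file inside a Runtime folder is a log file.

```python
from pathlib import Path
from typing import Dict, List


def list_execution_logs(
    log_dir: str = "./lazycluster/execution_log",
) -> Dict[str, List[Path]]:
    """Map each Runtime folder name to its log files, oldest first.

    Hypothetical helper for illustration; not part of lazycluster.
    """
    root = Path(log_dir)
    logs: Dict[str, List[Path]] = {}
    if not root.is_dir():
        # The directory does not exist, e.g. because no task was executed yet
        return logs
    for runtime_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = [f for f in runtime_dir.iterdir() if f.is_file()]
        # Sort by modification time so the most recent execution comes last
        logs[runtime_dir.name] = sorted(files, key=lambda f: f.stat().st_mtime)
    return logs


if __name__ == "__main__":
    for host, files in list_execution_logs().items():
        print(host)
        for log_file in files:
            print("  ", log_file.name)
```

If you only need the log file of one specific task, RuntimeTask.execution_log_file_path is the more direct route.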

The 2nd option is to redirect the execution log (i.e. stdout/stderr from the Runtimes) to the stdout of the manager. This lets you quickly spot errors. The drawback is that you cannot directly distinguish which Runtime generated which output, since the output of potentially multiple Runtimes is streamed to the manager's stdout as it occurs. To enable this feature, pass the debug flag to the respective methods (i.e. RuntimeTask.execute(), Runtime.execute_task(), RuntimeGroup.execute_task()). All cluster-related start() methods (e.g. HyperoptCluster.start(), DaskCluster.start() etc.) provide the debug option too. Example:

from lazycluster import RuntimeGroup, RuntimeTask

task = RuntimeTask('debug-test').run_command('python --version')
group = RuntimeGroup(hosts=['gaia-1', 'gaia-2'])
tasks = group.execute_task(task, debug=True)

The 3rd option is to access the execution_log property of a RuntimeTask. Additionally, the Runtime as well as the RuntimeGroup provide a print_log() function which prints the execution_log of the RuntimeTasks that were executed on the Runtimes. The execution_log property is a list and can be accessed via index. Each log entry corresponds to the output of a single (fully executed) step of a RuntimeTask. This means the stdout/stderr is not streamed to the manager and can only be accessed after the step has been executed. This kind of log might be useful if you need to access the output of a concrete RuntimeTask step programmatically. See the concept definition and the class documentation of the RuntimeTask for further details.

Note: RuntimeTask.run_function() is actually not a single task step. A call to this method produces multiple steps, since the Python function that needs to be executed is sent as a pickle file to the remote host. There it gets unpickled and executed, and the return data is sent back as a pickle file. So if you intend to access the execution log, be aware that the log contains multiple entries for the run_function() call. The number of steps per call is fixed, though. Moreover, you should consider using the return value of a remotely executed Python function instead of using the execution log for this purpose.

from lazycluster import Runtime, RuntimeTask

# Create the task
task = RuntimeTask('exec-log-demo')

# Add 2 individual task steps
task.run_command('echo Hello')
task.run_command('echo lazycluster!')

# Create a Runtime
runtime = Runtime('host-1')

# Execute the task remotely on the Runtime
runtime.execute_task(task)

# Access the log by index
print(task.execution_log[0]) # => 'Hello'
print(task.execution_log[1]) # => 'lazycluster!'

# Let the Runtime print the log
# an equivalent method exists for RuntimeGroup
runtime.print_log()

Exception handling

Our exception handling concept is to use standard Python exception classes whenever appropriate. Otherwise, we create a library-specific error (i.e. exception) class.

Each created error class inherits from our base class LazyclusterError, which in turn inherits from Python's Exception class. We aim to be as informative as possible with our exceptions to guide you to a solution to your problem. So feel encouraged to provide feedback on misleading or unclear error messages, since we strongly believe that guided errors are essential so that you can stay as lazy as possible.
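In practice, this hierarchy means that a single except clause on the base class catches every library-specific error. The sketch below illustrates the pattern with stand-in classes so that it runs without lazycluster installed: the LazyclusterError defined here only mirrors the base class described above, while HostUnreachableError, connect() and try_connect() are hypothetical names invented for this example.

```python
class LazyclusterError(Exception):
    """Stand-in for the library's base error class (inherits from Exception)."""


class HostUnreachableError(LazyclusterError):
    """Hypothetical library-specific error, for illustration only."""

    def __init__(self, host: str):
        self.host = host
        # An informative message that points towards a possible solution
        super().__init__(
            f"Host '{host}' is not reachable. Check your ssh configuration."
        )


def connect(host: str) -> None:
    """Hypothetical operation that fails with a library-specific error."""
    raise HostUnreachableError(host)


def try_connect(host: str) -> str:
    try:
        connect(host)
    except LazyclusterError as err:
        # Catches all library-specific errors via the common base class
        return f"lazycluster problem: {err}"
    return "ok"


print(try_connect("host-1"))
```

Standard Python exceptions are not derived from the base class, so they still need their own except clause if you want to handle them.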


Contribution


Licensed Apache 2.0. Created and maintained with ❤️ by developers from SAP in Berlin.

Comments
  • Bump urllib3 from 1.26.2 to 1.26.5

    Bump urllib3 from 1.26.2 to 1.26.5

    Bumps urllib3 from 1.26.2 to 1.26.5.

    Release notes

    Sourced from urllib3's releases.

    1.26.5

    :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

    • Fixed deprecation warnings emitted in Python 3.10.
    • Updated vendored six library to 1.16.0.
    • Improved performance of URL parser when splitting the authority component.

    If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

    1.26.4

    :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

    • Changed behavior of the default SSLContext when connecting to HTTPS proxy during HTTPS requests. The default SSLContext now sets check_hostname=True.

    If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

    1.26.3

    :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

    • Fixed bytes and string comparison issue with headers (Pull #2141)

    • Changed ProxySchemeUnknown error message to be more actionable if the user supplies a proxy URL without a scheme (Pull #2107)

    If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

    Changelog

    Sourced from urllib3's changelog.

    1.26.5 (2021-05-26)

    • Fixed deprecation warnings emitted in Python 3.10.
    • Updated vendored six library to 1.16.0.
    • Improved performance of URL parser when splitting the authority component.

    1.26.4 (2021-03-15)

    • Changed behavior of the default SSLContext when connecting to HTTPS proxy during HTTPS requests. The default SSLContext now sets check_hostname=True.

    1.26.3 (2021-01-26)

    • Fixed bytes and string comparison issue with headers (Pull #2141)

    • Changed ProxySchemeUnknown error message to be more actionable if the user supplies a proxy URL without a scheme. (Pull #2107)

    Commits
    • d161647 Release 1.26.5
    • 2d4a3fe Improve performance of sub-authority splitting in URL
    • 2698537 Update vendored six to 1.16.0
    • 07bed79 Fix deprecation warnings for Python 3.10 ssl module
    • d725a9b Add Python 3.10 to GitHub Actions
    • 339ad34 Use pytest==6.2.4 on Python 3.10+
    • f271c9c Apply latest Black formatting
    • 1884878 [1.26] Properly proxy EOF on the SSLTransport test suite
    • a891304 Release 1.26.4
    • 8d65ea1 Merge pull request from GHSA-5phf-pp7p-vc2r
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies stale 
    opened by dependabot[bot] 2
  • Bump pygments from 2.7.3 to 2.7.4

    Bump pygments from 2.7.3 to 2.7.4

    Bumps pygments from 2.7.3 to 2.7.4.

    Release notes

    Sourced from pygments's releases.

    2.7.4

    • Updated lexers:

      • Apache configurations: Improve handling of malformed tags (#1656)

      • CSS: Add support for variables (#1633, #1666)

      • Crystal (#1650, #1670)

      • Coq (#1648)

      • Fortran: Add missing keywords (#1635, #1665)

      • Ini (#1624)

      • JavaScript and variants (#1647 -- missing regex flags, #1651)

      • Markdown (#1623, #1617)

      • Shell

        • Lex trailing whitespace as part of the prompt (#1645)
        • Add missing in keyword (#1652)
      • SQL - Fix keywords (#1668)

      • Typescript: Fix incorrect punctuation handling (#1510, #1511)

    • Fix infinite loop in SML lexer (#1625)

    • Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)

    • Limit recursion with nesting Ruby heredocs (#1638)

    • Fix a few inefficient regexes for guessing lexers

    • Fix the raw token lexer handling of Unicode (#1616)

    • Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!

    • Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)

    • Fix incorrect MATLAB example (#1582)

    Thanks to Google's OSS-Fuzz project for finding many of these bugs.

    Changelog

    Sourced from pygments's changelog.

    Version 2.7.4

    (released January 12, 2021)

    • Updated lexers:

      • Apache configurations: Improve handling of malformed tags (#1656)

      • CSS: Add support for variables (#1633, #1666)

      • Crystal (#1650, #1670)

      • Coq (#1648)

      • Fortran: Add missing keywords (#1635, #1665)

      • Ini (#1624)

      • JavaScript and variants (#1647 -- missing regex flags, #1651)

      • Markdown (#1623, #1617)

      • Shell

        • Lex trailing whitespace as part of the prompt (#1645)
        • Add missing in keyword (#1652)
      • SQL - Fix keywords (#1668)

      • Typescript: Fix incorrect punctuation handling (#1510, #1511)

    • Fix infinite loop in SML lexer (#1625)

    • Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)

    • Limit recursion with nesting Ruby heredocs (#1638)

    • Fix a few inefficient regexes for guessing lexers

    • Fix the raw token lexer handling of Unicode (#1616)

    • Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!

    • Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)

    • Fix incorrect MATLAB example (#1582)

    Thanks to Google's OSS-Fuzz project for finding many of these bugs.

    Commits
    • 4d555d0 Bump version to 2.7.4.
    • fc3b05d Update CHANGES.
    • ad21935 Revert "Added dracula theme style (#1636)"
    • e411506 Prepare for 2.7.4 release.
    • 275e34d doc: remove Perl 6 ref
    • 2e7e8c4 Fix several exponential/cubic complexity regexes found by Ben Caller/Doyensec
    • eb39c43 xquery: fix pop from empty stack
    • 2738778 fix coding style in test_analyzer_lexer
    • 02e0f09 Added 'ERROR STOP' to fortran.py keywords. (#1665)
    • c83fe48 support added for css variables (#1633)
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies stale 
    opened by dependabot[bot] 2
  • Bump pyyaml from 5.3.1 to 5.4

    Bump pyyaml from 5.3.1 to 5.4

    Bumps pyyaml from 5.3.1 to 5.4.

    Changelog

    Sourced from pyyaml's changelog.

    5.4 (2021-01-19)

    Commits
    • 58d0cb7 5.4 release
    • a60f7a1 Fix compatibility with Jython
    • ee98abd Run CI on PR base branch changes
    • ddf2033 constructor.timezone: _copy & deepcopy
    • fc914d5 Avoid repeatedly appending to yaml_implicit_resolvers
    • a001f27 Fix for CVE-2020-14343
    • fe15062 Add 3.9 to appveyor file for completeness sake
    • 1e1c7fb Add a newline character to end of pyproject.toml
    • 0b6b7d6 Start sentences and phrases for capital letters
    • c976915 Shell code improvements
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies stale 
    opened by dependabot[bot] 2
  • Bump jinja2 from 2.11.2 to 2.11.3

    Bump jinja2 from 2.11.2 to 2.11.3

    Bumps jinja2 from 2.11.2 to 2.11.3.

    Release notes

    Sourced from jinja2's releases.

    2.11.3

    This contains a fix for a speed issue with the urlize filter. urlize is likely to be called on untrusted user input. For certain inputs some of the regular expressions used to parse the text could take a very long time due to backtracking. As part of the fix, the email matching became slightly stricter. The various speedups apply to urlize in general, not just the specific input cases.

    Changelog

    Sourced from jinja2's changelog.

    Version 2.11.3

    Released 2021-01-31

    • Improve the speed of the urlize filter by reducing regex backtracking. Email matching requires a word character at the start of the domain part, and only word characters in the TLD. :pr:1343
    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies stale 
    opened by dependabot[bot] 2
  • Bump cryptography from 3.3.1 to 3.3.2

    Bump cryptography from 3.3.1 to 3.3.2

    Bumps cryptography from 3.3.1 to 3.3.2.

    Changelog

    Sourced from cryptography's changelog.

    3.3.2 - 2021-02-07

    
    * **SECURITY ISSUE:** Fixed a bug where certain sequences of ``update()`` calls
      when symmetrically encrypting very large payloads (>2GB) could result in an
      integer overflow, leading to buffer overflows. *CVE-2020-36242*
    

    .. _v3-3-1:

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies stale 
    opened by dependabot[bot] 2
  • Bump bleach from 3.2.1 to 3.3.0

    Bump bleach from 3.2.1 to 3.3.0

    Bumps bleach from 3.2.1 to 3.3.0.

    Changelog

    Sourced from bleach's changelog.

    Version 3.3.0 (February 1st, 2021)

    Backwards incompatible changes

    • clean escapes HTML comments even when strip_comments=False

    Security fixes

    • Fix bug 1621692 / GHSA-m6xf-fq7q-8743. See the advisory for details.

    Features

    None

    Bug fixes

    None

    Version 3.2.3 (January 26th, 2021)

    Security fixes

    None

    Features

    None

    Bug fixes

    • fix clean and linkify raising ValueErrors for certain inputs. Thank you @Google-Autofuzz.

    Version 3.2.2 (January 20th, 2021)

    Security fixes

    None

    Features

    • Migrate CI to Github Actions. Thank you @hugovk.

    Bug fixes

    • fix linkify raising an IndexError on certain inputs. Thank you @Google-Autofuzz.
    Commits
    • 79b7a3c Merge pull request from GHSA-vv2x-vrpj-qqpq
    • 842fcb4 Update for v3.3.0 release
    • 1334134 sanitizer: escape HTML comments
    • c045a8b Merge pull request #581 from mozilla/nit-fixes
    • 491abb0 fix typo s/vnedoring/vendoring/
    • 10b1c5d vendor: add html5lib-1.1.dist-info/REQUESTED
    • cd838c3 Merge pull request #579 from mozilla/validate-convert-entity-code-points
    • 612b808 Update for v3.2.3 release
    • 6879f6a html5lib_shim: validate unicode points for convert_entity
    • 90cb80b Update for v3.2.2 release
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies stale 
    opened by dependabot[bot] 2
  • Bump numpy from 1.19.4 to 1.21.0

    Bump numpy from 1.19.4 to 1.21.0

    Bumps numpy from 1.19.4 to 1.21.0.

    Release notes

    Sourced from numpy's releases.

    v1.21.0

    NumPy 1.21.0 Release Notes

    The NumPy 1.21.0 release highlights are

    • continued SIMD work covering more functions and platforms,
    • initial work on the new dtype infrastructure and casting,
    • universal2 wheels for Python 3.8 and Python 3.9 on Mac,
    • improved documentation,
    • improved annotations,
    • new PCG64DXSM bitgenerator for random numbers.

    In addition there are the usual large number of bug fixes and other improvements.

    The Python versions supported for this release are 3.7-3.9. Official support for Python 3.10 will be added when it is released.

    :warning: Warning: there are unresolved problems compiling NumPy 1.21.0 with gcc-11.1 .

    • Optimization level -O3 results in many wrong warnings when running the tests.
    • On some hardware NumPy will hang in an infinite loop.

    New functions

    Add PCG64DXSM BitGenerator

    Uses of the PCG64 BitGenerator in a massively-parallel context have been shown to have statistical weaknesses that were not apparent at the first release in numpy 1.17. Most users will never observe this weakness and are safe to continue to use PCG64. We have introduced a new PCG64DXSM BitGenerator that will eventually become the new default BitGenerator implementation used by default_rng in future releases. PCG64DXSM solves the statistical weakness while preserving the performance and the features of PCG64.

    See upgrading-pcg64 for more details.

    (gh-18906)

    Expired deprecations

    • The shape argument numpy.unravel_index cannot be passed as dims keyword argument anymore. (Was deprecated in NumPy 1.16.)

    ... (truncated)

    Commits
    • b235f9e Merge pull request #19283 from charris/prepare-1.21.0-release
    • 34aebc2 MAINT: Update 1.21.0-notes.rst
    • 493b64b MAINT: Update 1.21.0-changelog.rst
    • 07d7e72 MAINT: Remove accidentally created directory.
    • 032fca5 Merge pull request #19280 from charris/backport-19277
    • 7d25b81 BUG: Fix refcount leak in ResultType
    • fa5754e BUG: Add missing DECREF in new path
    • 61127bb Merge pull request #19268 from charris/backport-19264
    • 143d45f Merge pull request #19269 from charris/backport-19228
    • d80e473 BUG: Removed typing for == and != in dtypes
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Bump urllib3 from 1.26.2 to 1.26.4

    Bump urllib3 from 1.26.2 to 1.26.4

    Bumps urllib3 from 1.26.2 to 1.26.4.

    Release notes

    Sourced from urllib3's releases.

    1.26.4

    :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

    • Changed behavior of the default SSLContext when connecting to HTTPS proxy during HTTPS requests. The default SSLContext now sets check_hostname=True.

    If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

    1.26.3

    :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

    • Fixed bytes and string comparison issue with headers (Pull #2141)

    • Changed ProxySchemeUnknown error message to be more actionable if the user supplies a proxy URL without a scheme (Pull #2107)

    If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

    Changelog

    Sourced from urllib3's changelog.

    1.26.4 (2021-03-15)

    • Changed behavior of the default SSLContext when connecting to HTTPS proxy during HTTPS requests. The default SSLContext now sets check_hostname=True.
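To make the `check_hostname=True` setting concrete, here is a minimal sketch that uses only Python's standard `ssl` module. It does not call urllib3, and it does not show how urllib3 builds its context for HTTPS proxies; the helper name `describe_context` is hypothetical and exists only for this illustration.

```python
import ssl


def describe_context(ctx: ssl.SSLContext) -> dict:
    """Return the two settings that govern certificate and hostname checks."""
    return {
        "check_hostname": ctx.check_hostname,
        "verify_mode": ctx.verify_mode,
    }


# The standard-library default context verifies the certificate chain and
# also checks that the certificate matches the host name being contacted.
strict = ssl.create_default_context()

# For comparison only: a context with both checks switched off.
# Hostname checking has to be disabled before verify_mode can be CERT_NONE.
relaxed = ssl.create_default_context()
relaxed.check_hostname = False
relaxed.verify_mode = ssl.CERT_NONE

print(describe_context(strict))
print(describe_context(relaxed))
```

With hostname checking enabled, a TLS handshake fails when the presented certificate does not match the requested host, which is the stricter behavior the changelog entry describes for the default context.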

    1.26.3 (2021-01-26)

    • Fixed bytes and string comparison issue with headers (Pull #2141)

    • Changed ProxySchemeUnknown error message to be more actionable if the user supplies a proxy URL without a scheme. (Pull #2107)

    Commits
    • a891304 Release 1.26.4
    • 8d65ea1 Merge pull request from GHSA-5phf-pp7p-vc2r
    • 5e34326 Add proper stacklevel to method_allowlist warning
    • 361f1e2 Release 1.26.3
    • 3179dfd Allow using deprecated OpenSSL with CRYPTOGRAPHY_ALLOW_OPENSSL_102
    • d97e5d4 Use Python 3.5 compatible get-pip
    • cb5e2fc [1.26] Don't compare bytes and str in putheader()
    • b89158f [1.26] Update RECENT_DATE to 2020-07-01
    • a800c74 [1.26] Recommend GitHub Sponsors instead of Open Collective
    • 947284e [1.26] Improve message for ProxySchemeUnknown exception
    • See full diff in compare view

    dependencies 
    opened by dependabot[bot] 1
  • Bump urllib3 from 1.26.2 to 1.26.3

    Bumps urllib3 from 1.26.2 to 1.26.3.

    Release notes

    Sourced from urllib3's releases.

    1.26.3

    :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

    • Fixed bytes and string comparison issue with headers (Pull #2141)

    • Changed ProxySchemeUnknown error message to be more actionable if the user supplies a proxy URL without a scheme (Pull #2107)

    If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

    Changelog

    Sourced from urllib3's changelog.

    1.26.3 (2021-01-26)

    • Fixed bytes and string comparison issue with headers (Pull #2141)

    • Changed ProxySchemeUnknown error message to be more actionable if the user supplies a proxy URL without a scheme. (Pull #2107)

    Commits
    • 361f1e2 Release 1.26.3
    • 3179dfd Allow using deprecated OpenSSL with CRYPTOGRAPHY_ALLOW_OPENSSL_102
    • d97e5d4 Use Python 3.5 compatible get-pip
    • cb5e2fc [1.26] Don't compare bytes and str in putheader()
    • b89158f [1.26] Update RECENT_DATE to 2020-07-01
    • a800c74 [1.26] Recommend GitHub Sponsors instead of Open Collective
    • 947284e [1.26] Improve message for ProxySchemeUnknown exception
    • See full diff in compare view

    dependencies 
    opened by dependabot[bot] 1
  • Add license scan report and status

    Your FOSSA integration was successful! Attached in this PR is a badge and license report to track scan status in your README.

    Below are docs for integrating FOSSA license checks into your CI:

    opened by fossabot 1
  • Maintenance/test automation

    What kind of change does this PR introduce?

    • [ ] Bugfix
    • [ ] New Feature
    • [ ] Feature Improvement
    • [x] Refactoring
    • [ ] Documentation
    • [ ] Other, please describe:
    documentation maintenance 
    opened by JanKalkan 0
  • Bump certifi from 2020.12.5 to 2022.12.7

    Bumps certifi from 2020.12.5 to 2022.12.7.

    dependencies 
    opened by dependabot[bot] 0
  • Add CodeQL workflow for GitHub code scanning

    Hi ml-tooling/lazycluster!

    This is a one-off automatically generated pull request from LGTM.com :robot:. You might have heard that we’ve integrated LGTM’s underlying CodeQL analysis engine natively into GitHub. The result is GitHub code scanning!

    With LGTM fully integrated into code scanning, we are focused on improving CodeQL within the native GitHub code scanning experience. In order to take advantage of current and future improvements to our analysis capabilities, we suggest you enable code scanning on your repository. Please take a look at our blog post for more information.

    This pull request enables code scanning by adding an auto-generated codeql.yml workflow file for GitHub Actions to your repository — take a look! We tested it before opening this pull request, so all should be working :heavy_check_mark:. In fact, you might already have seen some alerts appear on this pull request!

    Where needed and if possible, we’ve adjusted the configuration to the needs of your particular repository. But of course, you should feel free to tweak it further! Check this page for detailed documentation.

    Questions? Check out the FAQ below!

    FAQ

    How often will the code scanning analysis run?

    By default, code scanning will trigger a scan with the CodeQL engine on the following events:

    • On every pull request — to flag up potential security problems for you to investigate before merging a PR.
    • On every push to your default branch and other protected branches — this keeps the analysis results on your repository’s Security tab up to date.
    • Once a week at a fixed time — to make sure you benefit from the latest updated security analysis even when no code was committed or PRs were opened.

    What will this cost?

    Nothing! The CodeQL engine will run inside GitHub Actions, making use of your unlimited free compute minutes for public repositories.

    What types of problems does CodeQL find?

    The CodeQL engine that powers GitHub code scanning is the exact same engine that powers LGTM.com. The exact set of rules has been tweaked slightly, but you should see almost exactly the same types of alerts as you were used to on LGTM.com: we’ve enabled the security-and-quality query suite for you.

    How do I upgrade my CodeQL engine?

    No need! New versions of the CodeQL analysis are constantly deployed on GitHub.com; your repository will automatically benefit from the most recently released version.

    The analysis doesn’t seem to be working

    If you get an error in GitHub Actions that indicates that CodeQL wasn’t able to analyze your code, please follow the instructions here to debug the analysis.

    How do I disable LGTM.com?

    If you have LGTM’s automatic pull request analysis enabled, then you can follow these steps to disable the LGTM pull request analysis. You don’t actually need to remove your repository from LGTM.com; it will automatically be removed in the next few months as part of the deprecation of LGTM.com (more info here).

    Which source code hosting platforms does code scanning support?

    GitHub code scanning is deeply integrated within GitHub itself. If you’d like to scan source code that is hosted elsewhere, we suggest that you create a mirror of that code on GitHub.

    How do I know this PR is legitimate?

    This PR is filed by the official LGTM.com GitHub App, in line with the deprecation timeline that was announced on the official GitHub Blog. The proposed GitHub Action workflow uses the official open source GitHub CodeQL Action. If you have any other questions or concerns, please join the discussion here in the official GitHub community!

    I have another question / how do I get in touch?

    Please join the discussion here to ask further questions and send us suggestions!

    maintenance 
    opened by lgtm-com[bot] 0
  • Bump numpy from 1.19.4 to 1.22.0

    Bumps numpy from 1.19.4 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)
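The `numpy.loads` migration mentioned above can be sketched as follows. This is an illustrative example rather than code from the release notes: the payload is a plain nested list so the snippet runs without NumPy installed, on the assumption that an array goes through the same two `pickle` calls.

```python
import pickle

# Stand-in payload: a plain nested list keeps the sketch dependency-free.
# The assumption here is that an ndarray is handled by the same two calls.
payload = [[1.0, 2.0], [3.0, 4.0]]

serialized = pickle.dumps(payload)

# Before (no longer available as of NumPy 1.22):
#     restored = numpy.loads(serialized)
# After: call the standard-library function directly.
# Only unpickle data from a source you trust.
restored = pickle.loads(serialized)

print(restored == payload)
```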

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Bump paramiko from 2.7.2 to 2.10.1

    Bumps paramiko from 2.7.2 to 2.10.1.

    Commits
    • 286bd9f Cut 2.10.1
    • 4c491e2 Fix CVE re: PKey.write_private_key chmod race
    • aa3cc6f Cut 2.10.0
    • e50e19f Fix up changelog entry with real links
    • 02ad67e Helps to actually leverage your mocked system calls
    • 29d7bf4 Clearly our agent stuff is not fully tested yet...
    • 5fcb8da OpenSSH docs state %C should also work in IdentityFile and Match exec
    • 1bf3dce Changelog enhancement
    • f6342fc Prettify, add %C as acceptable controlpath token, mock gethostname
    • 3f3451f Add to changelog
    • Additional commits viewable in compare view
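For context on the "chmod race" named in the commit list, the general problem is that a file created with default permissions and restricted afterwards is briefly readable by others. The sketch below shows one common way to avoid that window using only the standard library. The commit itself is not shown in this listing, so this is not necessarily how Paramiko fixed it, and `write_private_file` is a hypothetical helper name.

```python
import os
import tempfile


def write_private_file(path: str, data: bytes) -> None:
    """Create `path` with owner-only permissions from the moment it exists."""
    # O_EXCL refuses to reuse an existing file, and the 0o600 mode is applied
    # at creation time, so no separate chmod step is needed afterwards.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)


# Demo in a fresh temporary directory with placeholder content.
target = os.path.join(tempfile.mkdtemp(), "demo_key")
write_private_file(target, b"not a real key\n")
print(oct(os.stat(target).st_mode & 0o777))
```

On POSIX systems the printed mode should show no group or other permission bits; on other platforms the permission model differs, so the mode value is less meaningful there.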

    dependencies 
    opened by dependabot[bot] 0
Releases (v0.2.4)
  • v0.2.4 (Dec 14, 2020)

  • v0.2.3 (Dec 14, 2020)

    PyPi Release

    👷 Maintenance & Refactoring

    • Maintenance/test automation (#3) by @JanKalkan
    • Apply new project structure and default files for v0.1.0 (#2) by @LukasMasuch

    👥 Contributors

    Thanks to @JanKalkan and @LukasMasuch for the contributions.

Owner
Machine Learning Tooling
Open-source machine learning tooling to boost your productivity.
Distributed Computing for AI Made Simple

Fiber Distributed Computing for AI Made Simp

Uber Open Source 997 Dec 30, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 7, 2023
Uber Open Source 1.6k Dec 31, 2022
Management of exclusive GPU access for distributed machine learning workloads

TensorHive is an open source tool for managing computing resources used by multiple users across distributed hosts. It focuses on granting

Paweł Rościszewski 131 Dec 12, 2022
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear

null 23.3k Dec 31, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.6k Jan 3, 2023
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Jan 9, 2023
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Augusto Almeida 84 Nov 25, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

Vowpal Wabbit 8.1k Dec 30, 2022
Implementing continuous integration & delivery (CI/CD) in machine learning projects

CML with cloud compute This repository contains a sample project using CML with Terraform (via the cml-runner function) to launch an AWS EC2 instance

Iterative 19 Oct 3, 2022
BigDL: Distributed Deep Learning Framework for Apache Spark

BigDL: Distributed Deep Learning on Apache Spark What is BigDL? BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can w

null 4.1k Jan 9, 2023
Distributed Deep learning with Keras & Spark

Elephas: Distributed Deep Learning with Keras & Spark Elephas is an extension of Keras, which allows you to run distributed deep learning models at sc

Max Pumperla 1.6k Dec 29, 2022
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. 10x Larger Models 10x Faster Trainin

Microsoft 8.4k Dec 30, 2022
a distributed deep learning platform

Apache SINGA Distributed deep learning system http://singa.apache.org Quick Start Installation Examples Issues JIRA tickets Code Analysis: Mailing Lis

The Apache Software Foundation 2.7k Jan 5, 2023
WAGMA-SGD is a decentralized asynchronous SGD for distributed deep learning training based on model averaging.

WAGMA-SGD is a decentralized asynchronous SGD based on wait-avoiding group model averaging. The synchronization is relaxed by making the collectives externally-triggerable, namely, a collective can be initiated without requiring that all the processes enter it. It partially reduces the data within non-overlapping groups of process, improving the parallel scalability.

Shigang Li 6 Jun 18, 2022
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Horovod Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make dis

Horovod 12.9k Jan 7, 2023
Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray

A unified Data Analytics and AI platform for distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray What is Analytics Zoo? Analytics Zo

null 2.5k Dec 28, 2022
A high performance and generic framework for distributed DNN training

BytePS BytePS is a high performance and general distributed training framework. It supports TensorFlow, Keras, PyTorch, and MXNet, and can run on eith

Bytedance Inc. 3.3k Dec 28, 2022
Distributed scikit-learn meta-estimators in PySpark

sk-dist: Distributed scikit-learn meta-estimators in PySpark What is it? sk-dist is a Python package for machine learning built on top of scikit-learn

Ibotta 282 Dec 9, 2022