Scale complex AI/ML pipelines anywhere
CodeFlare is a framework to simplify the integration, scaling and acceleration of complex multi-step analytics and machine learning pipelines on the cloud.
Its main features are:
- Pipeline execution and scaling: CodeFlare Pipelines facilitates the definition and parallel execution of pipelines. It unifies pipeline workflows across multiple frameworks while providing nearly optimal scale-out parallelism on pipelined computations.
- Deploy and integrate anywhere: CodeFlare simplifies deployment and integration by enabling a serverless user experience through integration with Red Hat OpenShift and IBM Cloud Code Engine, and by providing adapters and connectors that make it simple to load data and connect to data services.
Release status
This project is under active development. See the Documentation for design descriptions and the latest version of the APIs.
Quick start
Run on your laptop
Installing locally
CodeFlare can be installed from PyPI.
Prerequisites:
- Python 3.7 or 3.8
- JupyterLab (to run examples)
We recommend installing Python 3.8.6 using pyenv. Recommended steps for setting up the Python environment can be found here.
Install from PyPI:
pip3 install --upgrade pip # CodeFlare requires pip >21.0
pip3 install --upgrade codeflare
Alternatively, you can build and install from source:
git clone https://github.com/project-codeflare/codeflare.git
cd codeflare
pip3 install --upgrade pip
pip3 install .
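Before installing, it can help to confirm that the interpreter and pip meet the prerequisites listed above. The following is a minimal sketch and not part of CodeFlare: the helper names (`parse_version`, `python_supported`, `pip_new_enough`, `installed_pip_version`) are hypothetical, and it assumes `python -m pip --version` prints a line of the form `pip <version> from ...`.

```python
import re
import subprocess
import sys

SUPPORTED_PYTHONS = {(3, 7), (3, 8)}  # taken from the prerequisites list
MIN_PIP_EXCLUSIVE = (21, 0, 0)        # "pip >21.0" means strictly newer than 21.0


def parse_version(text, width=3):
    """Turn a leading dotted version such as '21.1.2' into a zero-padded int tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    parts = [int(p) for p in match.group(0).split(".")[:width]]
    return tuple(parts + [0] * (width - len(parts)))


def python_supported(version_info):
    """Return True if the (major, minor) pair is one of the supported versions."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


def pip_new_enough(version_text):
    """Return True if the pip version is strictly newer than 21.0."""
    return parse_version(version_text) > MIN_PIP_EXCLUSIVE


def installed_pip_version():
    """Return the version reported by `python -m pip --version`, or None."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True, text=True, check=False,
    )
    fields = result.stdout.split()
    # Assumed output shape: "pip 21.1.2 from /path/to/pip (python 3.8)"
    return fields[1] if result.returncode == 0 and len(fields) > 1 else None


if __name__ == "__main__":
    print("Python OK:", python_supported(sys.version_info))
    pip_version = installed_pip_version()
    print("pip OK:", pip_version is not None and pip_new_enough(pip_version))
```

If the pip check reports `False`, running `pip3 install --upgrade pip` as shown above should bring pip up to a suitable version.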
Using Docker
You can try CodeFlare by running the Docker image from Docker Hub:
- projectcodeflare/codeflare:latest has the latest released version installed.
The command below starts this image in a clean environment:
docker run --rm -it -p 8888:8888 projectcodeflare/codeflare:latest
It should produce output similar to the following, which contains the URL for running CodeFlare from a Jupyter notebook in your local browser.
[I ServerApp] Jupyter Server is running at:
...
[I ServerApp] http://127.0.0.1:8888/lab
Using Binder service
You can try out some of CodeFlare's features using the My Binder service.
Click on the link below to try CodeFlare in a sandbox environment, without having to install anything.
Pipeline execution and scaling
CodeFlare Pipelines reimagines pipelines to provide a more intuitive API for data scientists to create AI/ML pipelines, data workflows, and pre-processing and post-processing tasks, all of which can scale seamlessly from a laptop to a cluster.
See the API documentation here, and reference use case documentation in the Examples section.
A set of reference examples is provided as executable notebooks.
To run the examples, clone the CodeFlare project if you haven't done so yet:
git clone https://github.com/project-codeflare/codeflare.git
Example notebooks require JupyterLab, which can be installed with:
pip3 install --upgrade jupyterlab
Use the command below to run locally:
jupyter-lab codeflare/notebooks/<example_notebook>
The step above should automatically open a browser window and connect to a running Jupyter server.
If you are using one of the recommended cloud-based deployments (see below), the examples are in the codeflare/notebooks directory of the container image and can be executed directly from the Jupyter environment.
As a first example of the API usage, see the sample pipeline.
For an example of how CodeFlare Pipelines can be used to scale out common machine learning problems, see the grid search example. It shows how hyperparameter optimization for a reference pipeline can be scaled and accelerated with both task and data parallelism.
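To make the idea of task parallelism in a grid search concrete, here is a minimal sketch that uses only the Python standard library. It does not use the CodeFlare Pipelines API and is not how the grid search example is implemented. The `score` function is a hypothetical stand-in for fitting and evaluating one pipeline configuration, and `grid` and `parallel_grid_search` are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def score(params):
    """Hypothetical stand-in for fitting and scoring one pipeline configuration."""
    # Toy objective whose maximum is at depth=4, rate=0.1.
    return -((params["depth"] - 4) ** 2) - (params["rate"] - 0.1) ** 2


def grid(space):
    """Expand {'name': [values, ...]} into one dict per parameter combination."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


def parallel_grid_search(space, scorer, max_workers=4):
    """Score every combination concurrently and return (best_params, best_score)."""
    candidates = grid(space)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so scores[i] belongs to candidates[i].
        scores = list(pool.map(scorer, candidates))
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


if __name__ == "__main__":
    space = {"depth": [2, 4, 8], "rate": [0.01, 0.1, 1.0]}
    print(parallel_grid_search(space, score))
```

Each parameter combination is an independent task, which is why this kind of search scales out well. A thread pool keeps the sketch short; for CPU-bound model fitting, the same structure would normally run on processes or a cluster rather than threads.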
Deploy and integrate anywhere
Unleash the power of pipelines by seamlessly scaling on the cloud. CodeFlare can be deployed on any Kubernetes-based platform, including IBM Cloud Code Engine and Red Hat OpenShift Container Platform.
- IBM Cloud Code Engine: detailed instructions on how to run CodeFlare on a serverless platform.
- Red Hat OpenShift: detailed instructions on how to run CodeFlare on OpenShift Container Platform.
Contributing
Join us in making CodeFlare better! We encourage you to take a look at our Contributing page.
Blog
CodeFlare-related blog posts are published on our Medium publication.
License
CodeFlare is an open-source project with an Apache 2.0 license.