FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning

Scaleout

Last update: Nov 9, 2022

Overview

What is FEDn?

FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning (FedML) developed and maintained by Scaleout Systems. FEDn enables highly scalable cross-silo and cross-device use-cases over FEDn networks.

Core Features

FEDn lets you seamlessly go from local development of a federated model in a pseudo-distributed sandbox to live production deployments in distributed, heterogeneous environments. Three key design objectives are guiding the project:

An ML-framework agnostic black-box design

Client model updates and model validations are treated as black-box computations. This means that it is possible to support virtually any ML model type or framework. Support for Keras and PyTorch artificial neural network models is available out-of-the-box, and support for many other model types, including select models from SKLearn, is in active development. A developer follows a structured design pattern to implement clients, and there is a lot of flexibility in the toolchains used.

Horizontally scalable through a tiered aggregation scheme

FEDn is designed to allow for flexible and easy horizontal scaling to handle growing numbers of clients and to meet latency and throughput requirements. This is achieved by a tiered architecture where multiple independent combiners divide up the work to coordinate client updates and aggregation. Recent benchmarks show high performance both for thousands of clients in a cross-device setting and for 40 clients with large model updates (1GB) in a cross-silo setting, see https://arxiv.org/abs/2103.00148.
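The tiered scheme can be illustrated with a small sketch. It assumes a FedAvg-style weighted average (each update weighted by its number of training examples), which is an assumption made for illustration and not necessarily the aggregator FEDn ships with. The function names are hypothetical. Each inner group plays the role of one combiner, and the outer step plays the role of the reducer.

```python
from typing import List, Sequence, Tuple

# A model update: (flat weight vector, number of training examples behind it).
Update = Tuple[List[float], int]


def weighted_average(updates: Sequence[Update]) -> Update:
    """Average weight vectors, weighting each one by its example count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    averaged = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return averaged, total


def tiered_average(groups: Sequence[Sequence[Update]]) -> Update:
    """Aggregate per group (combiner tier), then across groups (reducer tier)."""
    combiner_models = [weighted_average(group) for group in groups]
    return weighted_average(combiner_models)


if __name__ == "__main__":
    clients = [([1.0, 2.0], 10), ([3.0, 4.0], 30), ([5.0, 6.0], 20), ([7.0, 8.0], 40)]
    flat, _ = weighted_average(clients)
    tiered, _ = tiered_average([clients[:2], clients[2:]])
    print("flat:  ", flat)
    print("tiered:", tiered)
```

Because every combiner passes its total example count upward, the two-tier result matches the flat average up to floating-point rounding. Under this assumption, the work can be split across combiners without changing the aggregated model.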

Built for real-world distributed computing scenarios

FEDn is built from the ground up to support real-world production deployments in the distributed cloud. FEDn relies on proven best practices in distributed computing and incorporates enterprise security features. A central assumption is that data clients should not have to expose any ingress ports.

More details about architecture and implementation can be found in the Documentation.

Getting started

The easiest way to start with FEDn is to use the provided docker-compose templates to launch a pseudo-distributed environment consisting of one Reducer, one Combiner, and a few Clients. Together with the supporting storage and database services this makes up a minimal system for developing a federated model and learning the FEDn architecture. FEDn projects are templated projects that contain the user-provided model application components needed for federated training, referred to as the compute package. We bundle two such test projects in the 'test' folder, and many more are available in external repositories. These projects can be used as templates for creating your own custom federated model.

Clone the repository (make sure to use git-lfs!) and follow these steps:

Pseudo-distributed deployment

We provide docker-compose templates for a minimal standalone, pseudo-distributed Docker deployment, useful for local testing and development on a single host machine.

Create a default docker network

We need to make sure that all services deployed on our single host can communicate on the same docker network. Therefore, our provided docker-compose templates use a default external network 'fedn_default'. First, create this network:

$ docker network create fedn_default

Deploy the base services (Minio and MongoDB)

$ docker-compose -f config/base-services.yaml up

Make sure you can access the following services before proceeding to the next steps:

Minio: http://localhost:9000
Mongo Express: http://localhost:8081
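If you prefer to script this check, a minimal sketch using only the Python standard library could look like the following. The helper name is_reachable is hypothetical. It treats any HTTP response as "up", including error statuses such as a login challenge.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 401 or 404), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


if __name__ == "__main__":
    services = [
        ("Minio", "http://localhost:9000"),
        ("Mongo Express", "http://localhost:8081"),
    ]
    for name, url in services:
        state = "up" if is_reachable(url) else "not reachable"
        print(f"{name}: {state} ({url})")
```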

Start the Reducer

Copy the settings config file for the reducer, 'config/settings-reducer.yaml.template' to 'config/settings-reducer.yaml'. You do not need to make any changes to this file to run the sandbox. To start the reducer service:

$ docker-compose -f config/reducer.yaml up

Start a combiner

Copy the settings config file for the combiner, 'config/settings-combiner.yaml.template' to 'config/settings-combiner.yaml'. You do not need to make any changes to this file to run the sandbox. To start the combiner service and attach it to the reducer:

$ docker-compose -f config/combiner.yaml up

Make sure that you can access the Reducer UI at https://localhost:8090 and that the combiner is up and running before proceeding to the next step. You should see the combiner listed on https://localhost:8090/network.

Train a federated model

Training a federated model on the FEDn network involves uploading a compute package (containing the code that will be distributed to clients), seeding the federated model with a base model (untrained or pre-trained), and then attaching clients to the network. Follow the instructions here to set the environment up to train a model for digits classification using the MNIST dataset:

https://github.com/scaleoutsystems/fedn/blob/master/test/mnist-keras/README.md

Updating/changing the compute package and/or the seed model

By design, it is not possible to simply delete the compute package to reset the model. This is a security constraint that prevents arbitrary code replacement in an already configured federation. To restart and reseed the alliance in development mode, navigate to MongoExpress (http://localhost:8081), log in (credentials are found/set in config/base-services.yaml), delete the entire collection 'fedn-test-network', and then restart all services.

Using FEDn in STACKn (relies on Kubernetes)

STACKn, Scaleout's cloud-native (Fed)MLOps platform, lets a user set up, monitor and manage FEDn networks (base services, reducer and combiners) in Kubernetes as 'Apps' deployed from a WebUI. STACKn also provides useful additional functionality such as Jupyter Labs, storage management, and model serving for the federated model using e.g. TensorFlow Serving, TorchServe, MLflow or custom serving. Refer to the STACKn documentation to set it up on your own cluster, or sign up on the waiting list for a private-beta SaaS deployment at https://scaleoutsystems.com/.

Fully distributed deployment

The deployment, sizing of nodes, and tuning of a FEDn network in production depend heavily on the use case (cross-silo, cross-device, etc.), the size of model updates, the available infrastructure, and the strategy to provide end-to-end security. We provide instructions for a fully distributed reference deployment here: Distributed deployment.

Where to go from here

Additional example projects/clients:

PyTorch version of the MNIST getting-started example in test/mnist-pytorch
Sentiment analysis with a Keras CNN-lstm trained on the IMDB dataset (cross-silo): https://github.com/scaleoutsystems/FEDn-client-imdb-keras
Sentiment analysis with a PyTorch CNN trained on the IMDB dataset (cross-silo): https://github.com/scaleoutsystems/FEDn-client-imdb-pytorch.git
VGG16 trained on cifar-10 with a PyTorch client (cross-silo): https://github.com/scaleoutsystems/FEDn-client-cifar10-pytorch
Human activity recognition with a Keras CNN based on the casa dataset (cross-device): https://github.com/scaleoutsystems/FEDn-client-casa-keras
Fraud detection with a Keras auto-encoder (ANN encoder): https://github.com/Li-Ju666/FEDn-client-fraud_keras

Support

For more details please check out the FEDn documentation (https://scaleoutsystems.github.io/fedn/). If you do not find the information that you're looking for, have a bug report, or a feature request, start a ticket directly here on GitHub, or reach out to Scaleout (https://scaleoutsystems.com) to inquire about Enterprise support.

Making contributions

All pull requests will be considered and are much appreciated. We are currently managing issues and the release roadmap in an external tracker (Jira). Reach out to one of the maintainers if you are interested in making contributions, and we will help you find a good first issue to get you started.

For development, it is convenient to use the docker-compose templates config/reducer-dev.yaml and config/combiner-dev.yaml. These files let you rebuild the reducer and combiner images with the current local version of the fedn source tree instead of the latest stable release. You might also want to use a Dockerfile for the client that installs fedn from your local clone of FEDn or, alternatively, mounts the source.

License

FEDn is licensed under Apache-2.0 (see LICENSE file for full information).

Comments

New feature json web tokens - related to old issue SK-16
Status

[x] Ready

[ ] Draft

[ ] Hold

Description

Included in this pr:

A token is generated (encoded) with JSON Web Tokens (JWT) from a payload that includes the expiry date and the current date.

The expiry date is set to 90 days.

Adding the boolean flag --token generates the token; it defaults to false.

If a token has been generated, the combiner and clients must send the token in a request header.

If no token is used, the reducer ignores the header.

The signing algorithm used for encoding and decoding is HS256.

Note: the cli/tests do not currently work due to a wrong import reference (no parent package; cli is not included in the fedn module).

Note: a merge with develop is needed because of PR #341.
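For readers unfamiliar with the scheme, the token handling described in this PR can be sketched with the Python standard library alone. This is an illustration only, not the code in the PR, which presumably uses a JWT library. The claim names iat and exp are the standard JWT ones and are assumed here, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

NINETY_DAYS = 90 * 24 * 60 * 60  # expiry window in seconds


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, signing_input: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def encode_token(secret: str, now: Optional[int] = None) -> str:
    """Build an HS256-signed token whose payload carries issue and expiry times."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"iat": issued, "exp": issued + NINETY_DAYS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(secret, signing_input))}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> bool:
    """Check the signature first, then check that the token has not expired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(secret, f"{header_b64}.{payload_b64}")
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return False
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Wrong number of segments, bad base64, bad JSON, or non-ASCII input.
        return False
    current = int(time.time()) if now is None else now
    return current < payload.get("exp", 0)


if __name__ == "__main__":
    token = encode_token("my-secret-phrase")
    print(token)
    print("valid:", verify_token(token, "my-secret-phrase"))
```

A combiner or client would then send the token in a request header, and the reducer would run the equivalent of verify_token before accepting the request.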

Types of changes

What types of changes does your code introduce to FEDn?

[ ] Hotfix (fixing a critical bug in production)

[ ] Bugfix

[x] New feature

[ ] Documentation update

Checklist

If you're unsure about any of the items below, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

[x] This pull request is against develop branch (not applicable for hotfixes)

[ ] I have included a link to the issue on GitHub or JIRA (if any)

[ ] I have included migration files (if there are changes to the model classes)

[ ] I have read the CONTRIBUTING doc

[ ] I have included tests to complement my changes

[ ] I have updated the related documentation (if necessary)

[ ] I have added a reviewer for this pull request

[ ] I have added myself as an author for this pull request

Further comments

Anything else you think we should know before merging your code!
security
opened by Wrede 10
Setup Sphinx auto documentation, some PEP8 fixes, added docstrings
Status

[ ] Ready

[x] Draft

[ ] Hold

Description

The Sphinx autodoc system is in, along with some PEP-8 fixes and docstrings set in the right format. Please note that a build scheme to compile and deploy the API docs to a webpage will need to be set up in due course.

Types of changes

What types of changes does your code introduce to FEDn?

[ ] Hotfix (fixing a critical bug in production)

[ ] Bugfix

[ ] New feature

[x] Documentation update

Checklist

If you're unsure about any of the items below, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

[x] This pull request is against develop branch (not applicable for hotfixes)

[ ] I have included a link to the issue on GitHub or JIRA (if any)

[ ] I have included migration files (if there are changes to the model classes)

[ ] I have read the CONTRIBUTING doc

[ ] I have included tests to complement my changes

[x] I have updated the related documentation (if necessary)

[x] I have added a reviewer for this pull request

[x] I have added myself as an author for this pull request

Further comments

Anything else you think we should know before merging your code!
opened by prasi372 10
Example in Getting Started: keras-client does not find package KerasSequentialHelper

I am trying to follow the example given in the README, but I get an error when I come to the section "Attach two Clients to the FEDn network". I get the following message when the keras-clients are starting:

client_1 | Traceback (most recent call last):
client_1 | File "train.py", line 54, in
client_1 | from fedn.utils.kerassequential import KerasSequentialHelper
client_1 | ModuleNotFoundError: No module named 'fedn.utils.kerassequential'
client_2 | Traceback (most recent call last):
client_2 | File "train.py", line 54, in
client_2 | from fedn.utils.kerassequential import KerasSequentialHelper
client_2 | ModuleNotFoundError: No module named 'fedn.utils.kerassequential'

The current directory is "test/mnist-keras". As far as I can see, there is no KerasSequentialHelper, but there is a KerasHelper that should be called instead. I use Linux Mint 18.3.
question

opened by vesdakon 7
Upgrade mnist examples to work with TF/Keras 2.6.0

Is your feature request related to a problem? Please describe.

In Tensorflow 2.6 there is no longer a "predict_classes", so validate.py in test/mnist-keras is not working.

Describe the solution you'd like An overhaul of the client code to work with latest TF.
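As a hint at what the overhaul involves: for a model with a softmax output, the usual replacement for predict_classes is to take the arg-max over the class axis of the predicted probabilities, i.e. np.argmax(model.predict(x), axis=-1). The sketch below shows the same operation in plain Python so that it runs without TensorFlow or NumPy. The function name mirrors the removed method purely for illustration.

```python
from typing import List, Sequence


def predict_classes(probabilities: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the highest score in each row.

    Equivalent to np.argmax(probabilities, axis=-1) for a 2-D input;
    on ties the first maximum wins, as with np.argmax.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


if __name__ == "__main__":
    # Stand-in for model.predict(x): one row of class probabilities per sample.
    scores = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]
    print(predict_classes(scores))
```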

feature good first issue dependencies

opened by ahellander 5
Introduce custom baseimage override
To better streamline test projects, this suggested commit introduces a new structure that allows parameters to be overridden.

version: '3.7'
services:
  client:
    build:
      context: .
      dockerfile: components/client/Dockerfile
      args:
        baseimage: "tensorflow/tensorflow:latest"

Illustrated by an example:

docker-compose -f docker-compose.yaml -f reducer.yaml -f combiner.yaml -f client.yaml -f tensorflow.yaml up --build --scale client=5

Here, tensorflow.yaml overrides the BASEIMAGE variable passed to the Dockerfile, which determines the base image the client image is built from.

This way, the tests we construct can share the same basic structure while inheriting their base image from the selected framework (for example TensorFlow), which ensures that the execution context is present.
opened by morganekmefjord 5
Old compute package in master
Severity

[ ] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)

[x] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)

[ ] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)

[ ] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

Describe the bug The pre-built compute package for mnist-keras in current master does not match the actual client code (has not been updated in the release)

bugfix
opened by ahellander 4
Deprecated key & dynamic port warnings when starting minio service

Severity Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

Describe the bug I got the following warnings when starting the minio service. Version 0.2.4

minio_1 | WARNING: MINIO_ACCESS_KEY and MINIO_SECRET_KEY are deprecated.
minio_1 | Please use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
minio_1 | API: http://172.19.0.4:9000 http://127.0.0.1:9000
minio_1 |
minio_1 | Console: http://172.19.0.4:46547 http://127.0.0.1:46547
minio_1 |
minio_1 | Documentation: https://docs.min.io
minio_1 |
minio_1 | WARNING: Console endpoint is listening on a dynamic port (46547), please use --console-address ":PORT" to choose a static port.
dependencies

opened by tian-cthit 4
Add message about package uploading on client log to avoid connection errors

Is your feature request related to a problem? Please describe. After starting the reducer and combiners, we get this error when starting clients:

client_1 | Asking for assignment
client_1 | 09/25/2021 11:50:08 AM [connectionpool.py:971] Starting new HTTPS connection (1): 130.238.29.53:8090
client_1 | 2021-09-25 11:50:08,297 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): 130.238.29.53:8090
client_1 | 09/25/2021 11:50:08 AM [connectionpool.py:452] https://130.238.29.53:8090 "GET /assign?name=client00ba921f HTTP/1.1" 200 24

This will confuse the user. However, it is only a package problem: the compute package must be uploaded before clients are started. The suggestion is therefore to print a clear message in the client log that asks the user to upload the package.


opened by aitmlouk 3
Timestamp on combiner log

Is your feature request related to a problem? Please describe. Why not add a timestamp to the combiner log, as in the reducer log? It would at least help to track errors during long training runs.

Describe the solution you'd like Instead of having this: combiner_1 | COMBINER(FEDn_Combiner_addi):0 COMBINER: waiting for model updates: 0 of 4 completed.

it's better to have this: combiner_1 | COMBINER(FEDn_Combiner_addi): 09/24/2021 09:38:58 AM 0 COMBINER: waiting for model updates: 0 of 4 completed.
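One way to get a timestamp in this format is the standard logging module with a datefmt. The sketch below is not the combiner's actual logging setup. The logger name and the helper build_combiner_logger are hypothetical, and only the date format is taken from the example line above.

```python
import logging

# Produces lines like: "<logger name>: 09/24/2021 09:38:58 AM <message>"
COMBINER_FORMATTER = logging.Formatter(
    fmt="%(name)s: %(asctime)s %(message)s",
    datefmt="%m/%d/%Y %I:%M:%S %p",
)


def build_combiner_logger(name: str) -> logging.Logger:
    """Return a logger that prefixes every message with a timestamp."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(COMBINER_FORMATTER)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


if __name__ == "__main__":
    log = build_combiner_logger("COMBINER(FEDn_Combiner_addi)")
    log.info("waiting for model updates: %d of %d completed.", 0, 4)
```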

Describe any drawbacks (if any) None


opened by aitmlouk 3
Improved visualization of FEDn network graph

Is your feature request related to a problem? Please describe.

It is often helpful to have a graphical representation of the FEDn network. We have a simple version now in the /network view, but this should be improved to better handle multiple combiners. Also, it would be nice if the actual client names/ids were shown, as well as their status (online/offline).
feature good first issue

opened by ahellander 3
Model upload error
Severity

[x] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)

[ ] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)

[ ] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)

[ ] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

Describe the bug When I try to upload the mnist-keras or mnist-pytorch models at 'https://localhost:8090/control', it gives an HTTP 500 server error. The log of the reducer contains the following error message:

model = self.load_model(path)
reducer_1 | File "/app/fedn/fedn/utils/pytorchhelper.py", line 31, in load_model
reducer_1 | b = np.load(path)
reducer_1 | File "/usr/local/lib/python3.8/site-packages/numpy/lib/npyio.py", line 457, in load
reducer_1 | raise ValueError("Cannot load file containing pickled data "
reducer_1 | ValueError: Cannot load file containing pickled data when allow_pickle=False

It seems that numpy tries to load a pickled file with the allow_pickle=False option. This is for PyTorch, but the same error comes with the Keras model as well.

Environment: Ubuntu 18.04

opened by sallogy 3
Metrics menu dropdown broken in develop
Severity

[ ] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)

[ ] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)

[x] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)

[ ] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

Describe the bug After moving the validation metric plot to the /models view the dropdown where a user can choose which validation metric to plot is broken.
bugfix
opened by ahellander 0
Make the pytorch helper the default

Is your feature request related to a problem? Please describe. Since we are now basing the quickstart tutorial on the PyTorch version of the MNIST example, it would reduce the risk of mistakes if we made the PyTorch helper the default in the UI.
enhancement

opened by ahellander 0
Add an additional quickstart tutorial that does not use docker and docker-compose

Is your feature request related to a problem? Please describe. User understanding of how FEDn works might be obscured by the level of automation in the current examples that relies on Docker and docker-compose.

Describe the solution you'd like To complement the existing, developer-oriented examples, we should add a step-by-step guide that only uses venv.

How would the solution positively affect the functionality? Easier to grasp what each service is doing.
enhancement

opened by ahellander 0
Simplify the example / quickstart logic for users with more limited docker experience

Is your feature request related to a problem? Please describe.

We have been going back and forth on this one, trading off the ability to easily automate launching an arbitrary number of clients for the MNIST quickstart examples. Right now it is automated, but this comes at the price that we are using the docker library in the entrypoint, and a relatively complex setup with docker-compose (overriding a base file). To strike a better balance here, we will make the number of clients more static in the default configuration, with examples / docs for how to automate scaling of the clients for testing purposes.

Describe the solution you'd like We provide a docker-compose file that "manually" creates 3 clients (each with a different, human-readable name), and the default data partition should be 3.

How would the solution positively affect the functionality? It will make the different steps involved more transparent and we can remove the docker library dependency in the entrypoint.

Describe any drawbacks (if any) It will create more work to automate launching arbitrary number of clients.
enhancement

opened by ahellander 0
Network graph fails to render on newer bokeh, networkx
Severity

[ ] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)

[ ] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)

[ ] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)

[x] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

Describe the bug The graph is not shown, due to this error:

fedn-reducer-1 | ValueError: failed to validate StaticLayoutProvider(id='p1074', ...).graph_layout: expected an element of Dict(Int, Seq(Any)), got {'combiner': array([-0.0008935, 0.0017451]), 'reducer': array([ 0.45759661, -0.88965251]), '637d40c4e93300a65d6c1bb7': array([0.54329689, 0.84086893]), '637d40c4e93300a65d6c1bb9': array([-1. , 0.04703848])}

bugfix
opened by ahellander 0

Releases(v0.4.0-beta.1)

v0.4.0-beta.1(Dec 22, 2022)
What's Changed

Feature/SK-211 Enable toggle ssl by @ahellander in https://github.com/scaleoutsystems/fedn/pull/417

Fix/Bump tensorflow from 2.7.1 to 2.7.2 in /examples/mnist-keras by @dependabot in https://github.com/scaleoutsystems/fedn/pull/422

Feature/issue#393 Add python 3.10 to test matrix by @ahellander in https://github.com/scaleoutsystems/fedn/pull/424

Feature/SK-227 Add option to force SSL by @Wrede in https://github.com/scaleoutsystems/fedn/pull/426

Feature/SK-229 Add docker metadata to build tags by @Wrede in https://github.com/scaleoutsystems/fedn/pull/427

Bug/Fix error in deployment docs by @ahellander in https://github.com/scaleoutsystems/fedn/pull/425

Bug/Fix config download error for Python 3.10 by @ahellander in https://github.com/scaleoutsystems/fedn/pull/435

Feature/SK-248 Enable CA signed cert for GRPC channel by @Wrede in https://github.com/scaleoutsystems/fedn/pull/439

Bug/Fetch ca cert via ssl package by @Wrede in https://github.com/scaleoutsystems/fedn/pull/440

Feature/436 improvements to the UI/dashboard by @ahellander in https://github.com/scaleoutsystems/fedn/pull/443

Feature/438 Enables config of all settings in client.yaml (mirror CLI) by @ahellander in https://github.com/scaleoutsystems/fedn/pull/441

Fix/pin bokeh dep, resolves #444, filter graph to only plot online client… by @ahellander in https://github.com/scaleoutsystems/fedn/pull/445

Feature/446 Improves the example docs by @ahellander in https://github.com/scaleoutsystems/fedn/pull/447

Full Changelog: https://github.com/scaleoutsystems/fedn/compare/v0.3.3...v0.4.0-beta.1
v0.3.3(Aug 3, 2022)
What's new?

UI token authentication: #388

Dispatcher commands are now wrapped in a shell: #390

MD5 checksum replaced with SHA256: #405

Documentation is now versioned and moved to RTD: #408, #409, #410

Code quality: #394, #398, #401, #399, #404, #406, #407

Minor improvements: #395, #403
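For reference, a SHA256 file checksum of the kind mentioned above can be computed with hashlib as follows. This is a generic sketch and not the code from #405. The function name is hypothetical, and the demo writes a throwaway temporary file that stands in for whatever file is being checked.

```python
import hashlib
import os
import tempfile


def sha256_checksum(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Write a small temporary file and print its checksum.
    fd, demo_path = tempfile.mkstemp()
    try:
        os.write(fd, b"example package contents")
        os.close(fd)
        print(sha256_checksum(demo_path))
    finally:
        os.remove(demo_path)
```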

Bugfixes

Pin minio version in compose file: #382

Minor fixes: #387, #389, #397, #396, #400

Fix broken CI and issues introduced by linting: #416

Other

Documentation fixes and improvements: #378, #379, #381, #402, #392, #411

Patch for compatibility with Gramine LibOS: #380

Images are now solely stored on GH registry: #384

v0.3.3b2(Jul 7, 2022)
What's new?

UI token authentication: #388

Dispatcher commands are now wrapped in a shell: #390

MD5 checksum replaced with SHA256: #405

Documentation is now versioned and moved to RTD: #408, #409, #410

Code quality: #394, #398, #401, #399, #404, #406, #407

Minor improvements: #395, #403

Bugfixes

Pin minio version in compose file: #382

Minor fixes: #387, #389, #397, #396, #400

Fix broken CI and issues introduced by linting: #416

Other

Documentation fixes and improvements: #378, #379, #381, #402, #392, #411

Patch for compatibility with Gramine LibOS: #380

Images are now solely stored on GH registry: #384

v0.3.3b1(Jul 5, 2022)
What's new?

UI token authentication: #388

Dispatcher commands are now wrapped in a shell: #390

MD5 checksum replaced with SHA256: #405

Documentation is now versioned and moved to RTD: #408, #409, #410

Code quality: #394, #398, #401, #399, #404, #406, #407

Minor improvements: #395, #403

Bugfixes

Pin minio version in compose file: #382

Minor fixes: #387, #389, #397, #396, #400

Other

Documentation fixes and improvements: #378, #379, #381, #402, #392, #411

Patch for compatibility with Gramine LibOS: #380

Images are now solely stored on GH registry: #384

v0.3.2(Mar 10, 2022)
What's new?

Token (single) authentication: fedn run reducer --secret-key=<your-secret-phrase> generates a token that combiners and clients must present in order to be assigned to the network. Clients and combiners then authenticate via fedn run client --token=<generated-token> or by specifying the "token" key in the settings YAML file.

The compute package no longer has to be set up via the web UI: with fedn run reducer --local-package, clients use a local compute package instead, i.e. no remote compute package is downloaded.

VS Code devcontainer has been added

Bugfixes

Fixes an issue where a client could connect before a seed (initial) model has been uploaded

Other

Various minor fixes

Documentation

v0.3.1(Dec 20, 2021)
What's new?

New resilience feature - clients will now attempt automatic reassignment if their associated combiner goes missing for more than 60s (default, configurable on the CLI).

Refactor of Client initialization.
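The reassignment rule can be pictured as a simple heartbeat watchdog. The sketch below is only an illustration of the idea, not the client implementation. The class name and methods are hypothetical, and only the 60-second default comes from the note above.

```python
import time
from typing import Callable


class CombinerWatchdog:
    """Tracks the last sign of life from a combiner and flags when to reassign."""

    def __init__(self, timeout: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self._clock = clock          # injectable so the logic can be tested without sleeping
        self._last_seen = clock()

    def heartbeat(self) -> None:
        """Call whenever the combiner is heard from."""
        self._last_seen = self._clock()

    def should_reassign(self) -> bool:
        """True once the combiner has been silent for longer than the timeout."""
        return self._clock() - self._last_seen > self.timeout


if __name__ == "__main__":
    watchdog = CombinerWatchdog(timeout=60.0)
    print("reassign now?", watchdog.should_reassign())
```

A client loop would call heartbeat() on every message received from its combiner and request a new assignment once should_reassign() returns True.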

Bugfixes

Fixes an issue where training requests to clients are delayed after a client has disconnected from combiner.

Other

Various minor fixes

Documentation

v0.3.0(Dec 3, 2021)
What's new?

It is now possible to start clients as trainers, validators, or both (default).

Clients now execute training and validation events sequentially, which improves client stability for large models.

Improved visualization of the network graph.

Clients and their status are now listed on the network page.

Refactoring of the combiner aggregator API, making it easier to extend with your own aggregator.

Other

Minor bugfixes and stability improvements.

Improved documentation.

v0.2.5(Aug 27, 2021)
What's new?

The examples previously residing in 'test' have been refactored into a separate repository: https://github.com/scaleoutsystems/examples

Docker-compose templates for Minio upgraded to support latest version

Other

Documentation updated

Introduce Discord community server

Updated dependency to conform with new minio versions.

v0.2.4(Jun 15, 2021)
What's new?

Introduced a new events view.

Introduced a new view for the network layout (reducer, combiner and client hierarchy).

Introduced a new setup guide phase to ensure prerequisites such as the package and model are set before starting execution.

Introduced a better form for parameter selection on run configuration.

Introduced async dispatching of run configurations.

Introduced async update refresh of several important fields for user convenience, such as status, events and network hierarchy.

Introduced a new download-client-config function to allow for faster and more convenient client configuration. (Just download the config, point your local client to it, and voilà! You are online in this federation.)

Other

Fixed logic bugs related to framework persistence.

Fixed a logic bug causing clients to get assigned prior to compute package assignment (and hence will not account for assignment policy).

Fixed a logic bug when the reducer is resumed from a previous state, to ensure that the right compute package is selected.

Update dependency versions.

v0.2.3(May 19, 2021)
What's new?

Support for latest Minio

Improvements in the UI: it is no longer possible to submit jobs while in monitoring state.

Improvement of Docker image hierarchy.

Other:

Docs updates

Several bugfixes and security patches.

v0.2.2(Apr 19, 2021)
What's new?

The MNIST examples (Keras and PyTorch) have been updated so that they now bundle the example data in .npz format.

Other

Docs updates

v0.2.1(Mar 30, 2021)
What's new?

It is now possible to choose which validation metrics to plot in the Dashboard

Fixes

Plots backed by no current data are no longer shown as empty plots.

Other

Docs updates

v0.2.0(Mar 5, 2021)
What's new?

It's now possible to have examples in external repositories

Support for models constructed with the Keras Functional API

Set maximum number of clients in the settings file

combiner:
  name: combiner
  host: combiner
  port: 12080
  max_clients: 50

Added visualizations on FEDn communication performance to the dashboard

Added client allocation policy to spread the clients evenly over the combiners

Use config for s3 model commits instead of a hard-coded bucket name

Added memory management to prevent combiners from crashing

Now possible to upload the compute package through the UI

Reducer, client and combiner now have their own Dockerfile definitions

Fixes

Combiners now handle the case when all clients fail to update a model

Other

Lots of product documentation updates

v0.1.4(Nov 25, 2020)
Fixed the volume path so that combiners no longer crash from writing every temporary model to RAM and hitting the memory limit.

v0.1.3(Nov 14, 2020)
Resolved a bug in the combiner's calculation of the average, which affected smaller models and cases with few clients.
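For readers unfamiliar with what a combiner averages, the following is a minimal sketch of an equally weighted running average over client parameter vectors. It is an illustration only: the function name `running_average` and the use of plain Python lists are assumptions, and it does not reproduce FEDn's actual aggregation code or the specific bug fixed in this release.

```python
from typing import Iterable, List, Optional


def running_average(updates: Iterable[List[float]]) -> Optional[List[float]]:
    """Average equally weighted parameter vectors one update at a time.

    Hypothetical helper for illustration; not part of the FEDn API.
    Returns None when no updates were received.
    """
    average: Optional[List[float]] = None
    count = 0
    for update in updates:
        count += 1
        if average is None:
            # The first update initializes the average.
            average = list(update)
            continue
        if len(update) != len(average):
            raise ValueError("all updates must have the same length")
        # Incremental mean: avg_n = avg_{n-1} + (x_n - avg_{n-1}) / n
        average = [a + (u - a) / count for a, u in zip(average, update)]
    return average


if __name__ == "__main__":
    print(running_average([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```

Updating the mean incrementally means only one averaged model has to be held in memory at a time, and the single-client and no-client cases are handled explicitly.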

v0.1.2(Nov 12, 2020)
Additions:

Added new plot for time/round

Added CPU loads and MEM plot for all rounds

Allocate clients to accepting combiner with the least number of clients

Monitoring CPU/MEM/ROUNDS with personalized plots

Added HTML documentation templates, now accessible from FEDn
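The allocation rule listed above (assign each new client to the accepting combiner with the fewest clients) can be sketched as follows. This is a hedged illustration, not FEDn's implementation: the `Combiner` class, its fields and the `allocate` function are hypothetical names, and treating "accepting" as "below a `max_clients` limit" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Combiner:
    """Hypothetical stand-in for a combiner's bookkeeping state."""
    name: str
    max_clients: int
    n_clients: int = 0

    def accepting(self) -> bool:
        # Assumption: a combiner accepts clients until it reaches its limit.
        return self.n_clients < self.max_clients


def allocate(combiners: List[Combiner]) -> Optional[Combiner]:
    """Assign one client to the accepting combiner with the fewest clients."""
    candidates = [c for c in combiners if c.accepting()]
    if not candidates:
        return None  # No combiner can take the client right now.
    chosen = min(candidates, key=lambda c: c.n_clients)
    chosen.n_clients += 1
    return chosen


if __name__ == "__main__":
    pool = [Combiner("a", max_clients=2), Combiner("b", max_clients=1)]
    for _ in range(4):
        target = allocate(pool)
        print(target.name if target else "no combiner available")
```

Picking the least-loaded candidate spreads clients evenly across combiners, and returning `None` leaves it to the caller to decide whether to retry or reject the client.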

Fixes:

Removed flask-dashboard dependency which was previously bloating the Docker image

Combiner now handles the case when all clients fail to update the model

Removed usage of hard-coded "models" bucket for s3 model commits, now using config instead

v0.1.1(Nov 9, 2020)
Added:

Pull Request template

Bug Report and Feature Request templates

Contribution Guide

Code of Conduct

v0.1.0(Oct 13, 2020)
Major

Compute package bundling, distribution and execution.

Ability to toggle remote distribution

Reducer init sequence.

Ability to (re)initialize the reducer from the command line and settings.yaml

Control state_store

Tempfile storage backend

The combiner can now choose a tempfile backend for file storage of interim models.
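To illustrate the idea of a tempfile backend, here is a minimal sketch that keeps interim models on disk instead of in memory. It assumes models are handled as serialized bytes. The class `TempfileModelStore` and its methods are hypothetical and do not reflect FEDn's actual storage interface.

```python
import os
import tempfile
from typing import Dict


class TempfileModelStore:
    """Hypothetical store that writes interim models to temporary files."""

    def __init__(self) -> None:
        # All files live in one temporary directory that is removed on close().
        self._dir = tempfile.TemporaryDirectory()
        self._paths: Dict[str, str] = {}

    def put(self, model_id: str, data: bytes) -> None:
        """Write the serialized model to a new temporary file."""
        fd, path = tempfile.mkstemp(dir=self._dir.name)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        self._paths[model_id] = path

    def get(self, model_id: str) -> bytes:
        """Read the serialized model back from disk."""
        with open(self._paths[model_id], "rb") as f:
            return f.read()

    def delete(self, model_id: str) -> None:
        """Remove a model that is no longer needed."""
        os.remove(self._paths.pop(model_id))

    def close(self) -> None:
        """Delete the temporary directory and everything left in it."""
        self._dir.cleanup()


if __name__ == "__main__":
    store = TempfileModelStore()
    store.put("round-1-client-a", b"serialized model bytes")
    print(len(store.get("round-1-client-a")))
    store.close()
```

Storing interim models as files keeps memory use roughly constant regardless of how many client updates are pending, at the cost of extra disk I/O.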

Minor

Many performance improvements and bug fixes. See complete change log for details.

Owner

Scaleout

Solving the data isolation problem in machine learning

Federated Learning - Including common test models for federated learning, like CNN, ResNet18 and LSTM, controlled by different parsers

Federated_Learning This project includes common test models for federated lear

10 Dec 11, 2022

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

H2O H2O is an in-memory platform for distributed, scalable machine learning. H2O uses familiar interfaces like R, Python, Scala, Java, JSON and the Fl

6.1k Jan 5, 2023

🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

Cogitare is a Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python. A friendly interface for beginners and a powerful too

Cogitare - Modern and Easy Deep Learning with Python

76 Sep 30, 2022

An open framework for Federated Learning.

Welcome to Intel® Open Federated Learning Federated learning is a distributed machine learning approach that enables organizations to collaborate on m

397 Dec 27, 2022

Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Code Transformer This is an official PyTorch implementation of the CodeTransformer model proposed in: D. Zügner, T. Kirschstein, M. Catasta, J. Leskov

131 Dec 13, 2022

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

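Simplified Chinese | English PaddleRobotics PaddleRobotics is a collection of open-source robot algorithm libraries based on Paddle, including open-source algorithms for human-robot interaction, complex motion control, environment perception, and SLAM positioning and navigation. Human-robot interaction: active multimodal interaction technology TFVT-HRI. Active multimodal interaction technology uses vision, speech, touch sensors and other inputs to the robot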

185 Dec 26, 2022

PyTorch implementation of the supervised learning experiments from the paper Model-Agnostic Meta-Learning (MAML)

pytorch-maml This is a PyTorch implementation of the supervised learning experiments from the paper Model-Agnostic Meta-Learning (MAML): https://arxiv

516 Jan 5, 2023

Plato: A New Framework for Federated Learning Research

a new software framework to facilitate scalable federated learning research.

192 Jan 5, 2023

An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

170.1k Jan 4, 2023

An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

170.1k Jan 5, 2023

An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

153.2k Feb 13, 2021

The hippynn python package - a modular library for atomistic machine learning with pytorch.

The hippynn python package - a modular library for atomistic machine learning with pytorch. We aim to provide a powerful library for the training of a

37 Dec 29, 2022

Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.

Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha

435 Dec 30, 2022

Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

112 Oct 23, 2022

An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-quality pre-trained models from torchvision, MMLabs, and soon Pytorch Image Models. It orchestrates the end-to-end deep learning workflow allowing to train networks with easy-to-use robust high-performance libraries such as Pytorch-Lightning and Fastai

789 Dec 29, 2022

Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (CVPR 2022)

CCAM (Unsupervised) Code repository for our paper "CCAM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localizati

113 Dec 27, 2022

Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework

VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-

9 Sep 18, 2022

A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks A Research-oriented Federated Learning Library and Benchmark Platform

175 Dec 1, 2022

Everything you want about DP-Based Federated Learning, including Papers and Code. (Mechanism: Laplace or Gaussian, Dataset: femnist, shakespeare, mnist, cifar-10 and fashion-mnist. )

Differential Privacy (DP) Based Federated Learning (FL) Everything about DP-based FL you need is here. Code Tip: the code o

83 Dec 24, 2022
