FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning

Overview

What is FEDn?

FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning (FedML) developed and maintained by Scaleout Systems. FEDn enables highly scalable cross-silo and cross-device use-cases over FEDn networks.

Core Features

FEDn lets you seamlessly go from local development of a federated model in a pseudo-distributed sandbox to live production deployments in distributed, heterogeneous environments. Three key design objectives are guiding the project:

An ML-framework agnostic black-box design

Client model updates and model validations are treated as black-box computations. This means that it is possible to support virtually any ML model type or framework. Support for Keras and PyTorch artificial neural network models is available out-of-the-box, and support for many other model types, including select models from SKLearn, is in active development. A developer follows a structured design pattern to implement clients, and there is a lot of flexibility in the toolchains used.

Horizontally scalable through a tiered aggregation scheme

FEDn is designed to allow for flexible and easy horizontal scaling to handle growing numbers of clients and to meet latency and throughput requirements. This is achieved by a tiered architecture where multiple independent combiners divide up the work to coordinate client updates and aggregation. Recent benchmarks show high performance both for thousands of clients in a cross-device setting and for 40 clients with large model updates (1GB) in a cross-silo setting, see https://arxiv.org/abs/2103.00148.
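The tiered scheme can be illustrated with a small sketch. It assumes plain federated averaging weighted by the number of local examples, which is only one possible aggregation rule; the function names and the list-based model representation are hypothetical and are not FEDn's API. Each combiner averages the updates of its own clients, and a second tier then averages the combiner-level models.

```python
from typing import List, Sequence, Tuple

# An update is (model weights, number of local training examples).
Update = Tuple[Sequence[float], int]


def weighted_average(updates: List[Update]) -> Update:
    """Average weight vectors, weighting each one by its example count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    averaged = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return averaged, total


def tiered_aggregate(groups: List[List[Update]]) -> List[float]:
    """Aggregate in two tiers; each inner list belongs to one combiner."""
    # Tier 1: every combiner aggregates only the updates of its own clients.
    combiner_models = [weighted_average(group) for group in groups]
    # Tier 2: the combiner-level models are merged into one global model.
    global_model, _ = weighted_average(combiner_models)
    return global_model
```

Because every combiner passes its total example count upwards, the two-tier result equals the flat weighted average over all clients, while each combiner only has to handle its own share of the updates.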

Built for real-world distributed computing scenarios

FEDn is built from the ground up to support real-world production deployments in the distributed cloud. FEDn relies on proven best practices in distributed computing and incorporates enterprise security features. A central assumption is that data clients should not have to expose any ingress ports.

More details about architecture and implementation can be found in the Documentation.

Getting started

The easiest way to start with FEDn is to use the provided docker-compose templates to launch a pseudo-distributed environment consisting of one Reducer, one Combiner, and a few Clients. Together with the supporting storage and database services this makes up a minimal system for developing a federated model and learning the FEDn architecture. FEDn projects are templated projects that contain the user-provided model application components needed for federated training, referred to as the compute package. We bundle two such test projects in the 'test' folder, and many more are available in external repositories. These projects can be used as templates for creating your own custom federated model.

Clone the repository (make sure to use git-lfs!) and follow these steps:

Pseudo-distributed deployment

We provide docker-compose templates for a minimal standalone, pseudo-distributed Docker deployment, useful for local testing and development on a single host machine.

  1. Create a default docker network

We need to make sure that all services deployed on our single host can communicate on the same docker network. Therefore, our provided docker-compose templates use a default external network 'fedn_default'. First, create this network:

$ docker network create fedn_default

  2. Deploy the base services (Minio and MongoDB)

$ docker-compose -f config/base-services.yaml up

Make sure that the base services are up and reachable before proceeding to the next steps.

  3. Start the Reducer

Copy the settings config file for the reducer, 'config/settings-reducer.yaml.template', to 'config/settings-reducer.yaml'. You do not need to make any changes to this file to run the sandbox. To start the reducer service:

$ docker-compose -f config/reducer.yaml up

  4. Start a combiner

Copy the settings config file for the combiner, 'config/settings-combiner.yaml.template', to 'config/settings-combiner.yaml'. You do not need to make any changes to this file to run the sandbox. To start the combiner service and attach it to the reducer:

$ docker-compose -f config/combiner.yaml up

Make sure that you can access the Reducer UI at https://localhost:8090 and that the combiner is up and running before proceeding to the next step. You should see the combiner listed on https://localhost:8090/network.

Train a federated model

Training a federated model on the FEDn network involves uploading a compute package (containing the code that will be distributed to clients), seeding the federated model with a base model (untrained or pre-trained), and then attaching clients to the network. Follow the instructions at the link below to set up the environment for training a digit classification model on the MNIST dataset:

https://github.com/scaleoutsystems/fedn/blob/master/test/mnist-keras/README.md

Updating/changing the compute package and/or the seed model

By design, it is not possible to simply delete the compute package to reset the model. This is a security constraint that prevents arbitrary code replacement in an already configured federation. To restart and reseed the alliance in development mode, navigate to MongoExpress (http://localhost:8081), log in (credentials are found/set in config/base-services.yaml), delete the entire collection 'fedn-test-network', and then restart all services.

Using FEDn in STACKn (relies on Kubernetes)

STACKn, Scaleout's cloud native (Fed)MLOps platform, lets a user set up, monitor and manage FEDn networks (base services, reducer and combiners) in Kubernetes as 'Apps' deployed from a WebUI. STACKn also provides useful additional functionality such as Jupyter Labs, storage management, and model serving for the federated model using e.g. TensorFlow Serving, TorchServe, MLflow or custom serving. Refer to the STACKn documentation to set it up on your own cluster, or sign up on the waiting list for a private-beta SaaS deployment at https://scaleoutsystems.com/.

Fully distributed deployment

The deployment, sizing of nodes, and tuning of a FEDn network in production depend heavily on the use case (cross-silo, cross-device, etc.), the size of model updates, the available infrastructure, and the strategy to provide end-to-end security. We provide instructions for a fully distributed reference deployment here: Distributed deployment.

Where to go from here

Additional example projects/clients:

Support

For more details please check out the FEDn documentation (https://scaleoutsystems.github.io/fedn/). If you do not find the information that you're looking for, have a bug report, or a feature request, start a ticket directly here on GitHub, or reach out to Scaleout (https://scaleoutsystems.com) to inquire about Enterprise support.

Making contributions

All pull requests will be considered and are much appreciated. We are currently managing issues and the release roadmap in an external tracker (Jira). Reach out to one of the maintainers if you are interested in making contributions, and we will help you find a good first issue to get you started.

For development, it is convenient to use the docker-compose templates config/reducer-dev.yaml and config/combiner-dev.yaml. These files let you rebuild the reducer and combiner images with the current local version of the fedn source tree instead of the latest stable release. You might also want to use a Dockerfile for the client that installs fedn from your local clone of FEDn or, alternatively, mounts the source.

License

FEDn is licensed under Apache-2.0 (see LICENSE file for full information).

Comments
  • New feature json web tokens - related to old issue SK-16

    Status

    • [x] Ready
    • [ ] Draft
    • [ ] Hold

    Description

    Included in this pr:

    • A token is generated (encoded) as a JSON Web Token (JWT) based on the payload (including the expiry date and the current date)
    • The expiry date is set to 90 days
    • Adding the boolean flag --token will generate the token; it defaults to false
    • If a token has been generated, the combiner and clients must send the token in a request header
    • If no token is used, the reducer will ignore the header
    • The signing algorithm for encoding and decoding is set to HS256
    • Note: the cli/tests do not currently work due to a wrong import reference (no parent package; cli is not included in the fedn module)
    • Note: a merge with develop is needed because of PR #341
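The token scheme described in the list above can be sketched with the Python standard library alone. This is an illustration of an HS256-signed JSON Web Token with a 90-day expiry, not the code from the pull request: the claim names `iat` and `exp` follow the JWT standard, while the function names and the secret handling are assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

EXPIRY_SECONDS = 90 * 24 * 60 * 60  # 90 days, as stated in the description


def _b64url(data: bytes) -> bytes:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def _b64url_decode(data: bytes) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(data + b"=" * (-len(data) % 4))


def encode_token(secret: str, now: Optional[int] = None) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256 (HS256)."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"iat": now, "exp": now + EXPIRY_SECONDS}
    signing_input = b".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return (signing_input + b"." + _b64url(signature)).decode()


def verify_token(token: str, secret: str, now: Optional[int] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.encode().split(b".")
        expires = json.loads(_b64url_decode(payload_b64))["exp"]
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError, KeyError):
        return False  # malformed token
    expected = hmac.new(
        secret.encode(), header_b64 + b"." + payload_b64, hashlib.sha256
    ).digest()
    return hmac.compare_digest(signature, expected) and now < expires
```

A combiner or client would send the string returned by `encode_token` in a request header, and the receiving side would call `verify_token` with the same shared secret.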

    Types of changes

    What types of changes does your code introduce to FEDn?

    • [ ] Hotfix (fixing a critical bug in production)
    • [ ] Bugfix
    • [x] New feature
    • [ ] Documentation update

    Checklist

    If you're unsure about any of the items below, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

    • [x] This pull request is against develop branch (not applicable for hotfixes)
    • [ ] I have included a link to the issue on GitHub or JIRA (if any)
    • [ ] I have included migration files (if there are changes to the model classes)
    • [ ] I have read the CONTRIBUTING doc
    • [ ] I have included tests to complement my changes
    • [ ] I have updated the related documentation (if necessary)
    • [ ] I have added a reviewer for this pull request
    • [ ] I have added myself as an author for this pull request

    Further comments

    Anything else you think we should know before merging your code!

    security 
    opened by Wrede 10
  • Setup Sphinx auto documentation, some PEP8 fixes, added docstrings

    Status

    • [ ] Ready
    • [x] Draft
    • [ ] Hold

    Description

    The Sphinx autodoc system is in, along with some PEP-8 fixes and setting docstrings in the right format. Please note that a build scheme to compile and deploy the API docs to a webpage will need to be setup in due course.

    Types of changes

    What types of changes does your code introduce to FEDn?

    • [ ] Hotfix (fixing a critical bug in production)
    • [ ] Bugfix
    • [ ] New feature
    • [x] Documentation update

    Checklist

    If you're unsure about any of the items below, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

    • [x] This pull request is against develop branch (not applicable for hotfixes)
    • [ ] I have included a link to the issue on GitHub or JIRA (if any)
    • [ ] I have included migration files (if there are changes to the model classes)
    • [ ] I have read the CONTRIBUTING doc
    • [ ] I have included tests to complement my changes
    • [x] I have updated the related documentation (if necessary)
    • [x] I have added a reviewer for this pull request
    • [x] I have added myself as an author for this pull request

    Further comments

    Anything else you think we should know before merging your code!

    opened by prasi372 10
  • Example in Getting Started: keras-client does not find package KerasSequentialHelper

    I am trying to follow the example given in the README, but I get an error when I reach the section "Attach two Clients to the FEDn network". The following messages appear when the keras-clients start:

    client_1 | Traceback (most recent call last):
    client_1 | File "train.py", line 54, in
    client_1 | from fedn.utils.kerassequential import KerasSequentialHelper
    client_1 | ModuleNotFoundError: No module named 'fedn.utils.kerassequential'
    client_2 | Traceback (most recent call last):
    client_2 | File "train.py", line 54, in
    client_2 | from fedn.utils.kerassequential import KerasSequentialHelper
    client_2 | ModuleNotFoundError: No module named 'fedn.utils.kerassequential'

    The current directory is "test/mnist-keras". As far as I can see there is no KerasSequentialHelper, only a KerasHelper that should be called instead. I use Linux Mint 18.3.

    question 
    opened by vesdakon 7
  • Upgrade mnist examples to work with  TF/Keras 2.6.0

    Is your feature request related to a problem? Please describe.

    In TensorFlow 2.6 there is no longer a "predict_classes" method, so validate.py in test/mnist-keras is not working.

    Describe the solution you'd like

    An overhaul of the client code to work with the latest TF.
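For a multi-class model with a softmax output, the usual replacement for the removed "predict_classes" is to take the argmax of the predicted probabilities over the last axis (with NumPy: np.argmax(model.predict(x), axis=-1)). The sketch below shows the same operation in plain Python so that it runs without TensorFlow. The helper name is hypothetical, and this is not necessarily the change that was made to validate.py.

```python
from typing import List, Sequence


def classes_from_probabilities(probabilities: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each row, the index of the largest score (argmax over the last axis)."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]
```

Here `probabilities` stands in for the output of `model.predict(x)`, converted to nested lists.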

    feature good first issue dependencies 
    opened by ahellander 5
  • Introduce custom baseimage override

    To better streamline test projects, this suggested commit introduces a new structure that allows parameters to be overridden.

    version: '3.7'
    services:
      client:
        build:
          context: .
          dockerfile: components/client/Dockerfile
          args:
            baseimage: "tensorflow/tensorflow:latest"
    

    Illustrated by an example

    docker-compose -f docker-compose.yaml -f reducer.yaml -f combiner.yaml -f client.yaml -f tensorflow.yaml up --build --scale client=5
    

    Here, tensorflow.yaml overrides the BASEIMAGE variable passed to the Dockerfile and thereby selects the base image from which the client image is built.

    In this way, the tests we construct can share the same basic structure while inheriting the base image from, for example, tensorflow, depending on the selected framework, to ensure that the required execution context is present.

    opened by morganekmefjord 5
  • Old compute package in master

    Severity

    • [ ] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)
    • [x] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)
    • [ ] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)
    • [ ] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

    Describe the bug

    The pre-built compute package for mnist-keras in current master does not match the actual client code (it has not been updated in the release).

    bugfix 
    opened by ahellander 4
  • Deprecated key & dynamic port warnings when starting minio service

    Severity Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

    Describe the bug

    I got the following warnings when starting the minio service. Version 0.2.4

    minio_1 | WARNING: MINIO_ACCESS_KEY and MINIO_SECRET_KEY are deprecated.
    minio_1 | Please use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
    minio_1 | API: http://172.19.0.4:9000 http://127.0.0.1:9000
    minio_1 |
    minio_1 | Console: http://172.19.0.4:46547 http://127.0.0.1:46547
    minio_1 |
    minio_1 | Documentation: https://docs.min.io
    minio_1 |
    minio_1 | WARNING: Console endpoint is listening on a dynamic port (46547), please use --console-address ":PORT" to choose a static port.

    dependencies 
    opened by tian-cthit 4
  • Add message about package uploading on client log to avoid connection errors

    Is your feature request related to a problem? Please describe.

    After starting the reducer and combiners, we get this error when starting clients:

    client_1 | Asking for assignment
    client_1 | 09/25/2021 11:50:08 AM [connectionpool.py:971] Starting new HTTPS connection (1): 130.238.29.53:8090
    client_1 | 2021-09-25 11:50:08,297 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): 130.238.29.53:8090
    client_1 | 09/25/2021 11:50:08 AM [connectionpool.py:452] https://130.238.29.53:8090 "GET /assign?name=client00ba921f HTTP/1.1" 200 24

    This will confuse the user. The cause is simply a missing package: the compute package must be uploaded before clients are started. The suggestion is therefore to print a clear message in the client log that asks the user to upload the package.

    opened by aitmlouk 3
  • Timestamp on combiner log

    Is your feature request related to a problem? Please describe.

    Why not add a timestamp to the combiner log, as in the reducer log? It would at least help to track down errors during long training runs.

    Describe the solution you'd like

    Instead of having this:

    combiner_1 | COMBINER(FEDn_Combiner_addi):0 COMBINER: waiting for model updates: 0 of 4 completed.

    it would be better to have this:

    combiner_1 | COMBINER(FEDn_Combiner_addi): 09/24/2021 09:38:58 AM 0 COMBINER: waiting for model updates: 0 of 4 completed.
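One way to obtain the requested layout is a formatter from Python's standard logging module. The sketch below is an assumption about how this could be done, not the combiner's actual logging code; the function name is hypothetical, and the date format is chosen to match the timestamp in the example line.

```python
import logging


def make_combiner_logger(name: str, stream=None) -> logging.Logger:
    """Create a logger whose lines carry a timestamp after the combiner name."""
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter(
        fmt="COMBINER(%(name)s): %(asctime)s %(message)s",
        datefmt="%m/%d/%Y %I:%M:%S %p",  # e.g. 09/24/2021 09:38:58 AM
    ))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)  # call once per name to avoid duplicate lines
    logger.propagate = False
    return logger
```

Calling `make_combiner_logger("FEDn_Combiner_addi").info("0 COMBINER: waiting for model updates: 0 of 4 completed.")` then produces a line in the requested layout.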

    Describe any drawbacks (if any) None

    opened by aitmlouk 3
  • Improved visualization of FEDn network graph

    Is your feature request related to a problem? Please describe.

    It is often helpful to have a graphical representation of the FEDn network. We have a simple version now in the /network view, but this should be improved to better handle multiple combiners. Also, it would be nice if the actual client names/ids were shown, as well as their status (online/offline).

    feature good first issue 
    opened by ahellander 3
  • Model upload error

    Severity

    • [x] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)
    • [ ] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)
    • [ ] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)
    • [ ] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

    Describe the bug

    When I try to upload the mnist-keras or mnist-pytorch models at 'https://localhost:8090/control', it gives an HTTP 500 server error. The log of the reducer contains the following error message:

    model = self.load_model(path)
    reducer_1 | File "/app/fedn/fedn/utils/pytorchhelper.py", line 31, in load_model
    reducer_1 | b = np.load(path)
    reducer_1 | File "/usr/local/lib/python3.8/site-packages/numpy/lib/npyio.py", line 457, in load
    reducer_1 | raise ValueError("Cannot load file containing pickled data "
    reducer_1 | ValueError: Cannot load file containing pickled data when allow_pickle=False

    It seems that numpy tries to load a pickled file with the allow_pickle=False option. This is for PyTorch, but the same error comes with the Keras model as well.

    Environment: Ubuntu 18.04

    opened by sallogy 3
  • Metrics menu dropdown broken in develop

    Severity

    • [ ] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)
    • [ ] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)
    • [x] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)
    • [ ] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

    Describe the bug After moving the validation metric plot to the /models view the dropdown where a user can choose which validation metric to plot is broken.

    bugfix 
    opened by ahellander 0
  • Make the pytorch helper the default

    Is your feature request related to a problem? Please describe.

    Since we are now basing the quickstart tutorial on the pytorch version of the MNIST example, it would reduce the risk of mistakes if we made the pytorch helper the default in the UI.

    enhancement 
    opened by ahellander 0
  • Add an additional quickstart tutorial that does not use docker and docker-compose

    Is your feature request related to a problem? Please describe. User understanding of how FEDn works might be obscured by the level of automation in the current examples that relies on Docker and docker-compose.

    Describe the solution you'd like To complement the existing, developer-oriented examples, we should add a step-by-step guide that only uses venv.

    How would the solution positively affect the functionality? It would be easier to grasp what each service is doing.

    enhancement 
    opened by ahellander 0
  • Simplify the example / quickstart logic for users with more limited docker experience

    Is your feature request related to a problem? Please describe.

    We have been going back and forth on this one, trading off the ability to easily automate launching an arbitrary number of clients for the MNIST quickstart examples. Right now it is automated, but this comes at the price of using the Docker library in the entrypoint and a relatively complex setup with docker-compose (overriding a base file). To strike a better balance here, we will make the number of clients more static in the default configuration, with examples / docs for how to automate scaling of the clients for testing purposes.

    Describe the solution you'd like We provide a docker-compose file that "manually" creates 3 clients (each with a different, human-readable name), and the default data partition should be 3.

    How would the solution positively affect the functionality? It will make the different steps involved more transparent and we can remove the docker library dependency in the entrypoint.

    Describe any drawbacks (if any) It will create more work to automate launching arbitrary number of clients.

    enhancement 
    opened by ahellander 0
  • Network graph fails to render on newer bokeh, networkx

    Severity

    • [ ] Critical/Blocker (select if the issue makes the application unusable or causes serious data loss)
    • [ ] High (select if the issue affects a major feature and there is no workaround or the available workaround is very complex)
    • [ ] Medium (select if the issue affects a minor feature or affects a major feature but has an easy enough workaround to not cause any major inconvenience)
    • [x] Low (select if the issue doesn't significantly affect the user experience, like minor visual bugs)

    Describe the bug The graph is not shown, due to this error:

    fedn-reducer-1 | ValueError: failed to validate StaticLayoutProvider(id='p1074', ...).graph_layout: expected an element of Dict(Int, Seq(Any)), got {'combiner': array([-0.0008935, 0.0017451]), 'reducer': array([ 0.45759661, -0.88965251]), '637d40c4e93300a65d6c1bb7': array([0.54329689, 0.84086893]), '637d40c4e93300a65d6c1bb9': array([-1. , 0.04703848])}

    bugfix 
    opened by ahellander 0
Releases(v0.4.0-beta.1)
  • v0.4.0-beta.1(Dec 22, 2022)

    What's Changed

    • Feature/SK-211 Enable toggle ssl by @ahellander in https://github.com/scaleoutsystems/fedn/pull/417
    • Fix/Bump tensorflow from 2.7.1 to 2.7.2 in /examples/mnist-keras by @dependabot in https://github.com/scaleoutsystems/fedn/pull/422
    • Feature/issue#393 Add python 3.10 to test matrix by @ahellander in https://github.com/scaleoutsystems/fedn/pull/424
    • Feature/SK-227 Add option to force SSL by @Wrede in https://github.com/scaleoutsystems/fedn/pull/426
    • Feature/SK-229 Add docker metadata to build tags by @Wrede in https://github.com/scaleoutsystems/fedn/pull/427
    • Bug/Fix error in deployment docs by @ahellander in https://github.com/scaleoutsystems/fedn/pull/425
    • Bug/Fix config download error for Python 3.10 by @ahellander in https://github.com/scaleoutsystems/fedn/pull/435
    • Feature/SK-248 Enable CA signed cert for GRPC channel by @Wrede in https://github.com/scaleoutsystems/fedn/pull/439
    • Bug/Fetch ca cert via ssl package by @Wrede in https://github.com/scaleoutsystems/fedn/pull/440
    • Feature/436 improvements to the UI/dashboard by @ahellander in https://github.com/scaleoutsystems/fedn/pull/443
    • Feature/438 Enables config of all settings in client.yaml (mirror CLI) by @ahellander in https://github.com/scaleoutsystems/fedn/pull/441
    • Fix/pin bokeh dep, resolves #444, filter graph to only plot online client… by @ahellander in https://github.com/scaleoutsystems/fedn/pull/445
    • Feature/446 Improves the example docs by @ahellander in https://github.com/scaleoutsystems/fedn/pull/447

    Full Changelog: https://github.com/scaleoutsystems/fedn/compare/v0.3.3...v0.4.0-beta.1

  • v0.3.3(Aug 3, 2022)

    What's new?

    • UI token authentication: #388
    • Dispatcher commands are now wrapped in a shell: #390
    • MD5 checksum replaced with SHA256: #405
    • Documentation is now versioned and moved to RTD: #408, #409, #410
    • Code quality: #394, #398, #401, #399, #404, #406, #407
    • Minor improvements: #395, #403
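The move from MD5 to SHA256 mentioned above can be illustrated with the standard hashlib module. The release note does not say which artifacts are checksummed, so the file-based helper below (with a hypothetical name) is only an assumption about the general pattern.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A receiver would compare this digest with the one published by the sender and reject the file if the two differ.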

    Bugfixes

    • Pin minio version in compose file: #382
    • Minor fixes: #387, #389, #397, #396, #400
    • Fix broken CI and issues introduced by linting: #416

    Other

    • Documentation fixes and improvements: #378, #379, #381, #402, #392, #411
    • Patch for compatibility with Gramine LibOS: #380
    • Images are now solely stored on GH registry: #384
  • v0.3.3b2(Jul 7, 2022)

    What's new?

    • UI token authentication: #388
    • Dispatcher commands are now wrapped in a shell: #390
    • MD5 checksum replaced with SHA256: #405
    • Documentation is now versioned and moved to RTD: #408, #409, #410
    • Code quality: #394, #398, #401, #399, #404, #406, #407
    • Minor improvements: #395, #403

    Bugfixes

    • Pin minio version in compose file: #382
    • Minor fixes: #387, #389, #397, #396, #400
    • Fix broken CI and issues introduced by linting: #416

    Other

    • Documentation fixes and improvements: #378, #379, #381, #402, #392, #411
    • Patch for compatibility with Gramine LibOS: #380
    • Images are now solely stored on GH registry: #384
  • v0.3.3b1(Jul 5, 2022)

    What's new?

    • UI token authentication: #388
    • Dispatcher commands are now wrapped in a shell: #390
    • MD5 checksum replaced with SHA256: #405
    • Documentation is now versioned and moved to RTD: #408, #409, #410
    • Code quality: #394, #398, #401, #399, #404, #406, #407
    • Minor improvements: #395, #403

    Bugfixes

    • Pin minio version in compose file: #382
    • Minor fixes: #387, #389, #397, #396, #400

    Other

    • Documentation fixes and improvements: #378, #379, #381, #402, #392, #411
    • Patch for compatibility with Gramine LibOS: #380
    • Images are now solely stored on GH registry: #384
  • v0.3.2(Mar 10, 2022)

    What's new?

    • Token (single) authentication: fedn run reducer --secret-key=<your-secret-phrase> will generate a token which will be required for combiners and clients to assign to the network. Clients and combiners are then required to authenticate via fedn run client --token=<generated-token> or by specifying "token" key in settings YAML file.
    • The compute package no longer has to be set up via the web UI: with fedn run reducer --local-package, clients use a local compute package instead, i.e. no remote compute package is downloaded.
    • VS Code devcontainer has been added

    Bugfixes

    • Fixes an issue where a client could connect before a seed (initial) model has been uploaded

    Other

    • Various minor fixes
    • Documentation
  • v0.3.1(Dec 20, 2021)

    What's new?

    • New resilience feature - clients will now attempt automatic reassignment if their associated combiner goes missing for more than 60s (default, configurable on the CLI).
    • Refactor of Client initialization.

    Bugfixes

    • Fixes an issue where training requests to clients are delayed after a client has disconnected from combiner.

    Other

    • Various minor fixes
    • Documentation
  • v0.3.0(Dec 3, 2021)

    What's new?

    • It is now possible to start clients as trainer, validators, or both (default).
    • Clients now execute training and validation events sequentially, this improves client stability for large models.
    • Improved visualization of the network graph.
    • Clients and their status are now listed on the network page.
    • Refactoring of the combiner aggregator API, making it easier to extend with your own aggregator.

    Other

    • Minor bugfixes and stability improvements.
    • Improved documentation.
  • v0.2.5(Aug 27, 2021)

    What's new?

    • The examples previously residing in 'test' have been refactored into a separate repository: https://github.com/scaleoutsystems/examples
    • Docker-compose templates for Minio upgraded to support latest version

    Other

    • Documentation updated
    • Introduce Discord community server
    • Updated dependency to conform with new minio versions.
  • v0.2.4(Jun 15, 2021)

    What's new?

    • Introduced a new events view.
    • Introduced a new view for viewing network layout, (reducer, combiner and clients hierarchy)
    • Introduced a new setup guide-phase to ensure prereqs like package and model are set before starting execution.
    • Introduced a better form for parameter selection on run configuration.
    • Introduced async dispatching of run configurations.
    • Introduced async update refresh of several important fields for user convenience, like status, events, network hierarchy etc.
    • Introduced a new download-client-config function to allow for faster and more convenient client configuration. (Just download the config, point your local client to it, and voilà! You are online in this federation.)

    Other

    • Fixed logic bugs related to framework persistence.
    • Fixed a logic bug causing clients to be assigned prior to compute package assignment (and hence not accounting for the assignment policy).
    • Fixed a logic bug when the reducer is resumed from a previous state, ensuring that the right compute package is selected.
    • Updated dependency versions.
  • v0.2.3(May 19, 2021)

    What's new?

    • Support for the latest Minio.
    • UI improvements: it is no longer possible to submit jobs while in the monitoring state.
    • Improvement of Docker image hierarchy.

    Other:

    • Docs updates
    • Several bugfixes and security patches.
  • v0.2.2(Apr 19, 2021)

    What's new?

    • The MNIST examples (Keras and PyTorch) have been updated so that they now bundle the example data in .npz format.

    Other

    • Docs updates
  • v0.2.1(Mar 30, 2021)

    What's new?

    • It is now possible to choose which validation metrics to plot in the Dashboard

    Fixes

    • Plots with no current data are no longer shown as empty plots.

    Other

    • Docs updates
  • v0.2.0(Mar 5, 2021)

    What's new?

    • It's now possible to have examples in external repositories
    • Support for models constructed with the Keras Functional API
    • Set maximum number of clients in the settings file:

      combiner:
        name:
        combinerhost:
        combinerport: 12080
        max_clients: 50
    
    • Added visualizations on FEDn communication performance to the dashboard
    • Added client allocation policy to spread the clients evenly over the combiners
    • Use config for s3 model commits instead of a hard-coded bucket name
    • Memory management to prevent combiners from going down
    • Now possible to upload the compute package through the UI
    • Reducer, client and combiner now have their own Dockerfile definitions
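
The even-spread allocation policy and the `max_clients` setting listed above can be pictured together with a small sketch. It assumes a simple "least-loaded accepting combiner" rule; the function names `pick_combiner` and `allocate` and the dictionary-based bookkeeping are hypothetical and are not taken from the FEDn code base.

```python
from typing import Dict, Iterable, Optional


def pick_combiner(client_counts: Dict[str, int], max_clients: int) -> Optional[str]:
    """Return the accepting combiner with the fewest clients, or None if all are full.

    A combiner is treated as "accepting" while it has fewer than max_clients clients.
    Ties are broken by combiner name so the result is deterministic.
    """
    accepting = {name: n for name, n in client_counts.items() if n < max_clients}
    if not accepting:
        return None
    return min(accepting, key=lambda name: (accepting[name], name))


def allocate(client_ids: Iterable[str], combiners: Iterable[str], max_clients: int) -> Dict[str, str]:
    """Assign clients one by one to the least-loaded accepting combiner."""
    counts = {name: 0 for name in combiners}
    assignment: Dict[str, str] = {}
    for client_id in client_ids:
        target = pick_combiner(counts, max_clients)
        if target is None:
            break  # every combiner is at capacity; remaining clients stay unassigned
        counts[target] += 1
        assignment[client_id] = target
    return assignment
```

With two combiners and `max_clients=2`, five clients would be spread two per combiner and the fifth would be left unassigned until capacity frees up.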

    Fixes

    • Combiners now handle the case when all clients fail to update a model

    Other

    • Lots of product documentation updates
  • v0.1.4(Nov 25, 2020)

  • v0.1.3(Nov 14, 2020)

  • v0.1.2(Nov 12, 2020)

    Additions:

    • Added new plot for time/round
    • Added CPU loads and MEM plot for all rounds
    • Allocate clients to accepting combiner with the least number of clients
    • Monitoring CPU/MEM/ROUNDS with personalized plots
    • Added HTML documentation templates, now accessible from FEDn

    Fixes:

    • Removed flask-dashboard dependency which was previously bloating the Docker image
    • Combiner now handles the case when all clients fail to update the model
    • Removed usage of hard-coded "models" bucket for s3 model commits, now using config instead
  • v0.1.1(Nov 9, 2020)

  • v0.1.0(Oct 13, 2020)

    Major

    Compute package bundling, distribution and execution.

    • Ability to toggle remote distribution

    Reducer init sequence.

    • Ability to (re)initialize the reducer from the command line and settings.yaml
    • Control state_store

    Tempfile storage backend

    • The combiner can now choose a tempfile backend for file storage of interim models.
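
To make the idea of a tempfile storage backend concrete, here is a minimal sketch built on Python's standard `tempfile` module. The class name `TempfileModelStore` and its `put`/`get`/`delete` methods are hypothetical and only illustrate the pattern; they do not reproduce the combiner's actual storage interface.

```python
import os
import tempfile
from typing import Dict


class TempfileModelStore:
    """Hypothetical store that keeps serialized interim models in temporary files."""

    def __init__(self) -> None:
        # All files live in one temporary directory that is removed on close().
        self._dir = tempfile.TemporaryDirectory(prefix="interim-models-")
        self._paths: Dict[str, str] = {}

    def put(self, model_id: str, data: bytes) -> None:
        """Write the model bytes to a fresh temp file, replacing any earlier copy."""
        if model_id in self._paths:
            self.delete(model_id)
        fd, path = tempfile.mkstemp(dir=self._dir.name)
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        self._paths[model_id] = path

    def get(self, model_id: str) -> bytes:
        """Read the model bytes back; raises KeyError for unknown ids."""
        with open(self._paths[model_id], "rb") as handle:
            return handle.read()

    def delete(self, model_id: str) -> None:
        """Remove the temp file backing the given model id."""
        os.remove(self._paths.pop(model_id))

    def close(self) -> None:
        """Delete the temporary directory and everything in it."""
        self._paths.clear()
        self._dir.cleanup()
```

Keeping interim models on disk rather than in memory trades some I/O for a smaller memory footprint, which matters when model updates are large.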

    Minor

    Many performance improvements and bug fixes. See complete change log for details.

Owner
Scaleout
Solving the data isolation problem in machine learning
🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

Cogitare is a Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python. A friendly interface for beginners and a powerful too

Cogitare - Modern and Easy Deep Learning with Python 76 Sep 30, 2022
An open framework for Federated Learning.

Welcome to Intel® Open Federated Learning Federated learning is a distributed machine learning approach that enables organizations to collaborate on m

Intel Corporation 397 Dec 27, 2022
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Code Transformer This is an official PyTorch implementation of the CodeTransformer model proposed in: D. Zügner, T. Kirschstein, M. Catasta, J. Leskov

Daniel Zügner 131 Dec 13, 2022
PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

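Simplified Chinese | English PaddleRobotics PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation. Human-robot interaction: active multimodal interaction technology TFVT-HRI. Active multimodal interaction technology takes vision, speech, touch-sensor and other inputs to the robot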

null 185 Dec 26, 2022
PyTorch implementation of the supervised learning experiments from the paper Model-Agnostic Meta-Learning (MAML)

pytorch-maml This is a PyTorch implementation of the supervised learning experiments from the paper Model-Agnostic Meta-Learning (MAML): https://arxiv

Kate Rakelly 516 Jan 5, 2023
Plato: A New Framework for Federated Learning Research

a new software framework to facilitate scalable federated learning research.

System Group@Theory Lab 192 Jan 5, 2023
An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

null 170.1k Jan 4, 2023
An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

null 170.1k Jan 5, 2023
An Open Source Machine Learning Framework for Everyone

Documentation TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, a

null 153.2k Feb 13, 2021
The hippynn python package - a modular library for atomistic machine learning with pytorch.

The hippynn python package - a modular library for atomistic machine learning with pytorch. We aim to provide a powerful library for the training of a

Los Alamos National Laboratory 37 Dec 29, 2022
Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.

Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha

null 435 Dec 30, 2022
Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

The Alan Turing Institute 112 Oct 23, 2022
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-quality pre-trained models from torchvision, MMLabs, and soon Pytorch Image Models. It orchestrates the end-to-end deep learning workflow allowing to train networks with easy-to-use robust high-performance libraries such as Pytorch-Lightning and Fastai

airctic 789 Dec 29, 2022
Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (CVPR 2022)

CCAM (Unsupervised) Code repository for our paper "CCAM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localizati

Computer Vision Insitute, SZU 113 Dec 27, 2022
Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework

VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-

John 9 Sep 18, 2022
A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks A Research-oriented Federated Learning Library and Benchmark Platform

FedML-AI 175 Dec 1, 2022
Everything you want about DP-Based Federated Learning, including Papers and Code. (Mechanism: Laplace or Gaussian, Dataset: femnist, shakespeare, mnist, cifar-10 and fashion-mnist. )

Differential Privacy (DP) Based Federated Learning (FL) Everything about DP-based FL you need is here. Code Tip: the code o

wenzhu 83 Dec 24, 2022