An Industrial Grade Federated Learning Framework

Federated AI Ecosystem

Last update: Jan 9, 2023

Overview

FATE (Federated AI Technology Enabler) is an open-source project initiated by WeBank's AI Department to provide a secure computing framework that supports the federated AI ecosystem. It implements secure computation protocols based on homomorphic encryption and multi-party computation (MPC). It supports federated learning architectures and secure computation for various machine learning algorithms, including logistic regression, tree-based algorithms, deep learning and transfer learning.

https://fate.fedai.org

Federated Learning Algorithms in FATE

FATE already supports a number of federated learning algorithms, including vertical federated learning, horizontal federated learning, and federated transfer learning. More details are available in federatedml.

Installation

FATE can be installed on Linux or Mac. It currently supports two installation approaches:

Native installation: standalone and cluster deployments;
Cloud native installation using KubeFATE:
- Multi-party deployment by Docker-compose, intended for development and testing;
- Cluster (multi-node) deployment by Kubernetes.

Native installation:

Software environment: JDK 1.8+, Python 3.6, python virtualenv and mysql 5.6+

Standalone Runtime

FATE provides a Standalone runtime architecture that helps developers test FATE quickly. Standalone supports two types of deployment: a Docker version and a manual version. Please refer to the Standalone Deployment Guide.

Cluster Runtime

FATE also provides a Cluster (distributed) runtime architecture for big data scenarios. Migrating from the Standalone Runtime to the Cluster Runtime requires only configuration changes; the algorithms do not need to change. To deploy FATE on a cluster, please refer to the Cluster Deployment Guide.

KubeFATE installation:

Using KubeFATE, FATE can be deployed by either Docker-Compose or Kubernetes:

For development or testing purposes, Docker-Compose is recommended. It only requires a Docker environment. For more details, please refer to Deployment by Docker Compose.
For a production or large-scale deployment, Kubernetes is recommended as the underlying infrastructure to manage a FATE cluster. For more details, please refer to Deployment on Kubernetes.

More instructions can be found in the repo of KubeFATE.

FATE-Client

FATE-Client is a tool for easy interaction with FATE. We strongly recommend installing FATE-Client and using it alongside FATE. Please refer to this document for more details on FATE-Client.

Running Tests

A script that runs all the unit tests is provided in the ./python/federatedml/test folder.

Once FATE is installed, tests can be run using:

$ sh ./python/federatedml/test/run_test.sh

All the unit tests should pass if FATE is installed properly.

Documentation

Quick Start Guide

A tutorial of getting started with modeling tasks can be found here.

Obtaining Model and Checking out Results

Functions such as tracking component output models or logs can be invoked by a tool called fate-flow. The deployment and usage of fate-flow can be found here.

API Guide

FATE provides API documents in doc-api.

Development Guide

To develop your federated learning algorithms using FATE, please refer to FATE Development Guide.

Getting Involved

Join our mailing list, FATE-FedAI Group IO. You can ask questions and take part in development discussions there.
Check out the FAQ for any questions you may have.
Please report bugs by submitting issues.
Submit contributions using pull requests.
Bilibili: @FATEFedAI
Twitter: @FATEFedAI

License

Apache License 2.0

Comments

Discussions

This issue is opened specifically for discussion. Here you can give feedback on problems, such as installation problems you hit when installing FATE, or on new features you consider important. Finally, please ask your questions in English. Thanks. dylanfan

opened by dylan-fan 23
ERROR: <_Rendezvous of RPC that terminated with: status = StatusCode.INTERNAL

I am using the cluster version of FATE on CentOS 7. I have already passed the toy_example test and the min test in both fast and normal mode. When I used quick_run.py to run the secureboost example, the train task completed successfully but the predict task failed.
There was no error.log in the eggroll logs, but there was one in logs/fate_flow. You can see the ERROR.log and the fate_flow ERROR.log in the attachments. I have actually seen the same error message when running the toy_example test and the min test, and it seems that once this error occurs, the system reports the same error whatever test I run. I have checked netstat and the status of the services, and neither reports an error. I have also restarted the services several times, but it does not help. Can anyone help me locate or solve this problem? Thank you so much.

The error logs are as follows: ERROR.log

fate_flow ERROR.log: fate_flow_ERROR.log

The netstat output on both sides is as follows:

opened by vivianwky 14
Error "parameter to mergefrom() must be instance of same class" when launching a federated task

After deploying FATE on a server with an ARM chip, running run_toy_example.py produces the following error: Federated schedule error, Parameter to mergefrom() must be instance of same class: expected basic_meta_pb2.Endpoint got basic_meta_pb2.Endpoint. for field Topic.callback. Why does this error occur, and how should it be handled? Thanks.

opened by ykcirh 12
About auto-deploy.sh: grpc installation

Is your feature request related to a problem? Please describe.

When auto-deploy.sh reaches the step that compiles grpc, it can enter "third_party/protobuf", but the directory contains no files, so the following commands fail:

    cd third_party/protobuf
    ./autogen.sh
    ./configure --prefix=$dir/third_party

The output of these three commands is shown below:

[INSTALL] Installing grpc protoc plugins
[INSTALL] Installing root certificates
third_party/protobufdata/projects/fate/storage-service-cxx/third_party/grpc# cd
d_party/protobuf# ./autogen.shcts/fate/storage-service-cxx/third_party/grpc/third
bash: ./autogen.sh: No such file or directory
d_party/protobuf# ./configure --prefix=/data/projects/fate/third_party/grpc/third
bash: ./configure: No such file or directory
d_party/protobuf# maketa/projects/fate/storage-service-cxx/third_party/grpc/third
make: *** No targets specified and no makefile found. Stop.
d_party/protobuf# make checkjects/fate/storage-service-cxx/third_party/grpc/third
make: *** No rule to make target 'check'. Stop.
d_party/protobuf# make installcts/fate/storage-service-cxx/third_party/grpc/third
make: *** No rule to make target 'install'. Stop.
d_party/protobuf# cd /data/projects/fate/storage-service-cxx/ird_party/grpc/third
cts/fate/third_party/* ./third_party/e/storage-service-cxx# rsync -a /data/proje
(base) root@lxy-PA:/data/projects/fate/storage-service-cxx#

Describe the solution you'd like

Is there something wrong with the auto-deploy.sh file? Thanks.

opened by YangjieZhou 12
A question about neural networks in FATE v1.9.0
While testing a neural network with the demo given in the FATE v1.9.0 reference documentation, https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/pipeline_tutorial_homo_nn.ipynb, I found an error.

1. The first run reported that the tensorflow module was missing. After I installed it with pip install tensorflow and ran the code again, the following error occurred (#4377). Tracing into /pipeline/component/nn/models/sequantial.py, I found the code in question:

    def add(self, layer):
        if _TF_KERAS_VALID and isinstance(layer, base_layer.Layer):
            layer_type = "keras"
        elif isinstance(layer, dict):
            layer_type = "nn"
        elif hasattr(layer, "__module__") and getattr(layer, "__module__").startswith("torch.nn.modules"):
            layer_type = "pytorch"
        else:
            raise ValueError("Layer type {} not support yet".format(type(layer)))

isinstance(layer, base_layer.Layer) returns False, so the error above is raised. Changing from tensorflow.python.keras.engine import base_layer to from keras.engine import base_layer in sequantial.py resolves the problem. So is the tensorflow version I installed wrong, or does the documentation need to be updated? The modified document is here: #4380
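The behaviour described above, where isinstance returns False even though the layer looks like a Keras layer, is what Python does whenever two packages each define their own base class. The following is a minimal sketch of that mechanism only. It does not import TensorFlow or Keras, and the two module names are hypothetical stand-ins for the two import paths mentioned in the report.

```python
import types


def make_module(name):
    """Build a throwaway module that defines its own `Layer` base class."""
    module = types.ModuleType(name)
    exec("class Layer:\n    pass\n", module.__dict__)
    return module


# Hypothetical stand-ins for the two import paths named in the report.
tf_keras_engine = make_module("tensorflow_keras_base_layer")
standalone_keras_engine = make_module("keras_base_layer")


class Dense(standalone_keras_engine.Layer):
    """A layer built on the standalone package's base class."""


layer = Dense()

# Same class name, different class objects: only one check succeeds.
matches_tf_base = isinstance(layer, tf_keras_engine.Layer)
matches_keras_base = isinstance(layer, standalone_keras_engine.Layer)
print(matches_tf_base, matches_keras_base)
```

If this is what happens in the reported environment, the check only passes when the base class is imported from the same package that built the layer object, which would be consistent with the import change the reporter describes.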
opened by oceanqdu 11
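homo-convert error

Running flow model homo-convert -c examples/model/homo_convert_model.json as in the provided demo produces the following error. The same error occurs when submitting the job through pipeline and calling pipeline.model_convert.convert().

FATE version 1.8.0, CentOS 7. What is the cause?
bug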

opened by oceanqdu 10
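A question about horizontal federated learning in FATE

Scenario: parties A and B train a horizontal federated learning model with FATE's HomoSecureBoost. Once the model is obtained, party A wants to use it for prediction, so only A needs to upload prediction data. In practice, the prediction task could not be executed. Questions: in horizontal federated learning, is the trained model stored in a distributed way across the two parties? If one party wants to use the model for prediction, can the other party act only as a computing party? I hope you can answer these questions.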

opened by oceanqdu 10
[Transfer Issue] Problem testing homo_nn with the docker-compose version

What deployment mode are you using? docker-compose;

What KubeFATE and FATE version are you using? kubefate-docker-compose-v1.8.0

While testing homo_nn with docker-compose, following the reference document https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/pipeline_tutorial_homo_nn.ipynb, I hit an error:

ValueError: Layer type <class 'keras.layers.core.Dense'> not support yet

When deploying KubeFATE I had already modified my parties.conf file; its contents are as follows:

How should I solve this problem?

opened by oceanqdu 9
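Prediction for horizontal federated learning through pipeline

https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/pipeline_tutorial_hetero_sbt.ipynb As this tutorial shows, pipeline can currently only run the prediction process for vertical federated learning. In some horizontal federated learning scenarios, not every party needs to provide data during prediction, as in the scenario assumed in #4314. I hope pipeline can support prediction by a single party in horizontal federated learning.
enhancement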

opened by oceanqdu 9

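Custom component fails to run

New Algorithm

I built a custom component following the steps (meta/param/core code), but at runtime the job hangs and does not run. Checking the logs shows that the task status list has length 0 when fate_flow calculates the job status. How can this be solved?

Logs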

[INFO] [2022-05-09 06:51:16,365] [202205090318518103400] [32:140455706732288] - [dag_scheduler.schedule_running_job] [line:404]: scheduling running job
[INFO] [2022-05-09 06:51:16,365] [202205090318518103400] [32:140455706732288] - [task_scheduler.schedule] [line:35]: scheduling job tasks
[INFO] [2022-05-09 06:51:16,367] [202205090318518103400] [32:140455706732288] - [task_scheduler.schedule] [line:93]: finish scheduling job tasks
[ERROR] [2022-05-09 06:51:16,367] [202205090318518103400] [32:140455706732288] - [dag_scheduler.run_do] [line:224]: calculate job status failed, all task status: dict_values([])
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/scheduler/dag_scheduler.py", line 222, in run_do
    self.schedule_running_job(job=job)
  File "/data/projects/fate/fateflow/python/fate_flow/scheduler/dag_scheduler.py", line 411, in schedule_running_job
    new_job_status = cls.calculate_job_status(task_scheduling_status_code=task_scheduling_status_code, tasks_status=tasks_status.values())
  File "/data/projects/fate/fateflow/python/fate_flow/scheduler/dag_scheduler.py", line 564, in calculate_job_status
    raise Exception("calculate job status failed, all task status: {}".format(tasks_status))
Exception: calculate job status failed, all task status: dict_values([])
[ERROR] [2022-05-09 06:51:16,367] [202205090318518103400] [32:140455706732288] - [dag_scheduler.run_do] [line:225]: schedule job failed
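The log shows the job status being computed from an empty collection, dict_values([]). As a rough illustration of how such a message can arise, here is a hypothetical, simplified roll-up function. It is not the FATE Flow implementation, and its status names and rules are assumptions made for the sketch; only the exception text is taken from the log.

```python
def calculate_job_status(tasks_status):
    """Roll task statuses up into one job status (hypothetical, simplified rules)."""
    statuses = list(tasks_status)
    if not statuses:
        # No task was registered for the job, so there is nothing to roll up.
        raise Exception(
            "calculate job status failed, all task status: {}".format(tasks_status)
        )
    if "failed" in statuses:
        return "failed"
    if all(status == "success" for status in statuses):
        return "success"
    return "running"


# Assumed situation: the job exists but no task was ever created for it.
registered_tasks = {}
try:
    calculate_job_status(registered_tasks.values())
except Exception as error:
    message = str(error)
    print(message)
```

Under this reading, the place to look is why no task was created for the job, for example whether the custom component was registered so that the scheduler can turn it into a task, rather than the status calculation itself.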

federatedml

opened by turkeymz 9

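Cannot log in to fateboard 1.8

New Algorithm

After updating to FATE V1.8, I cannot log in to fateboard

Short Description

After updating to FATE 1.8, I cannot log in to fateboard.

With the default settings (nothing configured), logging in with admin/admin fails.
With admin/admin configured in the yaml (the configuration file inside the container was modified successfully), logging in still fails.

Configuration and screenshots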

server.port=8080
fateflow.url=http://fateflow:9380
fateflow.http_app_key=
fateflow.http_secret_key=
spring.http.encoding.charset=UTF-8
spring.http.encoding.enabled=true
server.tomcat.uri-encoding=UTF-8
fateboard.front_end.cors=false
fateboard.front_end.url=http://localhost:8028
server.tomcat.max-threads=1000
server.tomcat.max-connections=20000
spring.servlet.multipart.max-file-size=10MB
spring.servlet.multipart.max-request-size=100MB
server.compression.enabled=true
server.compression.mime-types=application/json,application/xml,text/html,text/xml,text/plain
server.board.login.username=admin
server.board.login.password=admin
server.servlet.session.timeout=4h
server.servlet.session.cookie.max-age=4h
management.endpoints.web.exposure.exclude=*
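When checking a report like this, one quick step is to confirm which login values the running container actually holds. The sketch below parses key=value lines of the kind shown above and reports login keys that are missing or empty. The parser and the SAMPLE text are illustrative assumptions; it is not part of fateboard and does not explain the login failure by itself.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# A few lines copied from the configuration in the report.
SAMPLE = """\
server.port=8080
fateflow.url=http://fateflow:9380
fateflow.http_app_key=
server.board.login.username=admin
server.board.login.password=admin
"""

props = parse_properties(SAMPLE)
login_keys = ("server.board.login.username", "server.board.login.password")
missing = [key for key in login_keys if not props.get(key)]
print(missing)
```

In practice the text would be read from the properties file inside the container rather than from an inline string.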

fateerror

opened by turkeymz 9

How to only use partial data in each epoch

Hi,

I noticed that every participant trains its model on the whole local dataset in every online_epoch. For example, suppose A/B has 3200 samples; with batch_size=32, A/B uses 100 mini-batches in a single local training round. For the sake of aggregation convergence and security, it would be better to use only a fraction of the mini-batches, e.g. 0.1*100=10, for local training in each epoch. Does FATE currently support configuring this hyperparameter?
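To make the request concrete, the sketch below picks a random fraction of the mini-batch indices for one local epoch, using the numbers from the example (3200 samples, batch size 32, fraction 0.1). The function name and its parameters are hypothetical and are not a FATE option; whether FATE exposes such a setting is exactly the open question.

```python
import math
import random


def sample_batch_indices(num_samples, batch_size, fraction, seed=None):
    """Return the sorted indices of the mini-batches to train on this epoch."""
    num_batches = math.ceil(num_samples / batch_size)
    # Always keep at least one batch so the epoch is never empty.
    keep = max(1, round(fraction * num_batches))
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_batches), keep))


# 3200 samples with batch_size=32 gives 100 mini-batches; keep 10% of them.
chosen = sample_batch_indices(num_samples=3200, batch_size=32, fraction=0.1, seed=0)
print(len(chosen))
```

Drawing a fresh sample each epoch (a different seed, or no seed) would let the model still see all the data over several epochs.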

regards, Timo

opened by Timo9Madrid7 5
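Training hangs after adding encryption parameters to horizontal logistic regression
FATE version used: 1.7.2

Deployed on three nodes: A, B and C

Problem description:

Testing the horizontal logistic regression script https://github.com/FederatedAI/FATE/blob/master/examples/pipeline/homo_logistic_regression/pipeline-homo-lr-train-eval.py, training completes in about two minutes. After adding the encryption parameter ("encrypt_param": {"method": "Paillier", "key_length": 512}) to param, training hangs after the first round completes and never starts the second iteration. So, two questions:

Without "encrypt_param", is FATE's horizontal federation secure? Is "encrypt_param" required?

Why does training hang? Is it caused by the encryption process?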
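For readers unfamiliar with the "Paillier" method named in encrypt_param, the toy sketch below shows the property that makes it useful for aggregation: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. It uses tiny hard-coded primes, so it is insecure and for illustration only. It is not FATE's implementation and does not by itself explain the hang. It requires Python 3.8+ for the modular inverse via pow.

```python
import math
import secrets


def generate_keys(p, q):
    """Build a toy Paillier key pair from two primes, using generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under the public modulus n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext with the private key (lam, mu)."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


# Tiny primes chosen only for the demonstration.
n, private_key = generate_keys(293, 433)
total = decrypt(n, private_key, add_encrypted(n, encrypt(n, 17), encrypt(n, 25)))
print(total)
```

Every encryption and decryption involves modular exponentiation modulo n squared, so encrypted training is expected to be slower than plaintext training. Whether that accounts for the behaviour reported here is not established by this sketch.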
opened by kaiwang0112006 5
Pipeline Tutorial with HeteroSecureBoost: error when training the model

With kubefate 1.9.0, training the model from the Pipeline Tutorial with HeteroSecureBoost on the client of party 10000 reports an error

root@87e91f7ded6b:/data/projects/fate# python pipeline_heterosecureboost_train.py
2022-12-20 17:45:44.367 | ERROR | __main__:<module>:34 - An error has been caught in function '<module>', process 'MainProcess' (689), thread 'MainThread' (140540335593280):
Traceback (most recent call last):

  File "/usr/local/lib/python3.8/site-packages/pipeline/utils/invoker/job_submitter.py", line 46, in submit_job
    raise ValueError(f"retcode err, callback result is {result}")

ValueError: retcode err, callback result is {'retcode': 100, 'retmsg': 'Connection refused, Please check if the fate flow service is started'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "pipeline_heterosecureboost_train.py", line 34, in <module>
    pipeline.fit()
    └ <function PipeLine.fit at 0x7fd21765c940>
    └ <pipeline.backend.pipeline.PipeLine object at 0x7fd218b3f820>

  File "/usr/local/lib/python3.8/site-packages/pipeline/backend/pipeline.py", line 568, in fit
    self._train_job_id, detail_info = self._job_invoker.submit_job(self._train_dsl, training_conf, callback_func)
    └ None
    └ {'dsl_version': 2, 'initiator': {'role': 'guest', 'party_id': 9999}, 'role': {'guest': [9999], 'host': [10000]}, 'job_paramet...
    └ {'components': {'reader_0': {'module': 'Reader', 'output': {'data': ['data']}, 'provider': 'fate_flow'}, 'data_transform_0': ...
    └ <pipeline.backend.pipeline.PipeLine object at 0x7fd218b3f820>
    └ <function JobInvoker.submit_job at 0x7fd2176d2b80>
    └ <pipeline.utils.invoker.job_submitter.JobInvoker object at 0x7fd218b3f370>
    └ <pipeline.backend.pipeline.PipeLine object at 0x7fd218b3f820>
    └ None
    └ <pipeline.backend.pipeline.PipeLine object at 0x7fd218b3f820>
  File "/usr/local/lib/python3.8/site-packages/pipeline/utils/invoker/job_submitter.py", line 54, in submit_job
    raise ValueError("job submit failed, err msg: {}".format(result))
    └ {'retcode': 100, 'retmsg': 'Connection refused, Please check if the fate flow service is started'}

ValueError: job submit failed, err msg: {'retcode': 100, 'retmsg': 'Connection refused, Please check if the fate flow service is started'}
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/pipeline/utils/invoker/job_submitter.py", line 46, in submit_job
    raise ValueError(f"retcode err, callback result is {result}")
ValueError: retcode err, callback result is {'retcode': 100, 'retmsg': 'Connection refused, Please check if the fate flow service is started'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pipeline_heterosecureboost_train.py", line 34, in <module>
    pipeline.fit()
  File "/usr/local/lib/python3.8/site-packages/loguru/_logger.py", line 1226, in catch_wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pipeline/backend/pipeline.py", line 568, in fit
    self._train_job_id, detail_info = self._job_invoker.submit_job(self._train_dsl, training_conf, callback_func)
  File "/usr/local/lib/python3.8/site-packages/pipeline/utils/invoker/job_submitter.py", line 54, in submit_job
    raise ValueError("job submit failed, err msg: {}".format(result))
ValueError: job submit failed, err msg: {'retcode': 100, 'retmsg': 'Connection refused, Please check if the fate flow service is started'}
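The message "Connection refused, Please check if the fate flow service is started" suggests that a first check is whether the FATE Flow port accepts TCP connections from the client container. A minimal sketch using only the standard library follows. The helper name is hypothetical, and the host and port to probe depend on the deployment; 9380 is the HTTP port that appears in the configurations quoted elsewhere on this page, and the host name is an assumption.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

For example, is_port_open("fateflow", 9380) could be run from inside the client container. A False result points to the service not running or not being reachable over the network, rather than to a problem in the pipeline script.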

opened by desertfoxfj 1

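Deploying standalone on two hosts and having them interact

The 1.9.0 release notes mention that two different hosts can also federate by each deploying the standalone version. How exactly is this done? I previously saw an answer saying that an additional communication service (pulsar or rabbitmq) must be deployed and conf/service_conf.yaml then modified. Could you check whether the following steps are right?
1. Deploy FATE 1.9.0 standalone successfully on both servers.
2. Deploy the pulsar communication service on both hosts. (Is it acceptable to deploy pulsar in single-node (standalone) mode on each host, and can the deployment follow the official documentation at https://fate.readthedocs.io/en/latest/zh/deploy/cluster-deploy/doc/fate_on_spark/common/pulsar_deployment_guide/ ?)
3. Modify conf/service_conf.yaml as follows (please check whether the yaml file has any problems):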

use_registry: false
use_deserialize_safe_module: false
dependent_distribution: false
encrypt_password: false
encrypt_module: fate_arch.common.encrypt_utils#pwdecrypt
private_key:
party_id:
hook_module:
  client_authentication: fate_flow.hook.flow.client_authentication
  site_authentication: fate_flow.hook.flow.site_authentication
  permission: fate_flow.hook.flow.permission
hook_server_name:
authentication:
  client:
    switch: false
    http_app_key:
    http_secret_key:
  site:
    switch: false
permission:
  switch: false
  component: false
  dataset: false
fateflow:
   proxy:
     name: fateflow
     host: xx  # local machine IP
     http_port: 9380
     grpc_port: 9360
database:
  name: fate_flow
  user: fate
  passwd: fate
  host: 127.0.0.1
  port: 3306
  max_connections: 100
  stale_timeout: 30
zookeeper:
  hosts:
    - 127.0.0.1:2181
  use_acl: false
  user: fate
  password: fate

default_engines:
  computing: standalone
  federation: pulsar
  storage: standalone
fate_on_standalone:
  standalone:
    cores_per_node: 20
    nodes: 1
fate_on_eggroll:
  clustermanager:
    cores_per_node: 16
    nodes: 1
  rollsite:
    host: 127.0.0.1
    port: 9370
fate_on_spark:
  spark:
    home:
    cores_per_node: 20
    nodes: 2
  linkis_spark:
    cores_per_node: 20
    nodes: 2
    host: 127.0.0.1
    port: 9001
    token_code: MLSS
    python_path: /data/projects/fate/python
  hive:
    host: 127.0.0.1
    port: 10000
    auth_mechanism:
    username:
    password:
  linkis_hive:
    host: 127.0.0.1
    port: 9001
  hdfs:
    name_node: hdfs://fate-cluster
    path_prefix:
  pulsar:
    host: 127.0.0.1
    port: 6650
    mng_port: 8080
    cluster: standalone
    tenant: fl-tenant
    topic_ttl: 30
    # default conf/pulsar_route_table.yaml
    route_table:
    mode: replication
    max_message_size: 1048576
  nginx:
    host: 127.0.0.1
    http_port: 9300
    grpc_port: 9310
# external services
fateboard:
  host: 127.0.0.1
  port: 8080

enable_model_store: false
model_store_address:

  storage: tencent_cos
  Region:
  SecretId:
  SecretKey:
  Bucket:

servings:
  hosts:
    - 127.0.0.1:8000
fatemanager:
  host: 127.0.0.1
  port: 8001
  federatedId: 0

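Assuming both hosts complete the steps above, how do they then interact? How does each specify its party ID? Which one is guest and which is host? Does the content of conf/pulsar_route_table.yaml need to be modified? Is it enough to change only the host and guest IPs in that file, or are other operations needed?

If the steps are wrong, could you give the correct procedure, or a modified conf/service_conf.yaml demo for reference? I have not been able to find any documentation or tutorial on interaction between standalone deployments, and as a beginner I find this hard to solve. I would be grateful if you could take the time to answer. Thank you very much!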

opened by oceanqdu 0

Releases(v1.10.0)

v1.10.0(Dec 29, 2022)
By downloading, installing or using the software, you accept and agree to be bound by all of the terms and conditions of the LICENSE and DISCLAIMER.

Major Features and Improvements

FederatedML

Renewed Homo NN: PyTorch-based, support flexible model building:

Support user access to complex self-defined PyTorch models or ready-to-use PyTorch models such as DeepFM, ResNet, BERT, Yolo

Support various data set types, may build data set based on PyTorch Dataset

User-defined training loss

User-defined training process: user-defined aggregation algorithm for client and server

Provide API for developing Aggregator

Upgraded Hetero NN: support flexible model building and various data set types:

more flexible pytorch top/bottom model customization; provide access to industry approved PyTorch models

User-defined training loss

Support various data set types, may build data set based on PyTorch Dataset

Renewed Homo-federated framework with support for all current homo models, including Homo NN, Homo LR, Homo SecureBoost, Homo Feature Binning, and Hetero KMeans. This provides a smoother algorithm customization and development experience

Semi-Supervised Algorithm Positive Unlabeled Learning

Hetero LR & Hetero SecureBoost now support Intel IPCL

Intersection supports Multi-host Elliptic-curve-based PSI

Intersection may compute Multi-host Secure PSI Cardinality

Hetero Feature Optimal Binning now records & shows Gini/KS/Chi-Square metrics

Host may load Hetero Binning model with WOE score through Model Loader

Hetero Feature Binning supports binning by user-provided split points

Sampler supports weighted sampling by instance weight

Fate-Client

Flow CLI adds min-test options

Pipeline adds data-bind API, useful for local development

Pipeline may reconfigure role/model_id/model_version, switching party_id for prediction task

v1.8.1(Dec 9, 2022)

Major Features and Improvements

EggRoll

Support EggRoll v2.4.7

v1.7.3(Dec 9, 2022)

Major Features and Improvements

EggRoll

Support EggRoll v2.4.7

v1.9.2(Dec 8, 2022)

Major Features and Improvements

EggRoll

Support EggRoll v2.4.7

v1.9.1(Nov 23, 2022)

Major Features and Improvements

Bug-Fix

Fix cipher compression with large Hessian value for HeteroSecureBoost

Fix tweedie-loss calculation in HeteroSecureBoost

Fix Intersection summary when left-joining data with match_id

Fix event/non_event statistic for WOE computation in HeteroFeatureBinning

Fix default sid name display for data uploaded with meta

v1.9.0(Aug 31, 2022)

Major Features and Improvements

FederatedML

Add elliptic curve based PSI algorithm, which allows 128-bit secure-level intersection of billion samples, 20x faster than RSA protocol under the same security level

Support accurate intersection cardinality calculation

Support for multi-column ID data; user may specify id column for PSI intersection and subsequent modeling usage

Hetero NN supports torch backend and supports complex layers such as LSTM

Add CoAE label reinforcement mechanism for vertical federated neural network

Hetero NN supports multi-host modeling scenarios

HeteroSecureBoost supports merging sub-models from all parties and exporting the merged model into lightgbm or PMML format

HeteroLR and HeteroSSHELR support merging sub-models from all parties and exporting the merged model into sklearn or PMML format

HeteroFeatureSelection supports anonymous feature selection

Label Encoder adds automatic label type inference

10x faster local VIF computation in HeteroPearson, with added support for computing local VIF on linearly dependent columns

Optimized feature engineering column processing logic

HeteroFeatureBinning supports calculation of IV and WOE values during prediction

Renewed feature anonymous generation logic

FATE-ARCH

Support python3.8+

Support Spark 3x

Renewed Federation module, RabbitMQ and Pulsar support client transmission mode

Support Standalone, Spark, EggRoll heterogeneous computing engine interconnection

Fate-Client

PipeLine adds timeout retry mechanism

Pipeline's get_output_data API now may return component output data in DataFrame-typed format

v1.8.0(Apr 18, 2022)

Major Features and Improvements

FederatedML

Add non-coordinated-version Hetero Linear Regression, based on integrated Hetero GLM framework, with mixed protocol of HE and SPDZ

Homo LR support one-vs-rest

Add SecureBoost-MO algorithm to speed up multi-class classification of Hetero & Homo SecureBoost, 1.5x-5x faster

Optimize Hetero SecureBoost predict transmission data size, reducing bandwidth consumption by 75% if the tree's max depth is small

Speed up DH Intersection implementation, 30%+ faster

Optimized Quantile Binning gk-summary structure & split point query, 20%+ faster, less memory cost

Support weighted training in non-coordinated Hetero Logistic Regression & Linear Regression

Merge Hetero FastSecureBoost into Hetero SecureBoost as a boosting strategy option

Fate-ARCH

Adjustable task_cores for standalone FATE

Enable Eggroll option to make computing output "IN_MEMORY" by default

Fate-Test

Include Paillier encryption performance evaluation

Include SPDZ performance evaluation

Optimized testsuite printout

Include examples data upload and mnist download

Provide pipeline to dsl convert tools

Bug-Fix

Fix bug for SPDZ when using default q_field

Fix multiple get problem of SPDZ

Fix bugs of recursive-query homo feature binning

Fix homo_nn's model aggregation problem

Fix bug for hetero feature selection when using federated filter but some party's feature is empty.

v1.5.3(Mar 14, 2022)

Major Features and Improvements

FederatedML

New batch strategy in coordinated Hetero LR: support masked batch data and batch shuffle

Iterative Affine is disabled

Eggroll

Support Eggroll v2.2.3, upgrade com.h2database:h2 to version 2.1.210, com.google.protobuf:protobuf-java to version 3.16.1, spring to version 5.1.20.RELEASE

v1.7.2(Mar 1, 2022)

Major Features and Improvements

FederatedML

New batch strategy in coordinated Hetero LR: support masked batch data and batch shuffle

Model inference protection enhancement for Hetero SecureBoost with FED-EINI algorithm

Hetero SecureBoost supports split feature importance on host side, disables gain feature importance

Offline SBT Feature transform component

Bug-Fix

Fixed Bug for HeteroPearson with changing default q_field value for spdz

Fix Data Transform's schema label name setting problem when with_label is False

Add testing examples for new algorithm features, and delete deprecated params in algorithm examples.

FATE-ARCH

Support the loading of custom password encryption modules through plug-ins

Separate the base connection address of the data storage table from the data table information, while remaining compatible with historical versions

v1.7.1.1(Jan 27, 2022)

Major Features and Improvements

Deploy

upgrade mysql to version 8.0.28

Eggroll

Support Eggroll v2.4.3, upgrade com.h2database:h2 to version 2.1.210, com.google.protobuf:protobuf-java to version 3.16.1

v1.7.1(Jan 12, 2022)

Major Features and Improvements

FederatedML

Iterative Affine is disabled

Speed up Hetero Feature Selection, 100x+ faster when feature dimension is high

Speed up OneHot, 60x+ faster when feature dimension is high

Data Statistics supports missing value, with improved efficiency

Fix bug of quantile binning: may lose data when partitions hold too many instances

Fix reconstruction reuse problem of SPDZ

Fix Host's ineffective decay rate of Homo Logistic Regression

Improved strategy for handling missing values when converting Homo SecureBoost using homo model convertor

Improved presentation of Evaluation's confusion matrix

FATE-Client

Add Source Provider attribute to Pipeline components

Eggroll

Support Eggroll v2.4.2, fixed Log4j security bug

v1.7.0(Nov 24, 2021)

Major Features and Improvements

FATE-ARCH

Support EggRoll 2.4.0

Support Spark-Local Computing Engine

Support Hive Storage

Support LocalFS Storage for Spark-Local Computing Engine

Optimizing the API interface for Storage session and table

Simplified the API interface for Session, remove backend and workmode parameters

Heterogeneous Engine Support: Federation between Spark-Local and Spark-Cluster

Computing Engine, Storage Engine, Federation Engine are set in conf/service_conf.yaml when FATE is deployed

FederatedML

Optimized Hetero-SecureBoost: with gradient packing, cipher compressing, and sparse point statistics optimization, 4x+ faster

Homo-SecureBoost supports memory-based histogram computation for more efficient tree building, 5x+ faster

Optimized RSA Intersect with CRT optimization, 3x+ faster

New two-party Hetero Logistic Regression Algorithm: mixed protocol of HE and MPC, without a trusted third party

Support data with match-id, separating match id and sample id

New DH Intersect based on PH Key-exchange protocol

Intersect supports cardinality estimation

Intersect adds an optional preprocessing step

RSA and DH Intersect support cache

New Feature Imputation module: can apply arbitrary imputation method to each column

New Label Transform module: transform categorical label values

Homo-LR, Homo-SecureBoost, Homo-NN can now convert models into sklearn, lightgbm, torch & tf-keras frameworks

Hetero Feature Binning supports multi-class task, higher efficiency with label packing

Hetero Feature Selection support multi-class iv filter

Secure Information Retrieval supports multi-column retrieval

Major training algorithms support warm-start and checkpoint : Homo & Hetero LR, Homo & Hetero-SecureBoost, Homo & Hetero NN

Optimized Paillier addition operation, several times faster; Hetero-SecureBoost with Paillier speeds up 2x+

Fate-Client

Pipeline supports uploading match id functionality

Pipeline supports homo model conversion

Pipeline supports model push to FATE-Serving

Pipeline supports running jobs with specified FATE version

FATE-Test

Integrate FederatedML unittest

Support for uploading image data

Big data generation using storage interface, optimized generation logic

Support for historical data comparison

cache_deps and model_loader_deps support

Run DSL Testsuite with specified FATE version

v1.6.1(Sep 14, 2021)

Major Features and Improvements

FederatedML

Support single party prediction

SIR support non-ascii id

Selection support local iv filter

Adjustable Paillier key length for Hetero LR

Binning support iv calculation on categorical features

Hetero LR one vs rest support evaluation during training

FATE-Flow:

Support MySQL storage engine

Added service registry interface

Added service query interface

Support FATE on WeDataSphere mode

Add lock when writing model_local_cache

Register the model download urls to zookeeper

Bug-Fix:

Fix error for deploying module with lack of partial upstream modules in multi-input cases

Fix error for deploying module with multiple output, like data-statistics

Fix the limitation that job id length could not exceed 25

Fix error when loss function of Hetero SecureBoost set to log-cosh

Fix error of predicted labels being set to string type when Hetero SecureBoost predicts

Fix error for HeteroLR without intercept

Fix quantile error of Data Statistics with specified columns

Fix string parsing error of OneHot with specified columns

Some parameters can now take 0 or 1 integer values when valid range is [0, 1]

v1.5.2(Jul 29, 2021)

Major Features and Improvements

FederatedML

SIR supports non-ASCII ids

Selection supports local iv filter

Adjustable Paillier key length for Hetero LR

Binning supports iv calculation on categorical features

Hetero LR one-vs-rest supports evaluation during training

Fate-Flow

Read data from mysql with ‘table bind’ command to map source table to FATE table

FATE cluster push model for one-to-multiple FATE Serving clusters in one party

System Architecture

More efficient ‘sample’ api

Bug Fixes

Fix error for deploying module with lack of partial upstream modules in multi-input cases

Fix the limitation that job id length could not exceed 25

Fix error when loss function of Hetero SecureBoost set to log-cosh

Fix error of predicted labels being set to string type when Hetero SecureBoost predicts

Fix error for HeteroLR without intercept

Fix torch import error

Fix quantile error of Data Statistics with specified columns

Fix string parsing error of OneHot with specified columns

Some parameters can now take 0 or 1 integer values when valid range is [0, 1]

v1.6.0(Mar 31, 2021)

Major Features and Improvements

FederatedML

Hetero SecureBoost: more efficient computation with GOSS, histogram subtraction, cipher compression, 2-4x faster

Hetero GLM: improved communication efficiency, adjustable floating point precision, 2x faster

Hetero NN: adjustable floating point precision, support SelectiveBackPropagation and dropOut on interaction layer, 2x faster

Hetero Feature Binning: improved algorithm with cipher compression, 2x faster

Intersect: add split calculation option and adjustable random base fraction, 30% faster

Homo NN: restructure torch backend and enhanced grammar; train and predict with raw image data

Intersect supports SM3 hashing method

Hetero SecureBoost: L1 penalty & adjustable min_child_weight to prevent overfitting

NEW SecureBoost Transformer: feature engineering module that encodes instances with leaf nodes from SecureBoost model

Hetero Pearson: support local VIF computation

Hetero Feature Selection: support selection based on VIF and Pearson

NEW Homo Feature Binning: support virtual/recursive binning strategy

NEW Sample Weight: set sample weights based on label or from feature column, Hetero GLM & Hetero SecureBoost support weighted training

NEW Data Transformer: case-insensitive on data schema

Local Baseline supports prediction task

Cross Validation: output fold split history

Evaluation: add multi-result-unfold option which unfolds multi-classification evaluation result to several binary evaluation results in a one-vs-rest manner
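
The one-vs-rest unfolding described in the Evaluation item can be pictured as turning one multi-class result into one binary result per class. A minimal sketch, with hypothetical function names and hard label predictions rather than FATE's actual evaluation output format:

```python
def unfold_one_vs_rest(y_true, y_pred):
    """Map each class to (binary_true, binary_pred), where 1 means 'is this class'."""
    classes = sorted(set(y_true) | set(y_pred))
    return {
        c: ([int(y == c) for y in y_true], [int(p == c) for p in y_pred])
        for c in classes
    }


def precision(binary_true, binary_pred):
    """Precision of one unfolded binary result."""
    true_pos = sum(1 for t, p in zip(binary_true, binary_pred) if t == 1 and p == 1)
    predicted_pos = sum(binary_pred)
    return true_pos / predicted_pos if predicted_pos else 0.0
```

Each unfolded pair can then be scored with ordinary binary metrics, one report per class.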

System Architecture

Added local file system directory path virtual storage engine to support image input data

Added Pulsar message queue as a cross-site transmission engine; it can be used with the Spark computing engine, and an Exchange role can be added to support the star networking mode

FATE-Test

Add Benchmark performance for efficiency comparison; add mock data generation tool; support metrics comparison between training and validation sets

FATE-Flow unittest for REST/CLI/SDK API and training-prediction workflow

v1.5.1(Feb 10, 2021)

Major Features and Improvements

FederatedML

Add Feldman Verifiable Secret Sharing protocol (contributed)

Add Feldman Verifiable Sum Module (contributed)

Updated FATE-Client and FATE-Test for new FATE-Flow

Upgraded early stopping strategy: record best model for each metric
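
For the Feldman Verifiable Secret Sharing protocol listed above, the core idea is Shamir sharing plus public commitments that let each party check its own share. The following is a textbook sketch over a toy group (p = 2039, q = 1019, g = 4 are illustrative assumptions), not the contributed FATE module.

```python
import random

P, Q, G = 2039, 1019, 4  # toy safe-prime group: P = 2 * Q + 1, G has order Q


def deal(secret, threshold, n_parties):
    """Split secret into n_parties shares; any `threshold` of them reconstruct it."""
    coeffs = [secret % Q] + [random.randrange(Q) for _ in range(threshold - 1)]
    shares = {
        i: sum(a * pow(i, j, Q) for j, a in enumerate(coeffs)) % Q
        for i in range(1, n_parties + 1)
    }
    commitments = [pow(G, a, P) for a in coeffs]  # published by the dealer
    return shares, commitments


def verify(i, share, commitments):
    """Check G^share == prod_j commitments[j]^(i^j) mod P."""
    expected = 1
    for j, c in enumerate(commitments):
        expected = expected * pow(c, pow(i, j, Q), P) % P
    return pow(G, share, P) == expected


def reconstruct(points):
    """Lagrange interpolation at x = 0 over the field of order Q."""
    secret = 0
    for i, s_i in points.items():
        num, den = 1, 1
        for k in points:
            if k != i:
                num = num * (-k) % Q
                den = den * (i - k) % Q
        secret = (secret + s_i * num * pow(den, -1, Q)) % Q
    return secret
```

A party that receives a tampered share detects it locally through `verify`, without learning anything about the other shares.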

Fate-Flow

Optimize the model center, reconstruct publishing model, support deploy, load, bind, migrate operations, and add new interfaces such as model info

Improve identity authentication and resource authorization; support party identity verification and the authorization of roles and components

Optimize and fix resource manager, add task_cores job parameters to adapt to different computing engines

Eggroll

In one-way communication mode, add party identity authentication function, which needs to be used with FATE-Cloud

Deploy

Support upgrading from 1.5.0 to 1.5.1 while retaining data

Bug Fixes

Fix predict-cache in SecureBoost validation

Fix job clean CLI

v1.5.0(Nov 13, 2020)

Release 1.5.0 (LTS)

Major Features and Improvements

FederatedML

Refactored Hetero FTL with optional communication-efficiency mechanism, with 4x time efficiency improvement

Hetero SecureBoost supports complete secure mode

Hetero SecureBoost now can reduce time consumption over highly sparse data by using sparse matrix computation on histogram aggregations.

Hetero SecureBoost optimization: the number of communication rounds in prediction is reduced to at most the tree depth; prediction is 32 times faster on a 100-tree model.

Addition of Hetero FastSecureBoost module, whose mixed/layered modeling method makes it twice as efficient as SecureBoost

Improved Hetero Federated Binning with 30%~50% time efficiency improvement

Better GLM: >10% improvement in time efficiency

FATE's first unsupervised learning algorithm: Hetero KMeans

Upgraded Hetero Feature Selection: add PSI filter and SecureBoost feature importance filter

Add Data Split module: splitting data into train, validate, and test sets inside FATE modeling workflow

Add DataStatistic module: compute min/max, mean, median, skewness, kurtosis, coefficient of variation, percentile, etc.

Add PSI module for computing population stability index

Add Homo OneHot module for one-hot encoding in homogeneous scenario

Evaluation module adds metrics for clustering

Optional FedProx mechanism for Homo LR, useful for training with non-iid data

Add Oblivious Transfer Protocol and OT-based module Secure Information Retrieval

Random Iterative Affine protocol, providing additional security
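
The population stability index computed by the PSI module listed above compares how two samples distribute over the same bins. A minimal sketch using the standard formula follows. The bin edges, the `eps` floor for empty bins and the function name are assumptions, not FATE's interface.

```python
import math
from bisect import bisect_right


def psi(expected, actual, edges, eps=1e-6):
    """Population stability index of `actual` against `expected` over shared bins."""

    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1  # index of the bin holding v
        # Assumed floor so an empty bin does not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical distributions give a PSI of zero; larger values indicate that the second sample has drifted away from the first.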

Fate-Flow

Brand new scheduling framework based on global state and optimistic concurrency control, with support for multiple schedulers

Upgraded task scheduling: multi-model output for component, executing component in parallel, component rerun

Add new DSL v2 which significantly improves user experiences in comparison to DSL v1. Several syntax error detection functions are supported in v2. Now DSL v1 and v2 are compatible in the current FATE version

Enhanced resource scheduling: remove the limit on the number of jobs; scheduling is based on cores, memory and working nodes, according to what each computing engine supports

Add model registry, supports model query, import/export, model transfer between clusters

Add Reader component: automatically dump input data to FATE-compatible format and cluster storage engine; currently supports data from HDFS

Refactor parameter settings in the submit job configuration; different parties can use different job parameters when using DSL v2.
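
The optimistic concurrency control mentioned for the new scheduling framework can be illustrated with a versioned compare-and-set: an update succeeds only if nobody else changed the record since it was read. The `JobStore` class below is a hypothetical in-memory stand-in and does not reflect FATE-Flow's actual schema or API.

```python
class JobStore:
    """Hypothetical job-state table keyed by job id, with a version per row."""

    def __init__(self):
        self._rows = {}  # job_id -> (version, status)

    def create(self, job_id, status):
        self._rows[job_id] = (0, status)

    def read(self, job_id):
        return self._rows[job_id]

    def compare_and_set(self, job_id, expected_version, new_status):
        """Apply the update only if the row still has the version the caller read."""
        version, _ = self._rows[job_id]
        if version != expected_version:
            return False  # another scheduler won the race; the caller should re-read
        self._rows[job_id] = (version + 1, new_status)
        return True
```

With this pattern several schedulers can act on the same job without holding locks; the loser of a race simply observes a failed update and retries from fresh state.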

System Architecture

New architectural framework that supports a combination of different computing, storage, and transfer engines

Support new engine combination: Spark, HDFS, RabbitMQ

New data table management, standardized API for all different storage engines

Rearrange FATE code structure, with conf settings in one place and a streamlined user experience

Support one-way network communication between parties; only one party needs to open an inbound network policy

FATE-Client

Pipeline, a tool with a Keras-like user interface that integrates TensorFlow, PyTorch and Keras in the backend, for fast federated model building with FATE

Brand new CLI v2 with easy independent installation, user-friendly programming syntax & command-line prompt

Support FLOW python language SDK

Support PyPI

FATE-Test

Testsuite: for FATE function regression testing

Benchmark tool and examples for comparing modeling quality; provided examples include common models such as heterogeneous LR, SecureBoost, and NN

Performance Statistics: Log now includes statistics on timing, API usage, and variable transfer

v1.4.6(Oct 22, 2020)

Major Features and Improvements

FederatedML

Add Column Expand Module

Add Scorecard Module

v1.5.0-preview(Oct 19, 2020)

Major Features and Improvements

FederatedML

Refactored Hetero FTL with optional communication-efficiency mechanism, with 4x time efficiency improvement

Hetero SecureBoost supports complete secure mode

Hetero SecureBoost now can reduce time consumption over highly sparse data by using sparse matrix computation on histogram aggregations.

Hetero SecureBoost optimization: the number of communication rounds in prediction is reduced to at most the tree depth; prediction is 32 times faster on a 100-tree model.

Addition of Hetero FastSecureBoost module, whose mixed/layered modeling method makes it twice as efficient as SecureBoost

Improved Hetero Federated Binning with 30%~50% time efficiency improvement

Better GLM: >10% improvement in time efficiency

FATE's first unsupervised learning algorithm: Hetero KMeans

Upgraded Hetero Feature Selection: add PSI filter and SecureBoost feature importance filter

Add Data Split module: splitting data into train, validate, and test sets inside FATE modeling workflow

Add DataStatistic module: compute min/max, mean, median, skewness, kurtosis, coefficient of variation, percentile, etc.

Add PSI module for computing population stability index

Add Homo OneHot module for one-hot encoding in homogeneous scenario

Evaluation module adds metrics for clustering

Optional FedProx mechanism for Homo LR, useful for training with non-iid data

Add Oblivious Transfer Protocol and OT-based module Secure Information Retrieval

Random Iterative Affine protocol, providing additional security
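
The FedProx mechanism listed above adds a proximal term (mu / 2) * ||w - w_global||^2 to each party's local objective, which keeps local updates close to the global model when data is non-iid. A minimal sketch of one local gradient step for logistic regression follows. The function name and hyperparameter defaults are illustrative and not taken from FATE's Homo LR.

```python
import math


def fedprox_step(w, w_global, X, y, lr=0.1, mu=0.01):
    """One local gradient step on the log-loss plus the FedProx proximal term."""
    n = len(X)
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
        for j, xj in enumerate(xi):
            grad[j] += (p - yi) * xj / n
    # The proximal gradient mu * (w - w_global) pulls w back toward the global model.
    return [
        wj - lr * (gj + mu * (wj - gj_global))
        for wj, gj, gj_global in zip(w, grad, w_global)
    ]
```

Setting `mu` to zero recovers a plain local gradient step, so the mechanism can be switched on per job without changing the rest of the training loop.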

Fate-Flow

Brand new scheduling framework based on global state and optimistic concurrency control, with support for multiple schedulers

Upgraded task scheduling: multi-model output for component, executing component in parallel, component rerun

Add new DSL v2 which significantly improves user experiences in comparison to DSL v1. Several syntax error detection functions are supported in v2. Now DSL v1 and v2 are compatible in the current FATE version

Enhanced resource scheduling: remove the limit on the number of jobs; scheduling is based on cores, memory and working nodes, according to what each computing engine supports

Add model registry, supports model query, import/export, model transfer between clusters

Add Reader component: automatically dump input data to FATE-compatible format and cluster storage engine; currently supports data from HDFS

System Architecture

New architectural framework that supports a combination of different computing, storage, and transfer engines

Support new engine combination: Spark, HDFS, RabbitMQ

New data table management, standardized API for all different storage engines

Rearrange FATE code structure, with conf settings in one place and a streamlined user experience

FATE-Client

Pipeline, a tool with a Keras-like user interface that integrates TensorFlow, PyTorch and Keras in the backend, for fast federated model building with FATE

Brand new CLI v2 with easy independent installation, user-friendly programming syntax & command-line prompt

Support FLOW python language SDK

Support PyPI

FATE-Test

Testsuite: for FATE function regression testing

Benchmark tool and examples for comparing modeling quality; provided examples include common models such as heterogeneous LR, SecureBoost, and NN

Performance Statistics: Log now includes statistics on timing, API usage, and variable transfer

v1.4.5(Sep 27, 2020)

Major Features and Improvements

EggRoll

RollSite supports communication certificates

v1.4.4(Sep 4, 2020)

Major Features and Improvements

FATE-Flow

Task Executor supports monkey patch

Add forward API

v1.4.3(Aug 12, 2020)

Major Features and Improvements

FederatedML

Fix Hetero SecureBoost bug of sending tree weight info from guest to host.

v1.4.2(Jul 27, 2020)

Major Features and Improvements

FederatedML

Optimize performance of Pearson, more than doubling its efficiency.

Optimize Min-test module: add secure-boost as an optional test task; set partyid and work_mode as input parameters; use pre-imported data sets as input to improve the test process.

Support top_k iv filter in feature selection module.

Support filling missing value for tag:value format data in DataIO.

Fix bug of a missing layer of tree depth in HeteroSecureBoost, and support automatic header alignment of input data in the predict process

Standardize the naming of example data set and add a data pre-import script.
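
The top_k IV filter in the feature selection item above keeps the k features with the highest information value. A minimal sketch, assuming IV values have already been computed per feature; the function name and input layout are hypothetical:

```python
def top_k_iv_filter(iv_by_feature, k):
    """Return the names of the k features with the largest IV, best first."""
    ranked = sorted(iv_by_feature.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

For example, `top_k_iv_filter({"x0": 0.02, "x1": 0.31, "x2": 0.15}, 2)` keeps `x1` and `x2` and drops the weakest feature.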

FATE-Flow

Distinguish between user stop job and system stop job;

Optimized some logs;

Optimize zookeeper configuration

The model supports persistent storage to mysql

Push the model to the online service to support the specified storage address (local file and FATEFlowServer interface)

v1.4.1(Jun 23, 2020)

Major Features and Improvements

FederatedML

Reconstructed Evaluation Module improves efficiency by 60 times

Add PSI, confusion matrix, f1-score and quantile threshold support for Precision/Recall in Evaluation.

Add option to retain duplicated keys in Union.

Support filter feature based on mode

Manual filter allows manually setting columns to retain

Automatically recognize whether a data set includes a label column in predict process

Bug-fix: Missing schema after merge in Union; Fail to align label of multi-class in homo_nn with PyTorch backend; Floating-point precision error and value error due to int-type input in Feature Scale
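
The confusion matrix and f1-score added to Evaluation above follow the usual binary definitions. A minimal local sketch, with an assumed score threshold and illustrative function names rather than the Evaluation module's interface:

```python
def confusion_matrix(y_true, scores, threshold=0.5):
    """Return (tp, fp, tn, fn) for binary labels and predicted scores."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Varying `threshold` and recomputing the counts gives the precision and recall values at different cut-offs.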

FATE-Flow

Allow the host to stop the job

Optimize the task queue

Automatically align the input table partitions of all participants when the job is running

Fate flow client large file upload optimization

Fixed some bugs with abnormal status

v1.4.0(May 15, 2020)

Major Features and Improvements

FederatedML

Support Homo Secureboost

Support AIC/BIC-based Stepwise for Linear Models

Add Hetero Optimal Feature Binning, support iv/gini/chi-square/ks, and allow asymmetric binning methods

Interoperate with AI ecosystem: Add pytorch backend for Homo NN

Homo Framework factorization, simplify developing homo algorithms

Early stopping strategy for hetero algorithms.

Local Baseline supports multi-class classification

Add consistency check to Predict function

Optimize validation strategy, 3x speed-up of in-training validation
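
The AIC/BIC-based Stepwise item above relies on two standard criteria: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of parameters, n the number of samples and L the likelihood. A minimal sketch of scoring candidate models follows. The `pick_model` helper and its input layout are hypothetical and do not show FATE's federated stepwise procedure.

```python
import math


def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    return n_params * math.log(n_samples) - 2 * log_likelihood


def pick_model(candidates, n_samples, criterion="aic"):
    """candidates maps a model name to (log_likelihood, n_params); lower score wins."""

    def score(item):
        log_likelihood, n_params = item[1]
        if criterion == "aic":
            return aic(log_likelihood, n_params)
        return bic(log_likelihood, n_params, n_samples)

    return min(candidates.items(), key=score)[0]
```

A stepwise search repeats this comparison each time a feature is added or removed, keeping the change only when the criterion improves.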

FATE-Flow

Refactor model management: native file directory storage, with a more flexible storage structure and more information

Support model import and export, and store and restore with a reliable distributed system (Redis is currently supported)

Using MySQL instead of Redis to implement Job Queue, reducing system complexity

Support for uploading client local files

Automatically detects the existence of the table and provides the destroy option

Separate system, algorithm and scheduling command logs; the scheduling command log can be audited independently

Eggroll

Stability Boosts:

New resource management components introduce a brand new session mechanism. Processors can be cleaned up with a simple method call, even if the session goes wrong.

Removes storage service. No C++ / native library compilation is needed.

Federated learning algorithms can still work at a 28% packet loss rate.

Performance Boosts:

Performance of federated learning algorithms is improved on Eggroll 2. Some algorithms get a 10x performance boost.

Join interface is 16x faster than pyspark under federated learning scenarios.

User Experiences Boosts:

Quick deployment. Maven, pip, config and start.

Light dependencies. Check our requirements.txt / pom.xml and see.

Easy debugging. Necessary running contexts are provided. Runtime status is kept in log files and databases.

Few daemon processes. And they are all JVM applications.

v1.3.1(Apr 18, 2020)
Major Features and Improvements

Deploy

Support deployment on macOS

Support using external db

Deploy JDK and Python environments on demand

Improve MySQL and FATE Flow service.sh

Support more custom deployment configurations in default_configurations.sh, such as ssh_port, mysql_port and so on.

v1.2.2(Mar 23, 2020)
Fix union component bug when there is only one input

Fix union component bug when the input table is empty

v1.3.0(Mar 6, 2020)

Major Features and Improvements

FederatedREC

Add federated recommendation submodule

Add heterogeneous Factorization Machine

Add homogeneous Factorization Machine

Add heterogeneous Matrix Factorization

Add heterogeneous Singular Value Decomposition

Add heterogeneous SVD++ (Factorization Meets the Neighborhood)

Add heterogeneous Generalized Matrix Factorization

FederatedML

Support sparse data training in heterogeneous General Linear Model (Hetero-LR, Hetero-LinR, Hetero-PoissonR)

Fix 32M limitation of quantile binning to support higher feature dimension

Fix 32M limitation of histogram statistics for SecureBoost to support higher feature dimension training.

Add abnormal parameters and input data detection in OneHot Encoder

Fix validation data not being passed to the fit process, so that validation data can be evaluated during training

Fate-Flow

Add clean job CLI for cleaning output and intermediate results, including data, metrics and sessions

Support for obtaining table namespace and name of output data via CLI

Fix KillJob unsuccessful execution in some special cases

Improve log system, add more exception and run time status prompts

v1.2.1(Mar 19, 2020)

modify download
v1.2.0(Dec 31, 2019)
Major Features and Improvements

FederatedML

Add heterogeneous Deep Neural Network

Add Secret-Sharing Protocol-SPDZ

Add heterogeneous feature correlation algorithm with SPDZ and support heterogeneous Pearson Correlation Calculation

Add heterogeneous SQN optimizer, available for Hetero-LogisticRegression and Hetero-LinearRegression, which can reduce communication costs significantly

Supports intersection for expanding duplicate IDs

Support multi-host in heterogeneous feature binning

Support multi-host in heterogeneous feature selection

Support IV calculation for categorical features in heterogeneous feature binning

Support transforming raw feature values to WOE in heterogeneous feature binning

Add manual filters in heterogeneous feature selection

Support performance comparison with sklearn's logistic regression

Automatic object/table clean in training iteration procedure in Federation

Improve transfer performance for large object

Add automatic scripts for submitting and running tasks
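
The SPDZ secret-sharing protocol listed above builds on additive secret sharing, where a value is split into random shares that sum to it modulo a fixed number. The sketch below shows only that building block and share-wise addition. SPDZ's MACs and multiplication triples are omitted, and the modulus and function names are illustrative assumptions.

```python
import random

MOD = 2**61 - 1  # illustrative modulus; all arithmetic is done modulo this value


def share(x, n_parties):
    """Split x into n_parties random shares that sum to x modulo MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares


def add_shares(a_shares, b_shares):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(a + b) % MOD for a, b in zip(a_shares, b_shares)]


def reconstruct(shares):
    """Combine all shares to reveal the value."""
    return sum(shares) % MOD
```

No single share reveals anything about the underlying value, yet sums can be computed share by share, which is the basis for secure statistics such as the Pearson correlation item above.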

FATE-Flow

Add data management module for recording the uploaded data tables and the model outputs produced while a job runs, with CLI commands for querying and cleaning up.

Support registration center for simplifying communication configuration between FATEFlow and FATEServing

Restructure model release logic: FATE-Flow pushes models directly to FATE-Serving. Decouple FATE-Serving and Eggroll; the offline and online architectures are connected only by FATE-Flow.

Provide CLI to query data upload record

Upload and download data support progress statistics by line

Add some abnormal diagnosis tips

Support adding note information to job

Native Deploy

Fix bugs in EggRoll startup script, add MySQL, Redis startup options.

Disable hostname resolution configuration for MySQL service.

The software packaging script now obtains the version number of each module automatically.
