A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to the ICLR 2021 DPML and MLSys 2021 GNNSys workshops.

Overview

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks

Datasets: http://moleculenet.ai/

Installation

After cloning this repository with git clone, run the following commands to install the dependencies.

conda create -n fedgraphnn python=3.7
conda activate fedgraphnn
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch -n fedgraphnn
conda install -c anaconda mpi4py grpcio
conda install scikit-learn numpy h5py setproctitle networkx
pip install -r requirements.txt 
cd FedML; git submodule init; git submodule update; cd ../;
pip install -r FedML/requirements.txt
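
As a quick sanity check after installation, a small script along the following lines can report which of the packages installed above are visible to the active environment. This is a minimal sketch, not part of the repository; the import names in REQUIRED_MODULES are assumptions derived from the conda commands above (for example, scikit-learn is imported as sklearn and grpcio as grpc).

```python
import importlib.util

# Import names assumed to correspond to the packages installed above
# (scikit-learn -> sklearn, grpcio -> grpc, pytorch -> torch).
REQUIRED_MODULES = [
    "torch", "torchvision", "mpi4py", "grpc", "sklearn",
    "numpy", "h5py", "setproctitle", "networkx",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it inside the fedgraphnn environment; any name it reports as missing points at a package to reinstall.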

Data Preparation

Experiments

Centralized Molecule Property Classification experiments

python experiments/centralized/moleculenet/molecule_classification_multilabel.py

Centralized Molecule Property Regression experiments

python experiments/centralized/moleculenet/molecule_regression_multivariate.py

Arguments for Centralized Training

This is a list of arguments used in centralized experiments.

--dataset --> Dataset used for training
--data_dir --> Data directory
--partition_method --> How to partition the dataset
--sage_hidden_size --> Size of the GraphSAGE hidden layer
--node_embedding_dim --> Dimensionality of the vector space the atoms will be embedded in
--sage_dropout --> Dropout used between GraphSAGE layers
--readout_hidden_dim --> Size of the readout hidden layer
--graph_embedding_dim --> Dimensionality of the vector space the molecule will be embedded in
--client_optimizer --> Optimizer function (Adam or SGD)
--lr --> Learning rate (default: 0.0015)
--wd --> Weight decay (default: 0.001)
--epochs --> Number of epochs
--frequency_of_the_test --> How frequently to run evaluation
--device --> GPU device used for training
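
To make the flag list above concrete, here is a minimal argparse sketch that declares the same flags. It is an illustration only, not the parser used by the experiment scripts. Only the defaults for --lr and --wd come from the list above; every other default and every type is an assumption (several values are borrowed from the example commands in this README).

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented centralized flags."""
    p = argparse.ArgumentParser(description="Centralized training (sketch)")
    # Defaults below are assumptions unless noted otherwise.
    p.add_argument("--dataset", type=str, default="sider")
    p.add_argument("--data_dir", type=str, default="./data/sider/")
    p.add_argument("--partition_method", type=str, default="homo")
    p.add_argument("--sage_hidden_size", type=int, default=256)
    p.add_argument("--node_embedding_dim", type=int, default=256)
    p.add_argument("--sage_dropout", type=float, default=0.3)
    p.add_argument("--readout_hidden_dim", type=int, default=256)
    p.add_argument("--graph_embedding_dim", type=int, default=256)
    p.add_argument("--client_optimizer", type=str, default="adam")
    # These two defaults are the ones documented in the list above.
    p.add_argument("--lr", type=float, default=0.0015)
    p.add_argument("--wd", type=float, default=0.001)
    p.add_argument("--epochs", type=int, default=10)
    p.add_argument("--frequency_of_the_test", type=int, default=5)
    p.add_argument("--device", type=int, default=0)
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Declaring the dimension flags as integers means a float such as 0.3 passed to --node_embedding_dim is rejected at parse time rather than failing later during model construction.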

Distributed/Federated Molecule Property Classification experiments

sh run_fedavg_distributed_pytorch.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256  sider "./../../../data/sider/" 0

## Run in the background
nohup sh run_fedavg_distributed_pytorch.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256  sider "./../../../data/sider/" 0 > ./fedavg-graphsage.log 2>&1 &

Distributed/Federated Molecule Property Regression experiments

sh run_fedavg_distributed_reg.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 freesolv "./../../../data/freesolv/" 0

## Run in the background
nohup sh run_fedavg_distributed_reg.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 freesolv "./../../../data/freesolv/" 0 > ./fedavg-graphsage.log 2>&1 &
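
The script names above refer to FedAvg, the federated optimizer run in these experiments. As a rough illustration of the aggregation step, the sketch below averages client parameter vectors weighted by each client's local sample count. It uses plain Python lists and is not FedML's implementation; the function name and data layout are assumptions chosen for readability.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample count.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   list of local sample counts, one per client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its data size.
            averaged[i] += w * size / total
    return averaged


if __name__ == "__main__":
    # Two toy clients holding 1 and 3 samples respectively.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In each round, the server would apply this kind of average to the parameters returned by the participating clients and send the result back for the next round of local training.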

Arguments for Distributed/Federated Training

The distributed/federated scripts take positional arguments in the order listed below. Note that this setting has additional parameters compared with centralized training.

CLIENT_NUM=$1 -> Number of clients in the distributed/federated setting
WORKER_NUM=$2 -> Number of workers
SERVER_NUM=$3 -> Number of servers
GPU_NUM_PER_SERVER=$4 -> Number of GPUs per server
MODEL=$5 -> Model name
DISTRIBUTION=$6 -> Dataset distribution: homo for IID splitting, hetero for non-IID splitting
ROUND=$7 -> Number of distributed/federated learning rounds
EPOCH=$8 -> Number of epochs to train clients' local models
BATCH_SIZE=$9 -> Batch size
LR=${10} -> Learning rate
SAGE_DIM=${11} -> Dimensionality of GraphSAGE embedding
NODE_DIM=${12} -> Dimensionality of node embeddings
SAGE_DR=${13} -> Dropout rate applied between GraphSAGE layers
READ_DIM=${14} -> Dimensionality of readout embedding
GRAPH_DIM=${15} -> Dimensionality of graph embedding
DATASET=${16} -> Dataset name (please check the data folder to see all available datasets)
DATA_DIR=${17} -> Dataset directory
CI=${18}
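
Because the arguments are positional, it is easy to shift a value into the wrong slot. The sketch below pairs each token of a command line with the name documented above, so a command can be inspected before it is run. The helper name map_positional is hypothetical, and the mapping reflects the order documented in this README, which may not match every version of the shell scripts.

```python
import shlex

# Positional names in the order documented above.
ARG_NAMES = [
    "CLIENT_NUM", "WORKER_NUM", "SERVER_NUM", "GPU_NUM_PER_SERVER",
    "MODEL", "DISTRIBUTION", "ROUND", "EPOCH", "BATCH_SIZE", "LR",
    "SAGE_DIM", "NODE_DIM", "SAGE_DR", "READ_DIM", "GRAPH_DIM",
    "DATASET", "DATA_DIR", "CI",
]


def map_positional(command):
    """Map 'sh <script> <args...>' onto the documented argument names."""
    tokens = shlex.split(command)
    values = tokens[2:]  # skip "sh" and the script name
    if len(values) != len(ARG_NAMES):
        raise ValueError(
            f"expected {len(ARG_NAMES)} arguments, got {len(values)}"
        )
    return dict(zip(ARG_NAMES, values))


if __name__ == "__main__":
    cmd = (
        'sh run_fedavg_distributed_pytorch.sh 6 1 1 1 graphsage homo 150 '
        '1 1 0.0015 256 256 0.3 256 256  sider "./../../../data/sider/" 0'
    )
    for name, value in map_positional(cmd).items():
        print(f"{name} = {value}")
```

A command with too many or too few values raises a ValueError, which makes an off-by-one mistake visible before any process is launched.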

Code Structure of FedGraphNN

  • FedML: A soft repository link generated using git submodule add https://github.com/FedML-AI/FedML.

  • data: Provides data downloading scripts and stores the downloaded datasets. Note that FedML/data also contains research datasets, but those are used for evaluating federated optimizers (e.g., FedAvg) and platforms. FedGraphNN supports more advanced datasets and models for federated training of graph neural networks. Currently, it includes molecular machine learning datasets.

  • data_preprocessing: Domain-specific PyTorch data loaders for centralized and distributed training.

  • model: GNN models.

  • trainer: Please define your own trainer.py by inheriting from the base class in FedML/fedml-core/trainer/fedavg_trainer.py. Some tasks can share the same trainer.

  • experiments/distributed:

  1. experiments is the entry point for training. It contains experiments on different platforms. We start with distributed.
  2. Every experiment integrates FOUR building blocks: FedML (federated optimizers), data_preprocessing, model, and trainer.
  3. To develop new experiments, please refer to the code at experiments/distributed/text-classification.

  • experiments/centralized:

  1. Please provide centralized training scripts in this directory.
  2. This is used to get the reference model accuracy for FL.
  3. You may need to accelerate your training through distributed training on multiple GPUs and multiple machines. Please refer to the code at experiments/centralized/DDP_demo.
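
The trainer building block described above follows a subclassing pattern: a base class fixes the interface, and each task supplies its own training logic. The sketch below illustrates only that pattern. BaseTrainer is a hypothetical stand-in, not the class in FedML/fedml-core/trainer/fedavg_trainer.py, whose real methods and signatures may differ; ToyTrainer and its one-parameter model are likewise invented for the example.

```python
from abc import ABC, abstractmethod


class BaseTrainer(ABC):
    """Hypothetical stand-in for a trainer base class (not FedML's API)."""

    def __init__(self, model):
        self.model = model

    @abstractmethod
    def get_model_params(self):
        """Return the parameters to send to the server."""

    @abstractmethod
    def set_model_params(self, params):
        """Load parameters received from the server."""

    @abstractmethod
    def train(self, train_data, lr):
        """Run local training on this client's data."""


class ToyTrainer(BaseTrainer):
    """Fits y = w * x with plain gradient descent on squared error."""

    def get_model_params(self):
        return dict(self.model)

    def set_model_params(self, params):
        self.model = dict(params)

    def train(self, train_data, lr):
        for x, y in train_data:
            # Gradient of (w * x - y) ** 2 with respect to w.
            grad = 2.0 * (self.model["w"] * x - y) * x
            self.model["w"] -= lr * grad


if __name__ == "__main__":
    trainer = ToyTrainer({"w": 0.0})
    trainer.train([(1.0, 2.0)] * 50, lr=0.1)
    print(trainer.get_model_params())
```

Keeping parameter exchange (get and set) separate from local training is what lets several tasks share one trainer, as noted in the trainer bullet.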

Update FedML Submodule

cd FedML
git checkout master && git pull
cd ..
git add FedML
git commit -m "updating submodule FedML to latest"
git push

Citation

Please cite our FedML paper if it helps your research. You can describe us in your paper like this: "We develop our experiments based on FedML".

@misc{he2021fedgraphnn,
      title={FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks}, 
      author={Chaoyang He and Keshav Balasubramanian and Emir Ceyani and Yu Rong and Peilin Zhao and Junzhou Huang and Murali Annavaram and Salman Avestimehr},
      year={2021},
      eprint={2104.07145},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Comments
  • AttributeError: 'Namespace' object has no attribute 'backend'

    I'm trying to follow the classification experiments from this link: https://github.com/FedML-AI/FedGraphNN/tree/main/experiments/distributed/moleculenet. The following command results in the error: sh run_fedavg_distributed_pytorch.sh 4 4 1 4 graphsage hetero 0.2 20 1 1 0.0015 64 32 0.3 64 64 sider "./../../../data/sider/" 0

    opened by akritikts 3
  • error: argument --node_embedding_dim: invalid int value: '0.3'

    I just tried to run the experiment with the following command: sh run_fedavg_distributed_pytorch.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 sider "./../../../data/sider/" 0

    opened by JLU-Neal 2
  • terminate called after throwing an instance of 'std::bad_alloc'

    Hi, I encountered the 'std::bad_alloc' problem when running ego-networks problems, any suggestions for debugging? terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc

    Thanks.

    opened by csshali 1
  • Environment settings for ego-networks

    Are the environment settings for ego-networks and moleculenet the same? I can run distributed experiments for moleculenet but fail to run that of ego-networks under the same environment.

    opened by csshali 1
  • invalid value for int argument: NODE_DIM

    I am trying to run the Distributed Molecule classification experiment: sh run_fedavg_distributed_reg.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 freesolv "./../../../data/freesolv/" 0

    opened by akritikts 1
  • A typo?

    For Installation:

    conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch -n fedmolecule

    The parameter 'fedmolecule' should perhaps be 'fedgraphnn'.

    opened by youngfish42 1
  • illegal hardware instruction  python

    I'm using an M1 Pro chip. When I try to run the centralized experiment, it reports "illegal hardware instruction". Is there any solution? Thanks.

    opened by toufunao 0
  • post_complete_message_to_sweep_process in main_fedavg.py is useless.

    In experiments/distributed/moleculenet/main_fedavg.py, the file ends with the following lines of code:

    if process_id == 0:
        post_complete_message_to_sweep_process(args)

    These two lines never actually run, because the application calls MPI_Abort to terminate the processes before reaching them.

    opened by JLU-Neal 0
  • The distributed experiment was stuck after creating model done

    I ran run_fedavg_distributed_pytorch but the experiment was stuck after creating the model done. What's wrong?

    2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 0, local sample number = 191
    2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 1, local sample number = 190
    2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 2, local sample number = 190
    2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 3, local sample number = 190
    2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 4, local sample number = 190
    2022-04-10,23:37:38.903 - {data_loader.py (453)} - load_partition_data(): Client idx = 5, local sample number = 190
    2022-04-10,23:37:38.904 - {main_fedavg.py (139)} - create_model(): create_model. model_name = graphsage, output_dim = None
    2022-04-10,23:37:38.929 - {main_fedavg.py (180)} - create_model(): done

    opened by csshali 3
  • Cannot load "ciao" and "epinions" datasets

    When loading the "ciao" and "epinions" datasets with function "load_partition_data", there is an error as follows:

    RuntimeError: The 'data' object was created by an older version of PyG. If this error occurred while loading an already existing dataset, remove the 'processed/' directory in the dataset's root folder and try again.

    Is there any solution to this error?

    opened by xueyuuu 1
  • About the missing code

    Hello! I am a beginner in knowledge graph embedding. This is very up-to-date work that has helped me a lot in understanding knowledge graph reasoning and federated learning. However, when I try to run the relation prediction code, I find that 'fed_subgraph_rel_trainer' is missing. Also, because of an access problem, I can't download the processed subgraph-level datasets such as FB15k-237 and WN18RR from the link you provided. Would you mind sharing these datasets and the missing file 'fed_subgraph_rel_trainer'?

    Thank you very much for the early reply.

    opened by 2645227346 0
  • [Missing file]  No module named 'training.subgraph_level.fed_subgraph_rel_trainer'

    Hi!

    When trying to run the code of "subgraph relation prediction", I received this feedback:

    Traceback (most recent call last):
      File "fed_subgraph_rel_pred.py", line 17, in <module>
        from training.subgraph_level.fed_subgraph_rel_trainer import FedSubgraphRelTrainer
    ModuleNotFoundError: No module named 'training.subgraph_level.fed_subgraph_rel_trainer'

    I find there is no file named "fed_subgraph_rel_trainer" under the folder "training/subgraph_level" . And the README of this part is the same as "Ego Networks".

    Hope you can complete the file. Thanks!

    opened by QQQHY 0