Defending graph neural networks against adversarial attacks (NeurIPS 2020)

GNNGuard: Defending Graph Neural Networks against Adversarial Attacks

Authors: Xiang Zhang ([email protected]), Marinka Zitnik ([email protected])

Project website

Overview

This repository contains the Python code and datasets needed to run the GNNGuard algorithm. GNNGuard is a general defense against a variety of poisoning adversarial attacks that perturb the discrete graph structure. It can be straightforwardly incorporated into any GNN model to prevent the misclassification caused by such attacks. Please see our paper for more details on the algorithm.

Key Idea of GNNGuard

Deep learning methods for graphs achieve remarkable performance on many tasks. Despite their proliferation and success, however, recent findings indicate that small, unnoticeable perturbations of graph structure can catastrophically reduce the performance of even the strongest and most popular Graph Neural Networks (GNNs). When integrated with GNNGuard, a GNN classifier can correctly classify the target node even under strong adversarial attacks.

The key idea of GNNGuard is to detect and quantify the relationship between the graph structure and node features, if one exists, and then exploit that relationship to mitigate the negative effects of the attack. GNNGuard learns how best to assign higher weights to edges connecting similar nodes while pruning edges between unrelated nodes. Specifically, instead of the neural message passing of a typical GNN (shown as A), GNNGuard (B) controls the message stream, blocking messages from irrelevant neighbors while strengthening messages from highly related ones. Importantly, ours is the first model that can defend heterophily graphs (e.g., graphs with structural equivalence), whereas all existing defenders consider only homophily graphs.
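To make the idea concrete, here is a minimal sketch of the edge-weighting step, assuming cosine similarity between endpoint features and a fixed pruning threshold p0 (the paper's Eq. (5) uses a learned, normalized variant of this):

import torch
import torch.nn.functional as F

def edge_weights(x, edge_index, p0=0.1):
    """Weight each edge by the cosine similarity of its endpoint
    features; prune (zero out) edges whose similarity falls below p0."""
    row, col = edge_index
    sim = F.cosine_similarity(x[row], x[col], dim=1)         # one score per edge
    sim = torch.where(sim < p0, torch.zeros_like(sim), sim)  # prune weak edges
    return sim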

Running the code

GNNGuard is evaluated under three typical adversarial attacks: Direct Targeted Attack (Nettack-Di), Influence Targeted Attack (Nettack-In), and Non-Targeted Attack (Mettack). In the GNNGuard folder, Nettack-Di.py, Nettack-In.py, and Mettack.py correspond to these three attacks.

For example, to check the performance of GCN without defense under a direct targeted attack, run:

python Nettack-Di.py --dataset Cora  --modelname GCN --GNNGuard False

To turn on the GNNGuard defense, run:

python Nettack-Di.py --dataset Cora  --modelname GCN --GNNGuard True

Note: Please uncomment the defense models (line 144 of Nettack-Di.py) to test different defense models.

Citing

If you find GNNGuard useful for your research, please consider citing this paper:

@inproceedings{zhang2020gnnguard,
title     = {GNNGuard: Defending Graph Neural Networks against Adversarial Attacks},
author    = {Zhang, Xiang and Zitnik, Marinka},
booktitle = {NeurIPS},
year      = {2020}
}

Requirements

GNNGuard is tested to work under Python >=3.5.

Recent versions of PyTorch, torch-geometric, NumPy, and SciPy are required. All required packages can be installed with:

pip install -r requirements.txt

Note: For torch-geometric and its related dependencies (e.g., torch-cluster, torch-scatter, torch-sparse), higher versions may work but have not been tested yet.

Install DeepRobust

During evaluation, the adversarial attacks on graphs are performed with DeepRobust from MSU; please install it by:

git clone https://github.com/DSE-MSU/DeepRobust.git
cd DeepRobust
python setup.py install
  Note: If you have trouble installing DeepRobust, try replacing the original DeepRobust-master/setup.py with the provided defense/setup.py, then manually reinstall it by
python setup.py install
  1. We extend the original DeepRobust from a single GCN to multiple GNN variants, including GAT, GIN, Jumping Knowledge, and GCN-SAINT. After installing DeepRobust, please replace the original folder DeepRobust-master/deeprobust/graph/defense with the defense folder provided in our repository!

  2. To better plug GNNGuard into the geometric code, we slightly revised some functions in torch_geometric. Please use the three files under our provided nn/conv/ to replace the corresponding files in the installed torch_geometric folder (for example, the folder path could be /home/username/.local/lib/python3.5/site-packages/torch_geometric/nn/conv/).
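If you are unsure where torch_geometric is installed, the nn/conv folder can be located from Python itself:

import os
import torch_geometric

# Print the directory containing the conv files to be replaced
print(os.path.join(os.path.dirname(torch_geometric.__file__), 'nn', 'conv'))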

Note: 1) Don't forget to back up the original files when replacing anything, in case you need them elsewhere! 2) Please install the corresponding CUDA version if you are using a GPU.

Datasets

Here we provide the datasets (Cora, Citeseer, ogbn-arxiv, and DP) used in the GNNGuard paper.

The ogbn-arxiv dataset can be easily accessed from Python:

from ogb.nodeproppred import PygNodePropPredDataset
dataset = PygNodePropPredDataset(name = 'ogbn-arxiv')
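The graph object and the standard train/valid/test split then follow ogb's usual API:

data = dataset[0]                    # the full ogbn-arxiv graph
split_idx = dataset.get_idx_split()  # dict with 'train', 'valid', 'test'
train_idx = split_idx['train']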

More details about the ogbn-arxiv dataset can be found here.

Find more details about the Disease Pathway dataset here.

For graphs with structural roles, a prominent type of heterophily, we calculate node similarity using the graphlet degree vector instead of node embeddings. The graphlet degree vectors are counted with the Orbit Counting Algorithm (Orca).
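As a sketch of that computation, Orca outputs one row of orbit counts per node, and node similarity is then the cosine similarity between those rows (the file name below is hypothetical):

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Each row of the (hypothetical) Orca output file is one node's
# graphlet degree vector, i.e., its per-orbit counts
gdv = np.loadtxt('orca_output.txt')  # shape: (num_nodes, num_orbits)
sim = cosine_similarity(gdv)         # structural-role similarity matrix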

Miscellaneous

Please send any questions you might have about the code and/or the algorithm to [email protected].

License

GNNGuard is licensed under the MIT License.

Comments
  • Questions regarding parameters for experiments in the paper

    I have some questions regarding the parameters you used for the experiments in the paper:

    • In Appendix E of the paper, you mention setting P_0 = 0.5. I assume this is the same P_0 as used in Eq. (5) for pruning edges after importance estimation? In your code, as you replied in another issue, P_0 seems to be 0.1. Which value, 0.1 or 0.5, did you use for the experiments in your paper?
    • I saw in line 116 of your GCN implementation that the layer-wise graph memory seems to be disabled. Is that also the case for the experiments in your paper? In other words, what $\beta$ value did you use in Eq. (7) (see the sketch below)?
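    For reference, my reading of the layer-wise graph memory in Eq. (7) is the convex combination below (a sketch; adj_prev and adj_new stand for the previous and current layers' edge-weight matrices):

    import torch

    def graph_memory(adj_prev: torch.Tensor, adj_new: torch.Tensor,
                     beta: float) -> torch.Tensor:
        # Eq. (7) as I read it: beta keeps a share of the previous layer's graph
        return beta * adj_prev + (1 - beta) * adj_new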
    opened by jiong-zhu 2
  • Support for GPU.

    I had some issues running on GPU. The code should be changed as follows (only for GCN as the base model).

    1. In the gcn_conv.py file that replaces the one in torch_geometric, add the following code block before class GCN(MessagePassing):
    # Imports required by this block (mirroring torch_geometric's gcn_conv.py)
    from typing import Optional
    import torch
    from torch import Tensor
    from torch_scatter import scatter_add
    from torch_sparse import SparseTensor, fill_diag, mul
    from torch_sparse import sum as sparsesum
    from torch_geometric.typing import OptTensor, PairTensor
    from torch_geometric.utils import add_remaining_self_loops
    from torch_geometric.utils.num_nodes import maybe_num_nodes


    @torch.jit._overload
    def gcn_norm(edge_index, edge_weight=None, num_nodes=None, improved=False,
                 add_self_loops=True, dtype=None):
        # type: (Tensor, OptTensor, Optional[int], bool, bool, Optional[int]) -> PairTensor  # noqa
        pass
    
    
    @torch.jit._overload
    def gcn_norm(edge_index, edge_weight=None, num_nodes=None, improved=False,
                 add_self_loops=True, dtype=None):
        # type: (SparseTensor, OptTensor, Optional[int], bool, bool, Optional[int]) -> SparseTensor  # noqa
        pass
    
    
    def gcn_norm(edge_index, edge_weight=None, num_nodes=None, improved=False,
                 add_self_loops=True, dtype=None):
    
        fill_value = 2. if improved else 1.
    
        if isinstance(edge_index, SparseTensor):
            adj_t = edge_index
            if not adj_t.has_value():
                adj_t = adj_t.fill_value(1., dtype=dtype)
            if add_self_loops:
                adj_t = fill_diag(adj_t, fill_value)
            deg = sparsesum(adj_t, dim=1)  # torch_sparse's sum, not the Python builtin
            deg_inv_sqrt = deg.pow_(-0.5)
            deg_inv_sqrt.masked_fill_(deg_inv_sqrt == float('inf'), 0.)
            adj_t = mul(adj_t, deg_inv_sqrt.view(-1, 1))
            adj_t = mul(adj_t, deg_inv_sqrt.view(1, -1))
            return adj_t
    
        else:
            num_nodes = maybe_num_nodes(edge_index, num_nodes)
    
            if edge_weight is None:
                edge_weight = torch.ones((edge_index.size(1), ), dtype=dtype,
                                         device=edge_index.device)
    
            if add_self_loops:
                edge_index, tmp_edge_weight = add_remaining_self_loops(
                    edge_index, edge_weight, fill_value, num_nodes)
                assert tmp_edge_weight is not None
                edge_weight = tmp_edge_weight
    
            row, col = edge_index[0], edge_index[1]
            deg = scatter_add(edge_weight, col, dim=0, dim_size=num_nodes)
            deg_inv_sqrt = deg.pow_(-0.5)
            deg_inv_sqrt.masked_fill_(deg_inv_sqrt == float('inf'), 0)
            return edge_index, deg_inv_sqrt[row] * edge_weight * deg_inv_sqrt[col]
    
    2. Add the following to the norm method in gcn_conv.py:
    @staticmethod
    def norm(edge_index, num_nodes, edge_weight=None, improved=False,
             dtype=None):
        if edge_weight is None:
            edge_weight = torch.ones((edge_index.size(1), ), dtype=dtype,
                                     device=edge_index.device)

        # Add this line: keep the weights on the same device as edge_index
        edge_weight = edge_weight.to(edge_index.device)

        fill_value = 1 if not improved else 2
    
    3. In defense/gcn.py, modify the forward method as follows:
    def forward(self, x, adj):
        """we don't change the edge_index, just update the edge_weight;
        some edge_weight are regarded as removed if it equals to zero"""
        x = x.to_dense()
    
        """GCN and GAT"""
        if self.attention:
            adj = self.att_coef(x, adj, i=0)
        # Add this line
        edge_index = adj._indices().to(self.device)
    
        x = self.gc1(x, edge_index, edge_weight=adj._values())
        x = F.relu(x)
        # x = self.bn1(x)
        if self.attention:  # if attention=True, use attention mechanism
            adj_2 = self.att_coef(x, adj, i=1)
            adj_memory = adj_2.to_dense()  # without memory
            # adj_memory = self.gate * adj.to_dense() + (1 - self.gate) * adj_2.to_dense()
            row, col = adj_memory.nonzero()[:,0], adj_memory.nonzero()[:,1]
            edge_index = torch.stack((row, col), dim=0)
            adj_values = adj_memory[row, col]
        else:
            edge_index = adj._indices()
            adj_values = adj._values()
        # Add these lines: move the indices and values to the model's device
        edge_index = edge_index.to(self.device)
        adj_values = adj_values.to(self.device)
    
        x = F.dropout(x, self.dropout, training=self.training)
        x = self.gc2(x, edge_index, edge_weight=adj_values)
    

    I haven't used the other models as the base model on GPU, but hopefully the code above will help those who are using GPUs. Cheers!

    opened by kinkunchan 1
  • can you run mettack.py?

    It seems type-casting errors (between scipy sparse, torch sparse, and dense torch tensors) are everywhere in the model implementation. Simply running python Mettack.py gives me a runtime error. There seem to be so many bugs in the code that I can't fix them by myself. Could you provide cleaner code, or tell me how to run Mettack.py without an error?
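    For anyone hitting the same casting issues, a generic conversion helper between the two formats looks like this (not the repository's own code):

    import numpy as np
    import torch

    def sparse_mx_to_torch_sparse_tensor(sparse_mx):
        """Convert a scipy sparse matrix to a torch sparse COO tensor."""
        sparse_mx = sparse_mx.tocoo().astype(np.float32)
        indices = torch.from_numpy(
            np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
        values = torch.from_numpy(sparse_mx.data)
        return torch.sparse_coo_tensor(indices, values,
                                       torch.Size(sparse_mx.shape))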

    opened by FFTYYY 1
  • A question about the pruning procedure

    Good paper! But I have a question. As described in the paper, GNNGuard prunes graph edges according to Equation (5), but I cannot find any code that does this. Could you point out the location of the pruning code?

    opened by wzfhaha 1
  • Bump numpy from 1.18.1 to 1.22.0

    Automated Dependabot pull request bumping numpy from 1.18.1 to 1.22.0.

    dependencies
    opened by dependabot[bot] 0
  • A question on hyperparameters

    Hi,

    Thanks for your great work!

    I have trouble reproducing your results with GIN on Cora under 20% Metattack. I used the pre-perturbed data provided by DeepRobust with Pro-GNN splits (data and splits). However, I only got 58.22±4.04 (10 runs), which is far from the 72.2 reported in your paper. Even if the dataset split in Pro-GNN differs from yours, I don't think the results should differ this much.

    So I wonder if I made a mistake with the hyperparameters. I followed the hyperparameters in your paper and your code:

        epochs = 200,
        patience = 10,
        lr = 0.01,
        weight_decay = 5e-4,
        hidden = 16,
        dropout = 0.5,
        modelname = 'GIN',
        GNNGuard = True,
        seed = 15,
    

    Could you take a look and tell me which hyperparameter values I should use? Thanks.

    P.S. To save memory, I changed cosine_similarity in defense/gin.py:

    https://github.com/mims-harvard/GNNGuard/blob/33f5390bb5adff3f8bed5bcb01e9ba50e2c74c38/defense/gin.py#L157-L159

    to the following:

    from sklearn.preprocessing import normalize
    
    def paired_cosine_similarity(X, i, j):
        X_normed = normalize(X)
        return (X_normed[i] * X_normed[j]).sum(1)
    
    sim = paired_cosine_similarity(fea_copy, row, col)
    

    I believe this code is equivalent to yours.
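    A quick sanity check of that equivalence on random data (shapes are arbitrary):

    import numpy as np
    from sklearn.metrics.pairwise import cosine_similarity
    from sklearn.preprocessing import normalize

    X = np.random.rand(100, 16)
    row = np.random.randint(0, 100, size=50)
    col = np.random.randint(0, 100, size=50)

    full = cosine_similarity(X)[row, col]  # dense n x n matrix, memory-heavy
    paired = (normalize(X)[row] * normalize(X)[col]).sum(1)  # needed pairs only

    assert np.allclose(full, paired)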

    opened by ocrinz 0
  • Question upon running dataset ogbn-arxiv

    Hi Xiang,

    Thanks for your impressive work! I have a question regarding running the ogbn-arxiv dataset. It seems that directly running this large dataset under the current framework on a single GPU is not possible. Could you provide any tips on efficiently conducting robust evaluation on ogbn-arxiv? Specifically, I have several questions:

    • Do you use any sampling techniques when training the GNN models? If so, what sampling method would you recommend?
    • Do you run ogbn-arxiv on a single GPU or on CPU? If it can be done on a single GPU, how much memory does it need?
    • It seems it would take weeks for Mettack and Nettack to finish attacking such a large network; if so, this would be infeasible, especially in the poisoning setting. I am not sure whether this observation is correct.

    Thanks in advance!

    opened by SwiftieH 0
  • Question about edge pruning

    Hi,

    Thanks for the great work! I have a question about the code at https://github.com/mims-harvard/GNNGuard/blob/33f5390bb5adff3f8bed5bcb01e9ba50e2c74c38/defense/gcn.py#L184, which drops edges based on the similarity score in Equation 3 of the paper. However, the paper only mentions dropping edges based on Equation 5, not Equation 3. Could you kindly explain why this line is in the code? Or is there something I'm missing?

    Thanks!

    opened by Chenhui1016 1