Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)

NBFNet: Neural Bellman-Ford Networks

This is the official codebase of the paper

Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction

Zhaocheng Zhu, Zuobai Zhang, Louis-Pascal Xhonneux, Jian Tang

NeurIPS 2021

Overview

NBFNet is a graph neural network framework inspired by traditional path-based methods. It enjoys the advantages of both traditional path-based methods and modern graph neural networks, including generalization in the inductive setting, interpretability, high model capacity and scalability. NBFNet can be applied to solve link prediction on both homogeneous graphs and knowledge graphs.

This codebase is based on PyTorch and TorchDrug. It supports training and inference with multiple GPUs or multiple machines.

Installation

You may install the dependencies via either conda or pip. Generally, NBFNet works with Python 3.7/3.8 and PyTorch version >= 1.8.0.

From Conda

conda install torchdrug pytorch=1.8.2 cudatoolkit=11.1 -c milagraph -c pytorch-lts -c pyg -c conda-forge
conda install ogb easydict pyyaml -c conda-forge

From Pip

pip install torch==1.8.2+cu111 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
pip install torchdrug
pip install ogb easydict pyyaml
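To confirm that an environment falls inside the range stated above (Python 3.7/3.8, PyTorch >= 1.8.0), a small check like the following sketch may help. The helper name `is_supported` is hypothetical and not part of the NBFNet codebase. It only parses version strings, so it can be run before PyTorch is installed.

```python
import re
import sys


def is_supported(python_version, torch_version):
    """Return True if the versions fall in the range stated in this README.

    Hypothetical helper (not part of NBFNet): Python 3.7/3.8, PyTorch >= 1.8.0.
    """
    py_ok = tuple(python_version[:2]) in {(3, 7), (3, 8)}
    # Keep only the leading numeric part, e.g. "1.8.2+cu111" -> (1, 8, 2).
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", torch_version)
    if match is None:
        return False
    torch_tuple = tuple(int(part or 0) for part in match.groups())
    return py_ok and torch_tuple >= (1, 8, 0)


# Check the running interpreter against the PyTorch build pinned above.
print(is_supported(sys.version_info, "1.8.2+cu111"))
```

With PyTorch already installed, `torch.__version__` can be passed as the second argument instead of the literal string.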

Reproduction

To reproduce the results of NBFNet, use the following command. All datasets are downloaded automatically by the code.

python script/run.py -c config/inductive/wn18rr.yaml --gpus [0] --version v1

We provide the hyperparameters for each experiment in configuration files. All the configuration files can be found in config/*/*.yaml.
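To see which experiments are available, the `config/*/*.yaml` pattern can be expanded from Python. This is a minimal sketch that assumes it is run from the repository root. The function name `list_configs` is made up for illustration.

```python
from pathlib import Path


def list_configs(root="config"):
    """Return all YAML configs matching <root>/*/*.yaml, sorted by path."""
    return sorted(Path(root).glob("*/*.yaml"))


# Prints nothing if the directory does not exist.
for path in list_configs():
    print(path)
```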

For experiments on inductive relation prediction, you need to additionally specify the split version with --version v1.

To run NBFNet with multiple GPUs or multiple machines, use the following commands

python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3]
python -m torch.distributed.launch --nnodes=4 --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
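In both commands the `--gpus` list has one entry per process: 4 entries for one node with 4 processes, and 16 entries for 4 nodes with 4 processes each. The sketch below builds that list, assuming the pattern is the local device indices repeated once per node. This is inferred from the example commands, not from the codebase, and the helper name `build_gpus_arg` is hypothetical.

```python
def build_gpus_arg(nnodes, nproc_per_node):
    """Build the value for --gpus: local GPU indices repeated once per node.

    Hypothetical helper; the pattern is inferred from the example commands.
    """
    if nnodes < 1 or nproc_per_node < 1:
        raise ValueError("nnodes and nproc_per_node must be positive")
    devices = list(range(nproc_per_node)) * nnodes
    return "[" + ",".join(str(d) for d in devices) + "]"


print(build_gpus_arg(1, 4))  # [0,1,2,3]
print(build_gpus_arg(4, 4))  # [0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
```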

Visualize Interpretations on FB15k-237

Once you have models trained on FB15k-237, you can visualize the path interpretations with the following command. Please replace the checkpoint with your own path.

python script/visualize.py -c config/knowledge_graph/fb15k237_visualize.yaml --checkpoint /path/to/nbfnet/experiment/model_epoch_20.pth

Evaluate ogbl-biokg

Due to the large size of ogbl-biokg, we only evaluate on a small portion of the validation set during training. The following line evaluates a model on the full validation / test sets of ogbl-biokg. Please replace the checkpoint with your own path.

python script/run.py -c config/knowledge_graph/ogbl-biokg_test.yaml --checkpoint /path/to/nbfnet/experiment/model_epoch_10.pth

Results

Here are the results of NBFNet on standard benchmark datasets. All results were obtained with 4 V100 GPUs (32GB). Note that results may differ slightly if the model is trained with 1 GPU and/or a smaller batch size.

Knowledge Graph Completion

Dataset MR MRR HITS@1 HITS@3 HITS@10
FB15k-237 114 0.415 0.321 0.454 0.599
WN18RR 636 0.551 0.497 0.573 0.666
ogbl-biokg - 0.829 0.768 0.870 0.946
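The columns in this table are standard ranking metrics. As a reminder of their definitions, the sketch below computes them from a list of 1-based ranks of the true entities. It is a generic illustration and not the evaluation code used by NBFNet. The function name `ranking_metrics` and the toy ranks are made up.

```python
def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MR, MRR and HITS@k from 1-based ranks of the true entities."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                    # mean rank, lower is better
        "MRR": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
    }
    for k in ks:
        # Fraction of queries whose true entity is ranked within the top k.
        metrics[f"HITS@{k}"] = sum(r <= k for r in ranks) / n
    return metrics


# Toy ranks for illustration only.
print(ranking_metrics([1, 2, 4, 20]))
```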

Homogeneous Graph Link Prediction

Dataset AUROC AP
Cora 0.956 0.962
CiteSeer 0.923 0.936
PubMed 0.983 0.982

Inductive Relation Prediction

Dataset      HITS@10 (50 samples)
             v1      v2      v3      v4
FB15k-237    0.834   0.949   0.951   0.960
WN18RR       0.948   0.905   0.893   0.890
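"HITS@10 (50 samples)" means each positive triplet is ranked against 50 sampled negatives rather than against all entities. The following is a rough sketch of that idea only. It is not taken from the NBFNet codebase, the tie handling (a negative outranks the positive only with a strictly higher score) is an assumption, and the function names and scores are invented for illustration.

```python
def sampled_rank(pos_score, neg_scores):
    """1-based rank of the positive among its sampled negatives.

    Assumption: a negative outranks the positive only with a strictly higher score.
    """
    return 1 + sum(score > pos_score for score in neg_scores)


def hits_at_k(pos_scores, neg_scores_per_pos, k=10):
    """Fraction of positives ranked within the top k of their own negatives."""
    ranks = [sampled_rank(p, negs) for p, negs in zip(pos_scores, neg_scores_per_pos)]
    return sum(rank <= k for rank in ranks) / len(ranks)


# Toy scores for illustration only: two positives, 50 negatives each.
negatives = [[i / 50 for i in range(50)] for _ in range(2)]
print(hits_at_k([0.95, 0.5], negatives, k=10))
```

In this toy input the first positive is beaten by 2 negatives (rank 3, a hit) and the second by 24 negatives (rank 25, a miss).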

Frequently Asked Questions

  1. The code is stuck at the beginning of epoch 0.

    This is probably because the JIT cache is broken. Try rm -r ~/.cache/torch_extensions/* and run the code again.
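The same cleanup can be done from Python, which is convenient on systems without a POSIX shell. This is a minimal sketch: the function name `clear_extension_cache` is hypothetical, and the fallback to the `TORCH_EXTENSIONS_DIR` environment variable assumes the cache location was overridden through it.

```python
import os
import shutil
from pathlib import Path


def clear_extension_cache(cache_dir=None):
    """Delete every entry inside the JIT extension cache and return their names.

    Defaults to TORCH_EXTENSIONS_DIR if set, else ~/.cache/torch_extensions.
    """
    if cache_dir is None:
        cache_dir = os.environ.get("TORCH_EXTENSIONS_DIR", "~/.cache/torch_extensions")
    root = Path(cache_dir).expanduser()
    removed = []
    if not root.is_dir():
        return removed  # nothing has been cached yet
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # compiled extension build directory
        else:
            entry.unlink()        # stray file or symlink
        removed.append(entry.name)
    return removed
```

Calling `clear_extension_cache()` with no arguments removes the cached builds, so the extensions are recompiled on the next run.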

Citation

If you find this codebase useful in your research, please cite the following paper.

@article{zhu2021neural,
  title={Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction},
  author={Zhu, Zhaocheng and Zhang, Zuobai and Xhonneux, Louis-Pascal and Tang, Jian},
  journal={arXiv preprint arXiv:2106.06935},
  year={2021}
}
Comments
  • [Feature Request] `Dockerfile` / `environment.yml` for better reproducibility

    Congratulations to the authors on NeurIPS'21, looking forward to your talk during LoGaG.

    While installing the project on VMs and local systems, I've been running into multiple issues getting the correct package versions installed, from CUDA errors while installing torch-scatter and torchdrug to pybind11 issues. A Dockerfile would help prevent such errors and make reproducibility and experimentation easier.

    I think it would be easier and better to have a Docker image for torchdrug itself, which the NBFNet image could then use as its base image. More than happy to take this up.

    This way one could also use the NVIDIA Container Toolkit to run experiments across multiple GPUs/nodes easily.

    opened by SauravMaheshkar 7
  • JIT compile fail when using `functional.generalized_rspmm` with CUDA on Linux

    Hey,

    This is most likely an error in TorchDrug itself, but when I try to run any of the examples from the README, the code crashes with the following error:

    spmm.cuda.o.d -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/miniconda3/envs/path/lib/python3.8/site-packages/torch/include/TH -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include/THC -isystem /opt/scp/software/CUDA/11.1.0/include -isystem /home/miniconda3/envs/path/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu -o spmm.cuda.o
    FAILED: spmm.cuda.o
    /opt/scp/software/CUDA/11.1.0/bin/nvcc --generate-dependencies-with-compile --dependency-output spmm.cuda.o.d -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include/TH -isystem /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torch/include/THC -isystem /opt/scp/software/CUDA/11.1.0/include -isystem /home/user/miniconda3/envs/path/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/user/miniconda3/envs/path/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu -o spmm.cuda.o
    /opt/software/CUDA/11.1.0/include/cuComplex.h: In function ‘float cuCabsf(cuFloatComplex)’:
    /opt/software/CUDA/11.1.0/include/cuComplex.h:179:16: error: expected ‘)’ before numeric constant
    

    This only occurs on a Linux GPU machine, which uses CUDA 11.1 and GCC 10.3.

    The conda env is as follows:

    blas                      1.0                         mkl
    boost                     1.74.0           py38hc10631b_3    conda-forge
    boost-cpp                 1.74.0               h9359b55_0    conda-forge
    brotlipy                  0.7.0           py38h497a2fe_1001    conda-forge
    bzip2                     1.0.8                h7f98852_4    conda-forge
    ca-certificates           2021.10.8            ha878542_0    conda-forge
    cairo                     1.16.0            h3fc0475_1005    conda-forge
    certifi                   2021.10.8        py38h578d9bd_1    conda-forge
    cffi                      1.15.0           py38hd667e15_1
    charset-normalizer        2.0.10             pyhd8ed1ab_0    conda-forge
    colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
    cryptography              35.0.0           py38ha5dfef3_0    conda-forge
    cudatoolkit               11.1.1               h6406543_8    conda-forge
    cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
    decorator                 4.4.2                      py_0    conda-forge
    easydict                  1.9                        py_0    conda-forge
    fontconfig                2.13.1            hba837de_1005    conda-forge
    freetype                  2.10.4               h0708190_1    conda-forge
    glib                      2.69.1               h4ff587b_1
    icu                       67.1                 he1b5a44_0    conda-forge
    idna                      3.3                pyhd8ed1ab_0    conda-forge
    intel-openmp              2021.4.0          h06a4308_3561
    jinja2                    3.0.3              pyhd8ed1ab_0    conda-forge
    joblib                    1.1.0              pyhd8ed1ab_0    conda-forge
    jpeg                      9d                   h36c2ea0_0    conda-forge
    kiwisolver                1.3.1            py38h2531618_0
    ld_impl_linux-64          2.35.1               h7274673_9
    libffi                    3.3                  he6710b0_2
    libgcc-ng                 9.3.0               h5101ec6_17
    libgfortran-ng            7.5.0               h14aa051_19    conda-forge
    libgfortran4              7.5.0               h14aa051_19    conda-forge
    libgomp                   9.3.0               h5101ec6_17
    libiconv                  1.16                 h516909a_0    conda-forge
    libpng                    1.6.37               h21135ba_2    conda-forge
    libstdcxx-ng              9.3.0               hd4cf53a_17
    libtiff                   4.0.10            hc3755c2_1005    conda-forge
    libuuid                   2.32.1            h7f98852_1000    conda-forge
    libuv                     1.42.0               h7f98852_0    conda-forge
    libxcb                    1.13              h7f98852_1003    conda-forge
    libxml2                   2.9.10               h68273f3_2    conda-forge
    littleutils               0.2.2                      py_0    conda-forge
    lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
    markupsafe                2.0.1            py38h497a2fe_0    conda-forge
    matplotlib                3.2.2                         1    conda-forge
    matplotlib-base           3.2.2            py38h5d868c9_1    conda-forge
    mkl                       2021.4.0           h06a4308_640
    mkl-service               2.4.0            py38h497a2fe_0    conda-forge
    mkl_fft                   1.3.1            py38hd3c417c_0
    mkl_random                1.2.2            py38h1abd341_0    conda-forge
    ncurses                   6.3                  h7f8727e_2
    networkx                  2.5.1              pyhd8ed1ab_0    conda-forge
    ninja                     1.10.2               h4bd325d_0    conda-forge
    numpy                     1.21.2           py38h20f2e39_0
    numpy-base                1.21.2           py38h79a1101_0
    ogb                       1.3.2              pyhd8ed1ab_0    conda-forge
    olefile                   0.46               pyh9f0ad1d_1    conda-forge
    openssl                   1.1.1m               h7f8727e_0
    outdated                  0.2.1              pyhd8ed1ab_0    conda-forge
    pandas                    1.2.5            py38h1abd341_0    conda-forge
    pcre                      8.45                 h9c3ff4c_0    conda-forge
    pillow                    6.2.1            py38h6b7be26_0    conda-forge
    pip                       21.2.4           py38h06a4308_0
    pixman                    0.38.0            h516909a_1003    conda-forge
    pthread-stubs             0.4               h36c2ea0_1001    conda-forge
    pycairo                   1.20.1           py38hf61ee4a_0    conda-forge
    pycparser                 2.21               pyhd8ed1ab_0    conda-forge
    pyopenssl                 21.0.0             pyhd8ed1ab_0    conda-forge
    pyparsing                 3.0.7              pyhd8ed1ab_0    conda-forge
    pysocks                   1.7.1            py38h578d9bd_4    conda-forge
    python                    3.8.12               h12debd9_0
    python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
    python_abi                3.8                      2_cp38    conda-forge
    pytorch                   1.8.2           py3.8_cuda11.1_cudnn8.0.5_0    pytorch-lts
    pytorch-scatter           2.0.8           py38_torch_1.8.0_cu111    pyg
    pytz                      2021.3             pyhd8ed1ab_0    conda-forge
    pyyaml                    5.4.1            py38h497a2fe_0    conda-forge
    rdkit                     2020.09.5        py38h2bca085_0    conda-forge
    readline                  8.1.2                h7f8727e_1
    reportlab                 3.5.68           py38hadf75a6_0    conda-forge
    requests                  2.27.1             pyhd8ed1ab_0    conda-forge
    scikit-learn              1.0.2            py38h51133e4_1
    scipy                     1.7.3            py38hc147768_0
    setuptools                58.0.4           py38h06a4308_0
    six                       1.16.0             pyh6c4a22f_0    conda-forge
    sqlalchemy                1.3.23           py38h497a2fe_0    conda-forge
    sqlite                    3.37.0               hc218d9a_0
    threadpoolctl             3.0.0              pyh8a188c0_0    conda-forge
    tk                        8.6.11               h1ccaba5_0
    torchdrug                 0.1.2                  ha710097    milagraph
    tornado                   6.1              py38h497a2fe_1    conda-forge
    tqdm                      4.62.3             pyhd8ed1ab_0    conda-forge
    typing_extensions         4.0.1              pyha770c72_0    conda-forge
    urllib3                   1.26.8             pyhd8ed1ab_1    conda-forge
    wheel                     0.37.1             pyhd3eb1b0_0
    xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
    xorg-libice               1.0.10               h7f98852_0    conda-forge
    xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
    xorg-libx11               1.7.2                h7f98852_0    conda-forge
    xorg-libxau               1.0.9                h7f98852_0    conda-forge
    xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
    xorg-libxext              1.3.4                h7f98852_1    conda-forge
    xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
    xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
    xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
    xorg-xproto               7.0.31            h7f98852_1007    conda-forge
    xz                        5.2.5                h7b6447c_0
    yaml                      0.2.5                h516909a_0    conda-forge
    zlib                      1.2.11               h7f8727e_4
    zstd                      1.4.9                ha95c52a_0    conda-forge
    

    Any ideas how to get this to run?

    Many thanks!

    opened by sbonner0 5
  • Error when loading pretrained epoch

    Hi,

    I tried a new model on NBFNet, but I cannot load it. The problem seems to come from torchdrug/patch.py. I wonder if you have a good solution for this:

    Traceback (most recent call last):
      File "script/run.py", line 60, in <module>
        solver = util.build_solver(cfg, dataset)
      File "/shared-datadrive/shared-training/NBFNet/nbfnet/util.py", line 120, in build_solver
        solver.load(cfg.checkpoint)
      File "/home/azureuser/.pyenv/versions/nbfnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/core/engine.py", line 231, in load
        self.model.load_state_dict(state["model"])
      File "/home/azureuser/.pyenv/versions/nbfnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for KnowledgeGraphCompletion:
        While copying the parameter named "graph", expected torch.Tensor or Tensor-like object from checkpoint but received <class 'torchdrug.data.graph.Graph'>
        While copying the parameter named "fact_graph", expected torch.Tensor or Tensor-like object from checkpoint but received <class 'torchdrug.data.graph.Graph'>

    I also checked that nn.Module is overwritten by PatchedModule at the call self.model.load_state_dict(state["model"]):

    (Pdb) nn.Module
    <class 'torchdrug.patch.PatchedModule'>

    opened by jwzhi 2
  • Hits@10 of RotatE is higher than original paper.

    Hi,

    I found that the Hits@10 of RotatE on FB15k-237 (0.553) is higher than in the original paper (0.533), while the other metrics are the same.

    Is this a recording error, or did you improve the performance of RotatE?

    opened by quqxui 1
  • Unable to run the code with ImportError in cpp_extension

    Hi! I followed the instructions to install the packages, but I now get an ImportError when reproducing the results. The error is as follows. I also tried rm -r ~/.cache/torch_extensions/* as suggested in the README, but that causes more errors.

    Traceback (most recent call last):
      File "script/run.py", line 69, in <module>
        train_and_validate(cfg, solver)
      File "script/run.py", line 28, in train_and_validate
        solver.evaluate("test")
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/core/engine.py", line 206, in evaluate
        pred, target = model.predict_and_target(batch)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/tasks/task.py", line 27, in predict_and_target
        return self.predict(batch, all_loss, metric), self.target(batch)
      File "/home/lja/git_clone/NBFNet/nbfnet/task.py", line 277, in predict
        t_pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/lja/git_clone/NBFNet/nbfnet/model.py", line 149, in forward
        output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0])
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/decorator.py", line 232, in fun
        return caller(func, *(extras + args), **kw)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/utils/decorator.py", line 88, in wrapper
        result = forward(self, *args, **kwargs)
      File "/home/lja/git_clone/NBFNet/nbfnet/model.py", line 115, in bellmanford
        hidden = layer(step_graph, layer_input)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/layers/conv.py", line 91, in forward
        update = self.message_and_aggregate(graph, input)
      File "/home/lja/git_clone/NBFNet/nbfnet/layer.py", line 140, in message_and_aggregate
        sum = functional.generalized_rspmm(adjacency, relation_input, input, sum="add", mul=mul)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/layers/functional/spmm.py", line 378, in generalized_rspmm
        return Function.apply(sparse.coalesce(), relation, input)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/layers/functional/spmm.py", line 172, in forward
        forward = spmm.rspmm_add_mul_forward_cuda
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/utils/torch.py", line 27, in __getattr__
        return getattr(self.module, key)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/utils/decorator.py", line 21, in __get__
        result = self.func(obj)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/utils/torch.py", line 31, in module
        return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags,
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1144, in load
        return _jit_compile(
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1382, in _jit_compile
        return _import_module_from_library(name, build_directory, is_python_module)
      File "/home/lja/anaconda3/envs/NBFnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1776, in _import_module_from_library
        module = importlib.util.module_from_spec(spec)
      File "<frozen importlib._bootstrap>", line 556, in module_from_spec
      File "<frozen importlib._bootstrap_external>", line 1166, in create_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
    ImportError: /home/lja/.cache/torch_extensions/spmm_0/spmm.so: cannot open shared object file: No such file or directory

    I'm using torch 1.11 + CUDA 11.3 and torchdrug 0.1.2.

    Do you know how to deal with this? Any help is appreciated! By the way, in other issues I noticed that an environment.yml would be released. Where can I find it? Thanks!

    opened by JiaangL 1
  • [Question] Proportion of training triples used

    Hello!

    First of all thanks so much for this awesome publication & codebase.

    I'm in the process of tweaking a config for training NBFNet, and trying to understand the proportion of the training triples used when training on ogbl-biokg using the provided config config/knowledge_graph/ogbl-biokg.yaml.

    Since batch_size: 8, batch_per_epoch: 200 and num_epoch: 10, and the number of training triples in ogbl-biokg is 4,762,678, is it correct to assume that only (8 * 200 * 10)/4,762,678 = 0.00336... ≈ 0.34% of the training triples are used for the entire training run?
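The arithmetic in this question can be checked with a few lines of Python. This is a minimal sketch that uses only the numbers quoted above and assumes a single training process. If batch_size is applied per process, the count would scale with the number of GPUs, which the sketch does not model.

```python
# Values quoted in the question (config/knowledge_graph/ogbl-biokg.yaml).
batch_size = 8
batch_per_epoch = 200
num_epoch = 10
num_train_triples = 4_762_678

samples_seen = batch_size * batch_per_epoch * num_epoch
fraction = samples_seen / num_train_triples

print(samples_seen)       # 16000
print(f"{fraction:.5f}")  # 0.00336
print(f"{fraction:.2%}")  # 0.34%
```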

    It seems very small and I'm most likely missing some vital implementation details - I'd appreciate your help.

    Thanks so much!

    opened by MisaOgura 1
  • Problem about wikikg90m

    Hello, thank you for your wonderful work. Could you provide the code for running NBFNet on wikikg90m? How can I reproduce this result? I would appreciate your help.

    opened by QingFei1 0
  • Issues of the experiment settings of inductive link prediction

    First of all, thanks for the awesome code!

    The authors state that they follow the experiment settings of GraIL, which draws 50 negative triplets for each positive triplet and uses the filtered ranking. However, I cannot find the corresponding step that draws 50 negative samples in the code. Could the authors please clarify?

    opened by smart-lty 0
  • Train on new datasets.

    Hi, thank you for your wonderful work and open-source code.

    How can I train on other KGs, such as FB15K, for the KG completion task?

    Thanks very much.

    opened by quqxui 0
  • Seems unable to utilize multiple GPUs

    Hi there.

    I tried running this code on one of my machines with four RTX 3090 GPUs (24GB of GPU memory each):

    python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/wn18rr.yaml --gpus [0,1,2,3]
    

    I did not change any other part of this repo. However, I encountered a CUDA error saying that more GPU memory is needed. I then changed the command as follows:

    python script/run.py -c config/inductive/wn18rr.yaml --gpus [0]
    

    and ran it on a machine with one A100 GPU (40GB of GPU memory). The code ran successfully and used roughly 32GB of GPU memory. I am puzzled by this: why does the code not make use of the total 24GB*4=96GB of GPU memory and instead report a memory issue? Is there something wrong with my setup?

    opened by jerermyyoung 1
  • Problems about ninja

    Hi, Doctor. I ran into some problems when running the code on Linux, and I would really appreciate your help.

    15:43:32   Preprocess training set
    15:43:36   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    15:43:36   Epoch 0 begin
    Traceback (most recent call last):
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1666, in _run_ninja_build
        subprocess.run(
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/subprocess.py", line 516, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "script/run.py", line 62, in <module>
        train_and_validate(cfg, solver)
      File "script/run.py", line 27, in train_and_validate
        solver.train(**kwargs)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/core/engine.py", line 143, in train
        loss, metric = model(batch)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/tasks/reasoning.py", line 85, in forward
        pred = self.predict(batch, all_loss, metric)
      File "/data1/home/wza/nbfnet/nbfnet/task.py", line 288, in predict
        pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data1/home/wza/nbfnet/nbfnet/model.py", line 149, in forward
        output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0])
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/decorator.py", line 232, in fun
        return caller(func, *(extras + args), **kw)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 56, in wrapper
        return forward(self, *args, **kwargs)
      File "/data1/home/wza/nbfnet/nbfnet/model.py", line 115, in bellmanford
        hidden = layer(step_graph, layer_input)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/conv.py", line 91, in forward
        update = self.message_and_aggregate(graph, input)
      File "/data1/home/wza/nbfnet/nbfnet/layer.py", line 140, in message_and_aggregate
        sum = functional.generalized_rspmm(adjacency, relation_input, input, sum="add", mul=mul)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/spmm.py", line 378, in generalized_rspmm
        return Function.apply(sparse.coalesce(), relation, input)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/spmm.py", line 172, in forward
        forward = spmm.rspmm_add_mul_forward_cuda
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 27, in __getattr__
        return getattr(self.module, key)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__
        result = self.func(obj)
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 31, in module
        return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags,
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1080, in load
        return _jit_compile(
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1293, in _jit_compile
        _write_ninja_file_and_build_library(
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1405, in _write_ninja_file_and_build_library
        _run_ninja_build(
      File "/data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1682, in _run_ninja_build
        raise RuntimeError(message) from e
    RuntimeError: Error building extension 'spmm': [1/3] /usr/local/cuda-10.2/bin/nvcc  -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu -o rspmm.cuda.o
    FAILED: rspmm.cuda.o
    /usr/local/cuda-10.2/bin/nvcc  -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu -o rspmm.cuda.o
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu: In instantiation of ‘at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&)::<lambda()>::<lambda()> [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’:
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:600:   required from ‘struct at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&)::<lambda()> [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]::<lambda()>’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:608:   required from ‘at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&)::<lambda()> [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:607:   required from ‘struct at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]::<lambda()>’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:246:28:   required from ‘at::Tensor at::rspmm_forward_cuda(const SparseTensor&, const at::Tensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:356:193:   required from here
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/rspmm.cu:244:37: internal compiler error: in tsubst_copy, at cp/pt.c:13189
         const int num_row_block = (num_row + row_per_block - 1) / row_per_block;
                                         ^
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
    [2/3] /usr/local/cuda-10.2/bin/nvcc  -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu -o spmm.cuda.o
    FAILED: spmm.cuda.o
    /usr/local/cuda-10.2/bin/nvcc  -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/TH -isystem /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.2/include -isystem /data1/home/wza/.conda/envs/linkp/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++14 -c /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu -o spmm.cuda.o
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu: In instantiation of ‘at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&)::<lambda()>::<lambda()> [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’:
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:506:   required from ‘struct at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&)::<lambda()> [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]::<lambda()>’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:514:   required from ‘at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&)::<lambda()> [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul]’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:512:   required from ‘struct at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]::<lambda()>’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:219:28:   required from ‘at::Tensor at::spmm_forward_cuda(const SparseTensor&, const at::Tensor&) [with NaryOp = at::NaryAdd; BinaryOp = at::BinaryMul; at::sparse::SparseTensor = at::Tensor]’
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:315:157:   required from here
    /data1/home/wza/.conda/envs/linkp/lib/python3.8/site-packages/torchdrug/layers/functional/extension/spmm.cu:217:37: internal compiler error: in tsubst_copy, at cp/pt.c:13189
         const int num_row_block = (num_row + row_per_block - 1) / row_per_block;
                                         ^
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
    ninja: build stopped: subcommand failed.
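
In a log this long the decisive lines are easy to miss: each `FAILED:` line names the object file that did not build, and each `internal compiler error` line gives the source location where the host compiler crashed. Below is a minimal sketch for pulling those lines out of a saved build log. The helper `summarize_build_log` and its regular expressions are hypothetical, written only against the log format shown above; they are not part of PyTorch or TorchDrug.

```python
import re

# Patterns are assumptions based on the ninja/GCC output format shown in the log.
FAILED_RE = re.compile(r"^\s*FAILED:\s*(\S+)")
ICE_RE = re.compile(r"^\s*(\S+?):(\d+):(\d+): internal compiler error: (.*)$")


def summarize_build_log(text):
    """Collect failed ninja targets and internal compiler error locations."""
    failed_targets = []
    internal_compiler_errors = []
    for line in text.splitlines():
        match = FAILED_RE.match(line)
        if match:
            failed_targets.append(match.group(1))
            continue
        match = ICE_RE.match(line)
        if match:
            internal_compiler_errors.append({
                "file": match.group(1),
                "line": int(match.group(2)),
                "column": int(match.group(3)),
                "message": match.group(4),
            })
    return {
        "failed_targets": failed_targets,
        "internal_compiler_errors": internal_compiler_errors,
    }
```

Applied to the log above, this would list `rspmm.cuda.o` and `spmm.cuda.o` as failed targets, each with one internal compiler error in `tsubst_copy`. The `gcc-5` path in the bug-report hint suggests the crash is in the host compiler, not in the CUDA sources.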
    
    opened by Robot-2020 4
  • Unable to run the code with error importing 'spmm'

    Hi, I followed the instructions to reproduce the results but ran into a problem with the 'spmm' module. My torch version is 1.8.2 and torchdrug is 0.1.2. Any ideas on how to fix it?

    12:53:15 Epoch 0 begin
    Traceback (most recent call last):
      File "script/run.py", line 78, in <module>
      File "script/run.py", line 30, in train_and_validate
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\core\engine.py", line 143, in train
        loss, metric = model(batch)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\tasks\reasoning.py", line 85, in forward
        pred = self.predict(batch, all_loss, metric)
      File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\task.py", line 288, in predict
        pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\model.py", line 149, in forward
        output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0])
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\decorator.py", line 232, in fun
        return caller(func, *(extras + args), **kw)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\decorator.py", line 56, in wrapper
        return forward(self, *args, **kwargs)
      File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\model.py", line 115, in bellmanford
        hidden = layer(step_graph, layer_input)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\layers\conv.py", line 91, in forward
        update = self.message_and_aggregate(graph, input)
      File "C:\Users\Pengfei\Documents\cse research\NBFNet-master\nbfnet\layer.py", line 140, in message_and_aggregate
        sum = functional.generalized_rspmm(adjacency, relation_input, input, sum="add", mul=mul)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\layers\functional\spmm.py", line 378, in generalized_rspmm
        return Function.apply(sparse.coalesce(), relation, input)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\layers\functional\spmm.py", line 172, in forward
        forward = spmm.rspmm_add_mul_forward_cuda
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\torch.py", line 27, in __getattr__
        return getattr(self.module, key)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\decorator.py", line 21, in __get__
        result = self.func(obj)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torchdrug\utils\torch.py", line 31, in module
        return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags,
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\utils\cpp_extension.py", line 1079, in load
        return _jit_compile(
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\utils\cpp_extension.py", line 1317, in _jit_compile
        return _import_module_from_library(name, build_directory, is_python_module)
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\site-packages\torch\utils\cpp_extension.py", line 1700, in _import_module_from_library
        file, path, description = imp.find_module(module_name, [path])
      File "C:\Users\Pengfei\anaconda3\envs\py38\lib\imp.py", line 296, in find_module
        raise ImportError(_ERR_MSG.format(name), name=name)
    ImportError: No module named 'spmm'

    opened by BillBote 1
  • Unable to run the code with error regarding 'mpiicpc'

    Hello,

    I followed the instructions to install the torchdrug-related packages and the matching PyTorch/CUDA version. However, I got the following error when running the code. Any ideas on how to fix this? The system has intel/19.0.3.199 loaded.

    01:24:15   Epoch 0 begin
    Traceback (most recent call last):
      File "script/run.py", line 62, in <module>
        train_and_validate(cfg, solver)
      File "script/run.py", line 27, in train_and_validate
        solver.train(**kwargs)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/core/engine.py", line 143, in train
        loss, metric = model(batch)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/tasks/reasoning.py", line 85, in forward
        pred = self.predict(batch, all_loss, metric)
      File "~/Workspace/Python/NBFNet/nbfnet/task.py", line 288, in predict
        pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "~/Workspace/Python/NBFNet/nbfnet/model.py", line 149, in forward
        output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0])
      File "<decorator-gen-888>", line 2, in bellmanford
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 56, in wrapper
        return forward(self, *args, **kwargs)
      File "~/Workspace/Python/NBFNet/nbfnet/model.py", line 115, in bellmanford
        hidden = layer(step_graph, layer_input)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/layers/conv.py", line 91, in forward
        update = self.message_and_aggregate(graph, input)
      File "~/Workspace/Python/NBFNet/nbfnet/layer.py", line 124, in message_and_aggregate
        adjacency = graph.adjacency.transpose(0, 1)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__
        result = self.func(obj)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/data/graph.py", line 658, in adjacency
        return utils.sparse_coo_tensor(self.edge_list.t(), self.edge_weight, self.shape)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 182, in sparse_coo_tensor
        return torch_ext.sparse_coo_tensor_unsafe(indices, values, size)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 27, in __getattr__
        return getattr(self.module, key)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__
        result = self.func(obj)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 31, in module
        return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags,
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1079, in load
        return _jit_compile(
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1292, in _jit_compile
        _write_ninja_file_and_build_library(
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1378, in _write_ninja_file_and_build_library
        check_compiler_abi_compatibility(compiler)
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 282, in check_compiler_abi_compatibility
        if not check_compiler_ok_for_platform(compiler):
      File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 249, in check_compiler_ok_for_platform
        version_string = subprocess.check_output([compiler, '-v'], stderr=subprocess.STDOUT).decode()
      File "~/anaconda3/envs/dlg_env/lib/python3.8/subprocess.py", line 415, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "~/anaconda3/envs/dlg_env/lib/python3.8/subprocess.py", line 516, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['icpc', '-v']' returned non-zero exit status 1.
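
The last frames of this traceback show the JIT extension builder running `icpc -v` and failing because the command exited with a non-zero status. Below is a minimal sketch for reproducing that check outside of training. It assumes the compiler name comes from the `CXX` environment variable, with `c++` as a fallback; both are assumptions here, not confirmed by the traceback. `resolve_compiler` and `probe_compiler` are hypothetical helper names.

```python
import os
import shutil
import subprocess


def resolve_compiler(env=None, default="c++"):
    """Return the compiler named by CXX, or the fallback (assumed default)."""
    env = os.environ if env is None else env
    return env.get("CXX", default)


def probe_compiler(compiler, flag="-v"):
    """Run `<compiler> <flag>` and report whether it exists and exits cleanly."""
    path = shutil.which(compiler)
    if path is None:
        # The compiler is not on PATH at all.
        return {"compiler": compiler, "found": False,
                "returncode": None, "output": ""}
    proc = subprocess.run([path, flag], stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)
    return {"compiler": compiler, "found": True,
            "returncode": proc.returncode,
            "output": proc.stdout.decode(errors="replace")}


if __name__ == "__main__":
    report = probe_compiler(resolve_compiler())
    print(report["compiler"], "found:", report["found"],
          "exit status:", report["returncode"])
    print(report["output"])
```

If the probe reports a non-zero exit status for `icpc`, the captured output may show why the Intel compiler refuses to start. Pointing `CXX` at a different host compiler before launching `script/run.py` is one thing to try, though whether that resolves this issue is not confirmed here.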
    
    opened by VeritasYin 1
Owner
MilaGraph
Research group led by Prof. Jian Tang at Mila-Quebec AI Institute (https://mila.quebec/) focusing on graph representation learning and graph neural networks.