Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs


Continuous Query Decomposition

This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neural Link Predictors:

@inproceedings{
    arakelyan2021complex,
    title={Complex Query Answering with Neural Link Predictors},
    author={Erik Arakelyan and Daniel Daza and Pasquale Minervini and Michael Cochez},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=Mos9F9kDwkz}
}

In this work we present CQD, a method that reuses a pretrained neural link predictor to answer complex queries by scoring the query atoms independently and aggregating the scores via t-norms and t-conorms.
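
For intuition, here is a minimal sketch of the product and Gödel aggregators (an illustration, not the repository's API; atom scores are assumed to be normalised to [0, 1]):

import torch

def tnorm(a, b, kind='prod'):
    # conjunction of two atom scores: product or Gödel (min) t-norm
    return a * b if kind == 'prod' else torch.minimum(a, b)

def tconorm(a, b, kind='prod'):
    # disjunction of two atom scores: probabilistic sum or Gödel (max) t-conorm
    return a + b - a * b if kind == 'prod' else torch.maximum(a, b)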

Our code is based on an implementation of ComplEx-N3 available here.

Please follow the instructions below to reproduce the results of our experiments.

1. Install the requirements

We recommend creating a new environment:

% conda create --name cqd python=3.8 && conda activate cqd
% pip install -r requirements.txt

2. Download the data

We use three knowledge graphs: FB15k, FB15k-237, and NELL. From the root of the repository, download and extract the files to obtain the data folder, which contains the sets of triples and queries for each graph.

% wget http://data.neuralnoise.com/cqd-data.tgz
% tar xvf cqd-data.tgz

3. Download the models

Next, you need one neural link prediction model per dataset. Our pre-trained neural link prediction models are available for download:

% wget http://data.neuralnoise.com/cqd-models.tgz
% tar xvf cqd-models.tgz

3. Alternative -- Train your own models

To obtain entity and relation embeddings, we use ComplEx. Use the following commands to train the embeddings for each dataset.

FB15k

% python -m kbc.learn data/FB15k --rank 1000 --reg 0.01 --max_epochs 100  --batch_size 100

FB15k-237

% python -m kbc.learn data/FB15k-237 --rank 1000 --reg 0.05 --max_epochs 100  --batch_size 1000

NELL

% python -m kbc.learn data/NELL --rank 1000 --reg 0.05 --max_epochs 100  --batch_size 1000

Once training is done, the models will be saved in the models directory.
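
For reference, these models score a triple with the ComplEx function, Re(<e_s, w_r, conj(e_o)>). A self-contained sketch, assuming the convention (visible in the embedding shapes printed in the example output below) that the real and imaginary halves are concatenated into a single 2×rank vector:

import torch

def complex_score(lhs, rel, rhs):
    # ComplEx: Re(<lhs, rel, conj(rhs)>), with each 2d-dim vector storing
    # the real half first and the imaginary half second
    d = lhs.shape[-1] // 2
    lhs_re, lhs_im = lhs[..., :d], lhs[..., d:]
    rel_re, rel_im = rel[..., :d], rel[..., d:]
    rhs_re, rhs_im = rhs[..., :d], rhs[..., d:]
    return ((lhs_re * rel_re - lhs_im * rel_im) * rhs_re
            + (lhs_re * rel_im + lhs_im * rel_re) * rhs_im).sum(-1)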

4. Answering queries with CQD

CQD can answer complex queries via continuous (CQD-CO) or combinatorial optimisation (CQD-Beam).

CQD-Beam

Use the kbc.cqd_beam script to answer queries, providing the path to the dataset and the link predictor saved in the previous step. For example:

% python -m kbc.cqd_beam --model_path models/[model_filename].pt

Example:

% PYTHONPATH=. python3 kbc/cqd_beam.py \
  --model_path models/FB15k-model-rank-1000-epoch-100-*.pt \
  --dataset FB15K --mode test --t_norm product --candidates 64 \
  --scores_normalize 0 data/FB15k

models/FB15k-model-rank-1000-epoch-100-1602520745.pt FB15k product 64
ComplEx(
  (embeddings): ModuleList(
    (0): Embedding(14951, 2000, sparse=True)
    (1): Embedding(2690, 2000, sparse=True)
  )
)

[..]
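
In sketch form, CQD-Beam answers a two-hop query r1(x, Z) ∧ r2(Z, Y) by keeping the top-k substitutions for the intermediate variable Z (the --candidates flag above) and combining atom scores with the chosen t-norm. A minimal illustration, where score_fn is a hypothetical wrapper returning link prediction scores over all tail entities:

import torch

def beam_answer_2p(score_fn, x, r1, r2, k=64):
    # hop 1: score every candidate substitution of Z, keep the k best
    z_scores = score_fn(x, r1)
    top_z_scores, top_z = z_scores.topk(k)
    # hop 2: score every answer Y for each surviving Z
    y_scores = torch.stack([score_fn(z.item(), r2) for z in top_z])
    # product t-norm per (Z, Y) pair, then the best Z for every answer Y
    combined = top_z_scores.unsqueeze(1) * y_scores
    return combined.max(dim=0).values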

This will save a series of JSON files with the results, e.g.

% cat "topk_d=FB15k_t=product_e=2_2_rank=1000_k=64_sn=0.json"
{
  "MRRm_new": 0.7542805715523118,
  "MRm_new": 50.71081983144581,
  "HITS@1m_new": 0.6896709378392843,
  "HITS@3m_new": 0.7955001359095913,
  "HITS@10m_new": 0.8676865172456019
}
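
These files can also be aggregated programmatically, for instance (the glob pattern is illustrative; the metric keys match the sample above):

import glob
import json

# print the mean reciprocal rank reported in each result file
for path in sorted(glob.glob('topk_*.json')):
    with open(path) as f:
        metrics = json.load(f)
    print(path, metrics['MRRm_new'])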

CQD-CO

Use the kbc.cqd_co script to answer queries, providing the path to the dataset and the link predictor saved in the previous step. For example:

% python -m kbc.cqd_co data/FB15k --model_path models/[model_filename].pt --chain_type 1_2
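
In contrast to the beam variant, CQD-CO treats the embeddings of the query variables as free parameters and optimises them by gradient descent. A rough sketch of the idea for a two-hop query, where score_emb is a hypothetical function scoring an (lhs, rel, rhs) triple of embeddings (e.g. the ComplEx score above, squashed into [0, 1]):

import torch

def cqd_co_2p(score_emb, x_emb, r1_emb, r2_emb, dim, steps=1000, lr=0.1):
    # answer ?Y : r1(x, Z) AND r2(Z, Y) by optimising embeddings for Z and Y
    z = torch.randn(dim, requires_grad=True)
    y = torch.randn(dim, requires_grad=True)
    opt = torch.optim.Adam([z, y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # maximise the product t-norm of the two atom scores
        loss = -(score_emb(x_emb, r1_emb, z) * score_emb(z, r2_emb, y))
        loss.backward()
        opt.step()
    # entities whose embeddings are closest to y are returned as answers
    return y.detach()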

Final Results

All results from the paper can be produced as follows:

% cd results/topk
% ../topk-parse.py *.json | grep rank=1000
d=FB15K rank=1000 & 0.779 & 0.584 & 0.796 & 0.837 & 0.377 & 0.658 & 0.839 & 0.355
d=FB237 rank=1000 & 0.279 & 0.219 & 0.352 & 0.457 & 0.129 & 0.249 & 0.284 & 0.128
d=NELL rank=1000 & 0.343 & 0.297 & 0.410 & 0.529 & 0.168 & 0.283 & 0.536 & 0.157
% cd ../cont
% ../cont-parse.py *.json | grep rank=1000
d=FB15k rank=1000 & 0.454 & 0.191 & 0.796 & 0.837 & 0.336 & 0.513 & 0.816 & 0.319
d=FB15k-237 rank=1000 & 0.213 & 0.131 & 0.352 & 0.457 & 0.146 & 0.222 & 0.281 & 0.132
d=NELL rank=1000 & 0.265 & 0.220 & 0.410 & 0.529 & 0.196 & 0.302 & 0.531 & 0.194
Comments
  • Answering queries with CQD Error

    When I had trained the model, I followed the steps to answer queries, but it reported that entity2text.text is missing. Could you please give some suggestions to solve the problem? thx!😊


    opened by Joyrocky 17
  • 2u/up queries reproduction with CQD @ KGReasoning

    Since there is no issue board at https://github.com/pminervini/KGReasoning I thought I could write it here and tag @pminervini 😃

    I'm trying to run CQD CO and Beam on the BetaE version of FB15k-237 and NELL-995 datasets using that repo, but for some reason, the numbers for union queries are very low.

    After downloading the pre-trained models (fb15k-237-betae and nell-betae, respectively), I'm using the following commands:

    python main.py -cuda --do_test --data_path FB15k-237-betae --cpu_num 1 --geo cqd --tasks "1p.2p.3p.2i.3i.ip.pi.2u.up" --checkpoint_path models/fb15k-237-betae -d 1000
    
    python main.py -cuda --do_test --data_path NELL-betae --cpu_num 1 --geo cqd --tasks "1p.2p.3p.2i.3i.ip.pi.2u.up" --checkpoint_path models/nell-betae -d 1000
    

    Other hyperparameters are left at their defaults (there is no info on when to use --cqd-sigmoid-scores or --cqd-normalize-scores, so I presume they should be turned off).

    The numbers for 2u/up FB15k-237:

    Test 2u-DNF MRR at step 99999: 0.005257
    Test 2u-DNF HITS1 at step 99999: 0.001895
    Test 2u-DNF HITS3 at step 99999: 0.004898
    Test 2u-DNF HITS10 at step 99999: 0.010378
    Test 2u-DNF num_queries at step 99999: 5000.000000
    Test up-DNF MRR at step 99999: 0.016857
    Test up-DNF HITS1 at step 99999: 0.005590
    Test up-DNF HITS3 at step 99999: 0.014344
    Test up-DNF HITS10 at step 99999: 0.033338
    Test up-DNF num_queries at step 99999: 5000.000000
    

    And for NELL:

    Test 2u-DNF MRR at step 99999: 0.007676
    Test 2u-DNF HITS1 at step 99999: 0.004144
    Test 2u-DNF HITS3 at step 99999: 0.006924
    Test 2u-DNF HITS10 at step 99999: 0.014262
    Test 2u-DNF num_queries at step 99999: 4000.000000
    Test up-DNF MRR at step 99999: 0.023296
    Test up-DNF HITS1 at step 99999: 0.010295
    Test up-DNF HITS3 at step 99999: 0.022723
    Test up-DNF HITS10 at step 99999: 0.045247
    Test up-DNF num_queries at step 99999: 4000.000000
    

    Is there anything missing, or are those the expected numbers for the BetaE datasets?

    P.S. Would be good to have an example of how to properly run CQD with KGReasoning in the example.sh :)

    opened by migalkin 8
  • Typo in model download instructions

    Hi! I'm trying to reproduce the results following the README, and I found an error in the instructions. A command for downloading and decompressing the models is provided, but the decompressing command that follows refers to cqd-data rather than to the models file. I've found that replacing the second line with

    % tar xvf cqd-models.tgz
    

    solves this issue :)

    opened by eamadord 2
  • The procedure of the t-norms and neural link prediction

    I'm sorry to say that I couldn't really understand the procedure of the t-norms and neural link prediction while studying the code of this module 😔. Could you give some pseudocode, math formulas, or illustrations for this module? thx!😘 Looking forward to your reply 😊

    opened by Joyrocky 1
Owner
UCL Natural Language Processing