Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs


Continuous Query Decomposition

This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neural Link Predictors:

@inproceedings{
    arakelyan2021complex,
    title={Complex Query Answering with Neural Link Predictors},
    author={Erik Arakelyan and Daniel Daza and Pasquale Minervini and Michael Cochez},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=Mos9F9kDwkz}
}

In this work we present CQD, a method that reuses a pretrained neural link predictor to answer complex queries by scoring the query atoms independently and aggregating the scores via t-norms and t-conorms.
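
For intuition, here is a minimal sketch of the product and Gödel aggregators (an illustration, not the repository's API; atom scores are assumed to be normalised to [0, 1]):

import torch

def tnorm(a, b, kind='prod'):
    # conjunction of two atom scores: product or Gödel (min) t-norm
    return a * b if kind == 'prod' else torch.minimum(a, b)

def tconorm(a, b, kind='prod'):
    # disjunction of two atom scores: probabilistic sum or Gödel (max) t-conorm
    return a + b - a * b if kind == 'prod' else torch.maximum(a, b)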

Our code is based on an implementation of ComplEx-N3 available here.

Please follow the instructions below to reproduce the results of our experiments.

1. Install the requirements

We recommend creating a new environment:

% conda create --name cqd python=3.8 && conda activate cqd
% pip install -r requirements.txt

2. Download the data

We use three knowledge graphs: FB15k, FB15k-237, and NELL. From the root of the repository, download and extract the files to obtain the data folder, which contains the sets of triples and queries for each graph.

% wget http://data.neuralnoise.com/cqd-data.tgz
% tar xvf cqd-data.tgz

3. Download the models

Next, you need one neural link prediction model per dataset. Our pre-trained neural link prediction models are available for download:

% wget http://data.neuralnoise.com/cqd-models.tgz
% tar xvf cqd-models.tgz

3. Alternative -- Train your own models

To obtain entity and relation embeddings, we use ComplEx. Use the following commands to train the embeddings for each dataset.

FB15k

% python -m kbc.learn data/FB15k --rank 1000 --reg 0.01 --max_epochs 100  --batch_size 100

FB15k-237

% python -m kbc.learn data/FB15k-237 --rank 1000 --reg 0.05 --max_epochs 100  --batch_size 1000

NELL

% python -m kbc.learn data/NELL --rank 1000 --reg 0.05 --max_epochs 100  --batch_size 1000

Once training is done, the models will be saved in the models directory.
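
For reference, these models score a triple with the ComplEx function, Re(<e_s, w_r, conj(e_o)>). A self-contained sketch, assuming the convention (visible in the embedding shapes printed in the example output below) that the real and imaginary halves are concatenated into a single 2×rank vector:

import torch

def complex_score(lhs, rel, rhs):
    # ComplEx: Re(<lhs, rel, conj(rhs)>), with each 2d-dim vector storing
    # the real half first and the imaginary half second
    d = lhs.shape[-1] // 2
    lhs_re, lhs_im = lhs[..., :d], lhs[..., d:]
    rel_re, rel_im = rel[..., :d], rel[..., d:]
    rhs_re, rhs_im = rhs[..., :d], rhs[..., d:]
    return ((lhs_re * rel_re - lhs_im * rel_im) * rhs_re
            + (lhs_re * rel_im + lhs_im * rel_re) * rhs_im).sum(-1)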

4. Answering queries with CQD

CQD can answer complex queries via continuous (CQD-CO) or combinatorial optimisation (CQD-Beam).

CQD-Beam

Use the kbc.cqd_beam script to answer queries, providing the path to the dataset and the link predictor saved in the previous step. For example:

% python -m kbc.cqd_beam --model_path models/[model_filename].pt

Example:

% PYTHONPATH=. python3 kbc/cqd_beam.py \
  --model_path models/FB15k-model-rank-1000-epoch-100-*.pt \
  --dataset FB15K --mode test --t_norm product --candidates 64 \
  --scores_normalize 0 data/FB15k

models/FB15k-model-rank-1000-epoch-100-1602520745.pt FB15k product 64
ComplEx(
  (embeddings): ModuleList(
    (0): Embedding(14951, 2000, sparse=True)
    (1): Embedding(2690, 2000, sparse=True)
  )
)

[..]
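
In sketch form, CQD-Beam answers a two-hop query r1(x, Z) ∧ r2(Z, Y) by keeping the top-k substitutions for the intermediate variable Z (the --candidates flag above) and combining atom scores with the chosen t-norm. A minimal illustration, where score_fn is a hypothetical wrapper returning link prediction scores over all tail entities:

import torch

def beam_answer_2p(score_fn, x, r1, r2, k=64):
    # hop 1: score every candidate substitution of Z, keep the k best
    z_scores = score_fn(x, r1)
    top_z_scores, top_z = z_scores.topk(k)
    # hop 2: score every answer Y for each surviving Z
    y_scores = torch.stack([score_fn(z.item(), r2) for z in top_z])
    # product t-norm per (Z, Y) pair, then the best Z for every answer Y
    combined = top_z_scores.unsqueeze(1) * y_scores
    return combined.max(dim=0).values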

This will save a series of JSON files with the results, e.g.

% cat "topk_d=FB15k_t=product_e=2_2_rank=1000_k=64_sn=0.json"
{
  "MRRm_new": 0.7542805715523118,
  "MRm_new": 50.71081983144581,
  "HITS@1m_new": 0.6896709378392843,
  "HITS@3m_new": 0.7955001359095913,
  "HITS@10m_new": 0.8676865172456019
}
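
These files can also be aggregated programmatically, for instance (the glob pattern is illustrative; the metric keys match the sample above):

import glob
import json

# print the mean reciprocal rank reported in each result file
for path in sorted(glob.glob('topk_*.json')):
    with open(path) as f:
        metrics = json.load(f)
    print(path, metrics['MRRm_new'])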

CQD-CO

Use the kbc.cqd_co script to answer queries, providing the path to the dataset and the link predictor saved in the previous step. For example:

% python -m kbc.cqd_co data/FB15k --model_path models/[model_filename].pt --chain_type 1_2
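
In contrast to the beam variant, CQD-CO treats the embeddings of the query variables as free parameters and optimises them by gradient descent. A rough sketch of the idea for a two-hop query, where score_emb is a hypothetical function scoring an (lhs, rel, rhs) triple of embeddings (e.g. the ComplEx score above, squashed into [0, 1]):

import torch

def cqd_co_2p(score_emb, x_emb, r1_emb, r2_emb, dim, steps=1000, lr=0.1):
    # answer ?Y : r1(x, Z) AND r2(Z, Y) by optimising embeddings for Z and Y
    z = torch.randn(dim, requires_grad=True)
    y = torch.randn(dim, requires_grad=True)
    opt = torch.optim.Adam([z, y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # maximise the product t-norm of the two atom scores
        loss = -(score_emb(x_emb, r1_emb, z) * score_emb(z, r2_emb, y))
        loss.backward()
        opt.step()
    # entities whose embeddings are closest to y are returned as answers
    return y.detach()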

Final Results

All results from the paper can be produced as follows:

% cd results/topk
% ../topk-parse.py *.json | grep rank=1000
d=FB15K rank=1000 & 0.779 & 0.584 & 0.796 & 0.837 & 0.377 & 0.658 & 0.839 & 0.355
d=FB237 rank=1000 & 0.279 & 0.219 & 0.352 & 0.457 & 0.129 & 0.249 & 0.284 & 0.128
d=NELL rank=1000 & 0.343 & 0.297 & 0.410 & 0.529 & 0.168 & 0.283 & 0.536 & 0.157
% cd ../cont
% ../cont-parse.py *.json | grep rank=1000
d=FB15k rank=1000 & 0.454 & 0.191 & 0.796 & 0.837 & 0.336 & 0.513 & 0.816 & 0.319
d=FB15k-237 rank=1000 & 0.213 & 0.131 & 0.352 & 0.457 & 0.146 & 0.222 & 0.281 & 0.132
d=NELL rank=1000 & 0.265 & 0.220 & 0.410 & 0.529 & 0.196 & 0.302 & 0.531 & 0.194
Comments
  • Answering queries with CQD Error

    When I had trained the model, I followed the steps to answer queries, but it reported that entity2text.text is missing. Could you please give some suggestions to solve the problem? thx!😊


    opened by Joyrocky 17
  • 2u/up queries reproduction with CQD @ KGReasoning

    Since there is no issue board at https://github.com/pminervini/KGReasoning I thought I could write it here and tag @pminervini 😃

    I'm trying to run CQD CO and Beam on the BetaE version of FB15k-237 and NELL-995 datasets using that repo, but for some reason, the numbers for union queries are very low.

    After downloading the pre-trained models (fb15k-237-betae and nell-betae, respectively), I'm using the following commands:

    python main.py -cuda --do_test --data_path FB15k-237-betae --cpu_num 1 --geo cqd --tasks "1p.2p.3p.2i.3i.ip.pi.2u.up" --checkpoint_path models/fb15k-237-betae -d 1000
    
    python main.py -cuda --do_test --data_path NELL-betae --cpu_num 1 --geo cqd --tasks "1p.2p.3p.2i.3i.ip.pi.2u.up" --checkpoint_path models/nell-betae -d 1000
    

    Other hyperparameters are left at their defaults (there is no info on when to use --cqd-sigmoid-scores or --cqd-normalize-scores, so I presume they should be turned off).

    The numbers for 2u/up FB15k-237:

    Test 2u-DNF MRR at step 99999: 0.005257
    Test 2u-DNF HITS1 at step 99999: 0.001895
    Test 2u-DNF HITS3 at step 99999: 0.004898
    Test 2u-DNF HITS10 at step 99999: 0.010378
    Test 2u-DNF num_queries at step 99999: 5000.000000
    Test up-DNF MRR at step 99999: 0.016857
    Test up-DNF HITS1 at step 99999: 0.005590
    Test up-DNF HITS3 at step 99999: 0.014344
    Test up-DNF HITS10 at step 99999: 0.033338
    Test up-DNF num_queries at step 99999: 5000.000000
    

    And for NELL:

    Test 2u-DNF MRR at step 99999: 0.007676
    Test 2u-DNF HITS1 at step 99999: 0.004144
    Test 2u-DNF HITS3 at step 99999: 0.006924
    Test 2u-DNF HITS10 at step 99999: 0.014262
    Test 2u-DNF num_queries at step 99999: 4000.000000
    Test up-DNF MRR at step 99999: 0.023296
    Test up-DNF HITS1 at step 99999: 0.010295
    Test up-DNF HITS3 at step 99999: 0.022723
    Test up-DNF HITS10 at step 99999: 0.045247
    Test up-DNF num_queries at step 99999: 4000.000000
    

    Is there anything missing, or are those the expected numbers for the BetaE datasets?

    P.S. Would be good to have an example of how to properly run CQD with KGReasoning in the example.sh :)

    opened by migalkin 8
  • Typo in model download instructions

    Hi! I'm trying to reproduce the results following the README, and I found an error in the instructions. A command for downloading and decompressing the models is provided, but the decompressing command that follows refers to cqd-data rather than to the models file. I've found that replacing the second line with

    % tar xvf cqd-models.tgz
    

    solves this issue :)

    opened by eamadord 2
  • The procedure of the t-norms and neural link prediction

    I'm sorry to say that I couldn't really understand the procedure of the t-norms and neural link prediction while studying the code of this module 😔. Could you give some pseudocode, math formulas, or illustrations for this module? thx!😘 Looking forward to your reply 😊

    opened by Joyrocky 1
Owner
UCL Natural Language Processing