TuckER: Tensor Factorization for Knowledge Graph Completion

This codebase contains a PyTorch implementation of the paper:

TuckER: Tensor Factorization for Knowledge Graph Completion. Ivana Balažević, Carl Allen, and Timothy M. Hospedales. Empirical Methods in Natural Language Processing (EMNLP), 2019. [Paper]

TuckER: Tensor Factorization for Knowledge Graph Completion. Ivana Balažević, Carl Allen, and Timothy M. Hospedales. ICML Adaptive & Multitask Learning Workshop, 2019. [Short Paper]
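
TuckER scores each triple (e_s, r, e_o) with a Tucker decomposition of the binary relation tensor: phi(e_s, r, e_o) = W x_1 e_s x_2 w_r x_3 e_o, where W is a learned core tensor and x_n denotes the tensor product along mode n. A minimal sketch of this scoring function (illustrative shapes only, not the repo's exact forward pass, which also applies dropout and batch normalization):

    import torch

    def tucker_score(W, e_s, w_r, E_all):
        """Score a batch of (subject, relation) pairs against all entities.

        W:     learned core tensor, shape (d_e, d_r, d_e)
        e_s:   subject embeddings, shape (batch, d_e)
        w_r:   relation embeddings, shape (batch, d_r)
        E_all: all entity embeddings, shape (n_entities, d_e)
        Returns scores of shape (batch, n_entities).
        """
        W_r = torch.einsum('br,erf->bef', w_r, W)  # contract relation into core
        x = torch.einsum('be,bef->bf', e_s, W_r)   # contract subject
        return x @ E_all.t()                       # 1-N scores over all objects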

Link Prediction Results

Dataset    MRR    Hits@10  Hits@3  Hits@1
FB15k      0.795  0.892    0.833   0.741
WN18       0.953  0.958    0.955   0.949
FB15k-237  0.358  0.544    0.394   0.266
WN18RR     0.470  0.526    0.482   0.443
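
For reference, MRR and Hits@k are computed from the (filtered) rank of the correct entity for each test triple; a minimal sketch of the metrics (illustrative, not the repo's exact evaluation code):

    import numpy as np

    def mrr_and_hits(ranks, k=10):
        # ranks: 1-based filtered ranks of the correct entity per test triple
        ranks = np.asarray(ranks, dtype=float)
        return (1.0 / ranks).mean(), (ranks <= k).mean()

    # e.g. mrr_and_hits([1, 2, 10]) -> (0.533..., 1.0)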

Running a model

To run the model, execute the following command:

 CUDA_VISIBLE_DEVICES=0 python main.py --dataset FB15k-237 --num_iterations 500 --batch_size 128 \
                                       --lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.3 \
                                       --hidden_dropout1 0.4 --hidden_dropout2 0.5 --label_smoothing 0.1

Available datasets are:

FB15k-237
WN18RR
FB15k
WN18

To reproduce the results from the paper, use the following combinations of hyperparameters with batch_size=128:

dataset    lr      dr     edim  rdim  input_dropout  hidden_dropout1  hidden_dropout2  label_smoothing
FB15k      0.003   0.99   200   200   0.2            0.2              0.3              0.0
WN18       0.005   0.995  200   30    0.2            0.1              0.2              0.1
FB15k-237  0.0005  1.0    200   200   0.3            0.4              0.5              0.1
WN18RR     0.003   1.0    200   30    0.2            0.2              0.3              0.1
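
For example, to reproduce the WN18RR result with the values above:

 CUDA_VISIBLE_DEVICES=0 python main.py --dataset WN18RR --num_iterations 500 --batch_size 128 \
                                       --lr 0.003 --dr 1.0 --edim 200 --rdim 30 --input_dropout 0.2 \
                                       --hidden_dropout1 0.2 --hidden_dropout2 0.3 --label_smoothing 0.1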

Requirements

The codebase is implemented in Python 3.6.6. Required packages are:

numpy      1.15.1
pytorch    1.0.1

Citation

If you found this codebase useful, please cite:

@inproceedings{balazevic2019tucker,
  title={TuckER: Tensor Factorization for Knowledge Graph Completion},
  author={Bala\v{z}evi\'c, Ivana and Allen, Carl and Hospedales, Timothy M},
  booktitle={Empirical Methods in Natural Language Processing},
  year={2019}
}
Comments
  • Unable to reproduce results on WN18RR

    Hi Ivana

    I am trying to get entity embeddings for a downstream application. For the WN18RR dataset, I was unable to reproduce the reported results of TuckER. I used the hyperparameters given in the README of this repo. The command I used was:

     CUDA_VISIBLE_DEVICES=3 python main.py --dataset WN18RR --num_iterations 500 --batch_size 128 \
                                           --lr 0.01 --dr 1.0 --edim 200 --rdim 30 --input_dropout 0.2 \
                                           --hidden_dropout1 0.2 --hidden_dropout2 0.3 --label_smoothing 0.1
    
    

    And the results are:

    495
    12.792492151260376
    0.00035594542557143403
    Validation:
    Number of data points: 6068
    Hits @10: 0.5121951219512195
    Hits @3: 0.4728081740276862
    Hits @1: 0.43638760711931446
    Mean rank: 6254.662491760053
    Mean reciprocal rank: 0.4624483298017613
    Test:
    Number of data points: 6268
    Hits @10: 0.5140395660497766
    Hits @3: 0.4738353541799617
    Hits @1: 0.43123803446075304
    Mean rank: 6595.924856413529
    Mean reciprocal rank: 0.45961590280892123
    5.328977823257446
    

    Should I increase the number of epochs or am I missing something?

    Thanks

    opened by apoorvumang 9
  • Reopening evaluation issue

    Hi,

    I was just going through your code and found that the training data has been augmented by adding new relations for reversed triples from the training set (correct me if I am wrong). I am not sure whether this is harmless, as it might have a regularizing effect on the weights the model learns.

    Instead of adding new relations for reversing the triples, could you try the following and check whether this gives the same result?

    1. Create d.train_data_reversed, where for each triple from d.train_data you only switch e_s and e_o and keep the relation. (So you don't create any new relations in this dataset.)
    2. Add to class TuckER a method forward_reversed that is exactly the same as forward, but transposes the tensor W, so that the axes for e_s and e_o are switched.
    3. When training, use forward for d.train_data and use forward_reversed for d.train_data_reversed

    I think this way one can guarantee that the evaluation is fair. It would also be interesting to know how you evaluate the other models you compare with, for example, whether you use the BCE loss and augment the training data for them as well. This would make sure that it is not the BCE loss or the data augmentation that helps TuckER perform well.
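
    A minimal sketch of the forward_reversed suggested in step 2, which reuses the same core tensor with its subject/object modes transposed (hypothetical code, not part of the repo):

        import torch

        def tucker_score(W, e_s, w_r, E_all):
            W_r = torch.einsum('br,erf->bef', w_r, W)  # contract relation into core
            x = torch.einsum('be,bef->bf', e_s, W_r)   # contract subject
            return x @ E_all.t()                       # score all candidate objects

        def tucker_score_reversed(W, e_o, w_r, E_all):
            # Same computation with the e_s and e_o axes of W swapped,
            # so no new "_reverse" relations need to be created.
            return tucker_score(W.transpose(0, 2), e_o, w_r, E_all)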

    opened by dschaehi 8
  • Parameters for reproducing results from paper

    Can you provide the parameters for reproducing the results from the paper on FB15k and FB15k-237? I ran the command from the README:

     CUDA_VISIBLE_DEVICES=0 python main.py --dataset FB15k-237 --num_iterations 500 --batch_size 128 \
                                           --lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.3 \
                                           --hidden_dropout1 0.4 --hidden_dropout2 0.5 --label_smoothing 0.1
    

    which gave a final performance of:

    Validation:
    Number of data points: 35070
    Hits @10: 0.4009124607927003
    Hits @3: 0.2555460507556316
    Hits @1: 0.1760193897918449
    Mean rank: 291.46401482748786
    Mean reciprocal rank: 0.24741750020439274
    Test:
    Number of data points: 40932
    Hits @10: 0.3974396560148539
    Hits @3: 0.2546662757744552
    Hits @1: 0.17094205022964917
    Mean rank: 304.61949086289457
    Mean reciprocal rank: 0.24344486414937788
    

    Any ideas?

    UPDATE: I noticed in the paper that you mention the best learning rate for FB15k-237 is 0.005 rather than 0.0005, and the best learning rate decay is 0.995 rather than 1.0 -- might that be the issue?

    opened by bkj 7
  • why do you have reverse triples in evaluation?

    In the code

    self.valid_data = self.load_data(data_dir, "valid", reverse=reverse)
    self.test_data = self.load_data(data_dir, "test", reverse=reverse)
    

    it should be

    self.valid_data = self.load_data(data_dir, "valid", reverse=False)
    self.test_data = self.load_data(data_dir, "test", reverse=False)
    

    I tested with this change, and the results it shows are much better than those reported in the paper. Please let me know if I am wrong.

    opened by apoorvumang 3
  • Could the one-way evaluation be a problem?

    Hi,

    I have a question on the evaluation in the code.

    When the test rank is evaluated, the scores only seem to be calculated for each head against all tails. I didn't see scores being calculated for each tail against all heads in the code. Don't people usually calculate both and average them as the final scores? Could this one-way evaluation be a problem, e.g. introduce some bias?

    Thank you!

    opened by Xiaobeing 2
  • Why set "padding_idx=0" in nn.Embedding

    Hi~ I have found that the code sets "padding_idx=0" in nn.Embedding, like:

    self.E = torch.nn.Embedding(len(d.entities), d1, padding_idx=0)
    self.R = torch.nn.Embedding(len(d.relations), d2, padding_idx=0)

    However, this makes the gradient of the first entity and the first relation become zero. This is very interesting and I want to know the reason for it. Thank you!
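
    A minimal demonstration of the behaviour being asked about (standard PyTorch semantics: the embedding row at padding_idx never receives a gradient):

        import torch

        emb = torch.nn.Embedding(5, 3, padding_idx=0)
        out = emb(torch.tensor([0, 1]))
        out.sum().backward()
        print(emb.weight.grad[0])  # tensor([0., 0., 0.]) -- row 0 stays frozen
        print(emb.weight.grad[1])  # tensor([1., 1., 1.])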

    opened by THUCSTHanxu13 2
  • Hyperparameters for Yago3-10

    Hi, thanks for developing this amazing model. I'd like to try and train it on the Yago3-10 dataset (I think you have used it in your other work titled "Hypernetwork Knowledge Graph Embeddings").

    Have you ever tried to train TuckER on that dataset? Can you suggest any hyperparameter settings before I start running a long grid search? :)

    Thanks for your help!

    Andrea

    opened by AndRossi 2
  • question about evaluation

    In the paper, you say

    for a given triple, we generate 2*n_e test triples by

    1. keeping the subject entity e_s and relation r fixed and replacing the object entity e_o with all possible entities E and by
    2. keeping the object entity e_o and relation r fixed and replacing the subject entity e_s with all entities E.

    In the evaluate function, it looks like you score all possible e_o's given an (e_s, r) pair, then compute the rank of the true e_o. So I see how you're doing 1) above, but are you actually doing 2)?
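
    For what it's worth, the data loader augments every split with reciprocal triples, so ranking tails on a reversed triple is exactly head prediction on the original one; a minimal illustration (hypothetical helper, not the repo's exact API):

        def add_reverse(triples):
            # For each (s, r, o), also create (o, r_reverse, s).
            return triples + [(o, r + "_reverse", s) for (s, r, o) in triples]

        test = [("tokyo", "capital_of", "japan")]
        print(add_reverse(test))
        # [('tokyo', 'capital_of', 'japan'), ('japan', 'capital_of_reverse', 'tokyo')]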

    Thanks! ~ Ben

    opened by bkj 2
  • Reverse flag implementation

    Hey, if I just change line 194 (d = Data(data_dir=data_dir, reverse=True)) in main.py to use reverse=False and run the code for FB15k-237 with the recommended settings, the MRR shoots up to 0.4067. Is this expected behaviour?

    To replicate:

    1. Just change reverse=False in main.py
    2. CUDA_VISIBLE_DEVICES=0 python main.py --dataset FB15k-237 --num_iterations 500 --batch_size 128 --lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.3 --hidden_dropout1 0.4 --hidden_dropout2 0.5 --label_smoothing 0.1

    MRR keeps increasing.

    Log for iteration 145:

    145
    21.162700176239014
    0.001321860825107114
    Test:
    Number of data points: 20466
    Hits @10: 0.6135053259063813
    Hits @3: 0.4763998827323366
    Hits @1: 0.3409557314570507
    Mean rank: 147.47825662073683
    Mean reciprocal rank: 0.4335430118109515

    opened by luffycodes 1
  • Data processing

    def load_data(self, data_dir, data_type="train", reverse=False):
        with open("%s%s.txt" % (data_dir, data_type), "r") as f:
            data = f.read().strip().split("\n")
            data = [i.split() for i in data]
            if reverse:
                data += [[i[2], i[1]+"_reverse", i[0]] for i in data]
        return data

    Are you sure the data should be data = [i.split() for i in data] and not data = [i.split("\t") for i in data]? Your data is separated by "\t", but you split on whitespace. If I use data = [i.split("\t") for i in data], I cannot get the FB15k-237 result you report in your paper. Can you explain this?
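
    For lines whose fields contain no internal spaces (as in FB15k-style entity and relation IDs), whitespace splitting and tab splitting agree; a minimal check:

        line = "/m/027rn\t/location/country/form_of_government\t/m/06cx9"
        print(line.split() == line.split("\t"))  # True: fields contain no spaces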

    opened by ToneLi 1
  • Do you only test tails in the evaluation?

    I am a bit confused about the evaluation protocol. In the evaluation, you only feed (head, rel) to the model and get predictions with n elements representing the scores of (head, rel, t_1) ... (head, rel, t_n). Why don't you repeat this process for the tail? Could you explain the reason? I think it should be done, right? Previous works all conduct the evaluation this way. Maybe I misunderstand your code. Looking forward to your reply.

    opened by liu-jc 1
  • Why not use a 1-x score function?

    Hi, thanks for your elegant code and work!
    I am wondering how to implement a 1-x score function (the x in 1-x is the number of entities used to form the loss; 1-N uses all entities). In my opinion, 1-x should perform much better than 1-N, because 1-N is hard to train in high dimensions. So why not use a 1-x score function?
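
    For context, a minimal sketch of the 1-N training setup in question, where each (e_s, r) pair is scored against every entity at once and trained with binary cross-entropy (illustrative shapes, not the repo's exact code):

        import torch

        batch, n_entities = 128, 14541        # FB15k-237-sized entity set
        scores = torch.randn(batch, n_entities, requires_grad=True)  # model logits
        targets = torch.zeros(batch, n_entities)                     # multi-label 0/1
        targets[torch.arange(batch), torch.randint(n_entities, (batch,))] = 1.0
        loss = torch.nn.functional.binary_cross_entropy_with_logits(scores, targets)
        loss.backward()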

    opened by quqxui 0
  • Unable to reproduce results on FB15k

    Hey, I ran the code with the suggested parameters; however, I was not able to reproduce the results on FB15k.

    On FB15k, I got the following MRR (the best MRR is 0.789 in 500 epochs):

    500
    30.736143827438354
    0.00022165691843464258
    Validation:
    Number of data points: 100000
    Hits @10: 0.88763
    Hits @3: 0.8288
    Hits @1: 0.73175
    Mean rank: 39.8066
    Mean reciprocal rank: 0.789614621151087
    Test:
    Number of data points: 118142
    Hits @10: 0.8898613532867228
    Hits @3: 0.8294086776929458
    Hits @1: 0.7293595842291479
    Mean rank: 38.221682382218006
    Mean reciprocal rank: 0.7889229105464421

    opened by luffycodes 0
Owner
Ivana Balazevic
PhD candidate in Machine Learning @ University of Edinburgh. Ex Research Scientist Intern @ Facebook AI Research (FAIR) and @ Samsung AI.