Code for ICML 2019 Paper "Compositional Fairness Constraints for Graph Embeddings"

Overview

Paper Link: https://arxiv.org/abs/1905.10674

NOTE: This code has been updated. If you were using this repo earlier and experienced issues, that was due to an outdated codebase. Please try again, and if you're still stuck please send me an email: [email protected]

Dependencies

  1. Comet ML for logging. You will need an API key, username, and project name to do online logging.
  2. PyTorch 1.0
  3. scikit-learn
  4. tqdm for progress bars
  5. pickle
  6. json
  7. joblib
  8. networkx for creating the Reddit graph

To conduct experiments you will need to download the appropriate datasets and preprocess them with the given preprocessing scripts. This will involve changing the file paths from their defaults. For FB15k-237 there is the main dataset as well as the entity-types dataset (links are provided in the main paper). Further, note that Reddit uses two preprocessing steps: the first parses the JSON objects and the second creates the K-core graph.
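The second Reddit step can be illustrated with networkx directly; this is only a toy sketch with made-up user/subreddit node names, not the repo's actual preprocessing script:

```python
import networkx as nx

# Toy interaction graph: users u1..u4 connected to subreddits r1, r2.
G = nx.Graph()
G.add_edges_from([
    ("u1", "r1"), ("u1", "r2"),
    ("u2", "r1"), ("u2", "r2"),
    ("u3", "r1"), ("u3", "r2"),
    ("u4", "r1"),               # u4 has degree 1 and falls out of the 2-core
])

# K-core extraction: keep the maximal subgraph where every node has
# degree >= k inside the subgraph.
core = nx.k_core(G, k=2)
```

Here `u4` is pruned because its single edge cannot survive the degree-2 requirement, while every remaining node keeps at least two neighbors.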

Sample Commands

To reproduce the results we provide sample commands. Command-line arguments control which sensitive attributes are used and whether there is a compositional adversary or not.

  1. FB15k-237: ipython --pdb -- paper_trans_e.py --namestr='FB15k Comp Gamma=1000' --do_log --num_epochs=100 --embed_dim=20 --test_new_disc --sample_mask=True --use_attr=True --gamma=1000 --valid_freq=50

  2. MovieLens1M:

ipython --pdb -- main_movielens.py --namestr='100 GCMC Comp and Dummy' --use_cross_entropy --num_epochs=200 --test_new_disc --use_1M=True --show_tqdm=True --report_bias=True --valid_freq=5 --use_gcmc=True --num_classifier_epochs=200 --embed_dim=30 --sample_mask=True --use_attr=True --gamma=10 --do_log

  3. Reddit:

ipython --pdb -- main_reddit.py --namestr='Reddit Compositional No Held Out V2 Gamma=1' --valid_freq=5 --num_sensitive=10 --use_attr=True --use_cross_entropy --test_new_disc --num_epochs=50 --num_nce=1 --sample_mask=True --debug --gamma=1000
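The `--sample_mask=True` flag in the commands above enables the compositional step: a random subset of sensitive-attribute filters is sampled and composed before the adversary sees the embedding. A minimal sketch of that idea, assuming simple `Linear` filters and made-up dimensions (the repo's actual filter and adversary classes differ):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embed_dim, num_attrs, batch = 20, 3, 16
base = torch.randn(batch, embed_dim)  # stand-in node embeddings

# One filter per sensitive attribute (hypothetical Linear modules).
filters = nn.ModuleList(nn.Linear(embed_dim, embed_dim) for _ in range(num_attrs))

# Compositional step: sample a random mask over sensitive attributes and
# combine the matching filtered embeddings.
mask = torch.rand(num_attrs) < 0.5
if not mask.any():
    mask[0] = True  # ensure at least one filter is active
filtered = torch.stack(
    [f(base) for f, m in zip(filters, mask) if m]
).mean(dim=0)
```

The combined embedding keeps the original shape, so downstream scoring and the adversarial discriminators can consume it unchanged.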

If you use this codebase or ideas from the paper, please cite:

    @article{bose2019compositional,
      title={Compositional Fairness Constraints for Graph Embeddings},
      author={Bose, Avishek Joey and Hamilton, William},
      conference={Proceedings of the Thirty-sixth International Conference on Machine Learning, Long Beach CA},
      year={2019}
    }

Comments
  • problem when training on ml-1m

    We are trying to run an experiment with your code on the MovieLens-1M dataset. However, when training modelD and the filters (these two are trained together), we get a loss value below 0. Is that normal?
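For context, the non-cross-entropy discriminator loss in transD_movielens.py has the form `-1 * (1 - l_penalty_2)` (see the train_gcmc snippet quoted in a later comment), so a negative value is possible by construction whenever the accumulated penalty is below 1. A tiny numeric sketch with a hypothetical penalty value:

```python
# Hypothetical accumulated discriminator penalty (any value below 1
# makes the loss negative under this formulation).
l_penalty_2 = 0.3
fairD_loss = -1 * (1 - l_penalty_2)
print(fairD_loss < 0)  # the loss is negative by construction
```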

    opened by ShaoPengyang 5
  • RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [*size*,*size*] is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

    Hello,

    (Scenario: Trying to verify the MovieLens dataset results)

    There seem to be some gradient-computation issues with your repo. Specifically, the way the gradients in transD_movielens.py are created seems to be incorrect. See the relevant findings below.

    I find that the error seems to be inside the train_gcmc function when back-propagating for each discriminator model.

    PyTorch 1.4 does not show this error, PyTorch 1.5 and above does.
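This class of error can be reproduced outside the repo; a generic sketch (not the repo's exact failure) that trips the same autograd version-counter check, because `sigmoid` saves its output tensor for the backward pass and the in-place `add_` modifies it afterwards:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = torch.sigmoid(x)  # sigmoid saves its output tensor for backward
y.add_(1)             # in-place update bumps the saved tensor's version counter

try:
    y.sum().backward()
    failed = False
except RuntimeError:
    failed = True      # "...modified by an inplace operation..."
```

The usual fixes are to replace the in-place op with an out-of-place one (`y = y + 1`) or to `clone()` the tensor before mutating it.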

    opened by deancochran 2
  • Inquiry about the mean rank results reported in the paper

    Hi! I was just trying to reproduce the paper's result on FB15K-237 and on Comet.ml I observed that the training mean-rank keeps increasing and reached around 860. How shall I interpret the result with regards to what's reported in the paper?
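For reference, mean rank is the average position of the true entity among all scored candidates (rank 1 is best), so a rising value means link prediction is degrading; this is only the metric definition, not a diagnosis of the run above. A toy sketch of the computation:

```python
import torch

# Toy batch: 2 queries, 3 candidate entities each.
scores = torch.tensor([[0.9, 0.1, 0.3],
                       [0.2, 0.8, 0.5]])
true_idx = torch.tensor([2, 0])  # index of the true entity per query

# Rank = 1 + number of candidates scored strictly higher than the true one.
true_scores = scores.gather(1, true_idx.unsqueeze(1))
ranks = (scores > true_scores).sum(dim=1) + 1
mean_rank = ranks.float().mean()  # ranks are [2, 3] here, so 2.5
```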

    opened by liaopeiyuan 2
  • Where can I find FB15k_Entity_Type_%s.txt?

    I tried to reproduce your FB15k experiments but was never able to generate the data needed. It seems that FB15k_Entity_Type_%s.txt is some special text file not included in the official dataset. Is there a separate website where I can obtain the file, or is there another script I should download to generate this? Thanks!

    opened by liaopeiyuan 2
  • Out of Bounds Runtime Error: Reproducing MovieLens 1M Results

    Hello,

    I'm reproducing the MovieLens 1M results so that, once I can reproduce them, I can apply the same setup to the MovieLens 100k dataset.

    I have run the main_movielens.py file as instructed with the arguments in the README.md and have put in my paths for the correct data directory/comet ml experiment api credentials.

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    File ~/Dev/Flexible-Fairness-Constraints/main_movielens.py:383, in <module>
        380         experiment.end()
        382 if __name__ == '__main__':
    --> 383     main(parse_args())
    
    File ~/Dev/Flexible-Fairness-Constraints/main_movielens.py:274, in main(args)
        272 with torch.no_grad():
        273     if args.use_gcmc:
    --> 274         rmse,test_loss = test_gcmc(test_set,args,modelD,filter_set)
        275     else:
        276         # l_ranks,r_ranks,avg_mr,avg_mrr,avg_h10,avg_h5 = test(test_set, args, all_hash,\
        277                 # modelD,subsample=20)
        278         test_nce(test_set,args,modelD,epoch,experiment)
    
    File ~/Dev/Flexible-Fairness-Constraints/eval_movielens.py:537, in test_gcmc(dataset, args, modelD, filter_set)
        535 p_batch_var = Variable(p_batch).to(torch.device("cuda" if torch.cuda.is_available() else "cpu"))
        536 lhs, rel, rhs = p_batch_var[:,0],p_batch_var[:,1],p_batch_var[:,2]
    --> 537 test_loss,preds = modelD(p_batch_var,filters=filter_set)
        538 rel += 1
        539 preds_list.append(preds.squeeze())
    
    File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
       1126 # If we don't have any hooks, we want to skip the rest of the logic in
       1127 # this function, and just call forward.
       1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
       1129         or _global_forward_hooks or _global_forward_pre_hooks):
    -> 1130     return forward_call(*input, **kwargs)
       1131 # Do not call functions when jit is used
       1132 full_backward_hooks, non_full_backward_hooks = [], []
    
    File ~/Dev/Flexible-Fairness-Constraints/model.py:541, in SimpleGCMC.forward(self, pos_edges, weights, return_embeds, filters)
        539 pos_tail_embeds = self.encode(pos_edges[:,-1])
        540 rels = pos_edges[:,1]
    --> 541 loss, preds = self.decoder(pos_head_embeds, pos_tail_embeds, rels)
        542 if return_embeds:
        543     return loss, preds, pos_head_embeds, pos_tail_embeds
    
    File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
       1126 # If we don't have any hooks, we want to skip the rest of the logic in
       1127 # this function, and just call forward.
       1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
       1129         or _global_forward_hooks or _global_forward_pre_hooks):
    -> 1130     return forward_call(*input, **kwargs)
       1131 # Do not call functions when jit is used
       1132 full_backward_hooks, non_full_backward_hooks = [], []
    
    File ~/Dev/Flexible-Fairness-Constraints/model.py:488, in SharedBilinearDecoder.forward(self, embeds1, embeds2, rels)
        486 logit = torch.matmul(basis_outputs,self.weight_scalars)
        487 outputs = F.log_softmax(logit,dim=1)
    --> 488 log_probs = torch.gather(outputs,1,rels.unsqueeze(1))
        489 loss = self.nll(outputs,rels)
        490 preds = 0
    
    RuntimeError: index 6369 is out of bounds for dimension 1 with size 5
    > /home/charles.cochran/Dev/Flexible-Fairness-Constraints/model.py(488)forward()
        486         logit = torch.matmul(basis_outputs,self.weight_scalars)
        487         outputs = F.log_softmax(logit,dim=1)
    --> 488         log_probs = torch.gather(outputs,1,rels.unsqueeze(1))
        489         loss = self.nll(outputs,rels)
        490         preds = 0
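The failing `gather` at model.py:488 expects `rels` to hold rating-class indices smaller than the softmax width (5 here); the value 6369 looks more like an entity or edge id than a rating class, which suggests the wrong column of the batch reached the decoder. A minimal sketch of the bounds behavior with toy tensors (not the repo's actual shapes):

```python
import torch

torch.manual_seed(0)
outputs = torch.log_softmax(torch.randn(4, 5), dim=1)  # 5 rating classes
good = torch.tensor([[0], [2], [4], [3]])              # valid class indices (< 5)
log_probs = torch.gather(outputs, 1, good)             # works: shape (4, 1)

bad = torch.tensor([[0], [2], [6369], [3]])            # entity-id-like value
try:
    torch.gather(outputs, 1, bad)
    out_of_bounds = False
except (RuntimeError, IndexError):
    out_of_bounds = True  # index 6369 is out of bounds for dim 1 with size 5
```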
    
    opened by deancochran 1
  • Question about the backward propagation of train_gcmc function in transD_movielens.py

    Hello, I have a question about the loss in the function train_gcmc() in transD_movielens.py. Below is the code. Why are you accumulating the l_penalty_2 of all discriminators (the accumulation line in the snippet below)? In my opinion, each discriminator could be trained separately with its own loss; it has nothing to do with the other discriminators.

    for k in range(0, args.D_steps):
        l_penalty_2 = 0
        for fairD_disc, fair_optim in zip(masked_fairD_set, masked_optimizer_fairD_set):
            if fairD_disc is not None and fair_optim is not None:
                fair_optim.zero_grad()
                l_penalty_2 += fairD_disc(filter_l_emb.detach(), p_batch[:,0], True)
                if not args.use_cross_entropy:
                    fairD_loss = -1*(1 - l_penalty_2)
                else:
                    fairD_loss = l_penalty_2
                fairD_loss.backward(retain_graph=True)
                fair_optim.step()
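The variant the question proposes can be sketched with toy stand-ins (hypothetical `Linear` discriminators and random labels, not the repo's classes): each discriminator keeps its own optimizer and its own loss, with no accumulation across discriminators.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = torch.randn(32, 8)             # stand-in for the detached filtered embeddings
labels = torch.randint(0, 2, (32,))  # stand-in sensitive-attribute labels

# Independent discriminators, each with its own optimizer.
discs = [nn.Linear(8, 2) for _ in range(3)]
optims = [torch.optim.SGD(d.parameters(), lr=0.1) for d in discs]
criterion = nn.CrossEntropyLoss()

losses = []
for disc, opt in zip(discs, optims):
    opt.zero_grad()
    loss = criterion(disc(emb), labels)  # this discriminator's loss only
    loss.backward()                      # independent backward pass
    opt.step()
    losses.append(loss.item())
```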

    opened by 2016312357 0
Owner
Avishek (Joey) Bose
I’m a PhD student at McGill / MILA where I work on Generative Models and Graph Representation Learning. Previously at Uber AI, UofT and Borealis AI