Residual2Vec: Debiasing graph embedding using random graphs

Overview

This repository contains the code for

  • S. Kojaku, J. Yoon, I. Constantino, and Y.-Y. Ahn, Residual2Vec: Debiasing graph embedding using random graphs. NeurIPS (2021). [link will be added when available]

  • Preprint (arXiv): https://arxiv.org/abs/2110.07654

  • BibTex entry:

@inproceedings{kojaku2021neurips,
  title     = {Residual2Vec: Debiasing graph embedding using random graphs},
  author    = {Sadamori Kojaku and Jisung Yoon and Isabel Constantino and Yong-Yeol Ahn},
  booktitle = {Advances in Neural Information Processing Systems},
  editor    = {},
  pages     = {},
  publisher = {Curran Associates, Inc.},
  volume    = {},
  year      = {2021}
}

Installation and Usage of residual2vec package

pip install residual2vec

The code and instructions for the residual2vec package sit in libs/residual2vec.
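As a quick orientation before diving into the package, here is a toy, self-contained numpy sketch of the general idea suggested by the title: compare observed random-walk transitions with a degree-based (configuration-model) expectation and factorize the log-ratio "residual". This is an illustration only, not the package's actual algorithm or API.

```python
import numpy as np

# Toy sketch of the "residual" idea: observed transition probabilities
# vs. a degree-based (configuration-model) expectation. Illustrative only.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)                     # adjacency matrix of a small graph

deg = A.sum(axis=1)
P_obs = A / deg[:, None]            # observed one-step transition probabilities
P_null = deg / deg.sum()            # null expectation: visit nodes in proportion to degree
R = np.log(np.maximum(P_obs, 1e-9)) - np.log(P_null)  # log-ratio "residual"

U, S, _ = np.linalg.svd(R)          # low-rank factorization -> embedding
emb = U[:, :2] * np.sqrt(S[:2])
print(emb.shape)                    # (4, 2)
```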

Reproducing the results

We set up a Snakemake workflow to reproduce our results. To this end, install Snakemake and run

snakemake --cores <# of cores available> all

which will produce all figures for the link prediction and community detection benchmarks. The results for the case study are not generated due to our data-sharing agreements.

Comments
  • Adding single context window

    Update

    • Implemented a single-sided context window that extends leftward or rightward of a center node. This feature may be useful for embedding directed networks. It is available only in the PyTorch version of residual2vec.

    How to specify the context window types

    residual2vec_sgd now takes a new argument, context_window_type, which specifies the type of context window. The default is "double":

    context_window_type = 'left' # {'left', 'right', 'double'} 
    rv.residual2vec_sgd(context_window_type=context_window_type)
    

    context_window_type="double" specifies a context window that extends both left and right of a focal node, while context_window_type="left" or context_window_type="right" specifies a window that extends only left or only right, respectively.
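    A minimal sketch of how these three options could select context nodes from a random walk. The function name and walk representation are illustrative assumptions; the package's internals may differ.

```python
# Sketch of the three context_window_type options around a focal node
# in a random walk (illustrative; not the package's implementation).
def context_nodes(walk, center, window, context_window_type="double"):
    """Return the context nodes for the node at position `center`."""
    left = walk[max(0, center - window):center]    # nodes preceding the center
    right = walk[center + 1:center + 1 + window]   # nodes following the center
    if context_window_type == "left":
        return left
    if context_window_type == "right":
        return right
    return left + right                            # "double": both sides

walk = [0, 3, 5, 2, 7, 1]
print(context_nodes(walk, 3, 2, "left"))    # [3, 5]
print(context_nodes(walk, 3, 2, "right"))   # [7, 1]
print(context_nodes(walk, 3, 2, "double"))  # [3, 5, 7, 1]
```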

    opened by skojaku 0
  • Remove clamp to prevent the gradient from becoming NaN

    Update

    • Stopped using .clamp in the computation of the sigmoid function. clamp had been used for the numerical stability of the sigmoid, but I later found that with clamp the gradient sometimes becomes NaN, which breaks the embedding. The PyTorch community is aware of this problem and recommends using the logsigmoid function instead, so this new version uses logsigmoid.
    • Added an experimental sampler, ConditionalContextSampler. This sampler takes the group memberships of nodes and samples a random context from the group that a given context belongs to. It may be useful for controlling the effect of group structure.
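    As a small numeric illustration of the first point, the sketch below (plain numpy rather than PyTorch, and only my reading of the issue) shows why log(sigmoid(x)) with a clamp misbehaves for large negative x while a stable log-sigmoid does not: once sigmoid(x) hits the clamp floor, the function goes flat in x, so its gradient vanishes there.

```python
import numpy as np

def logsigmoid(x):
    # Stable formulation: log sigmoid(x) = min(x, 0) - log1p(exp(-|x|))
    return np.minimum(x, 0.0) - np.log1p(np.exp(-np.abs(x)))

x = -800.0
with np.errstate(over="ignore", divide="ignore"):
    naive = np.log(1.0 / (1.0 + np.exp(-x)))                     # log(0) = -inf
    clamped = np.log(np.clip(1.0 / (1.0 + np.exp(-x)), 1e-15, None))

print(naive)          # -inf: sigmoid underflowed to 0 before the log
print(clamped)        # log(1e-15): finite but wrong, and flat in x
print(logsigmoid(x))  # -800.0: the correct value
```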
    opened by skojaku 0
  • residual2vec with a stochastic gradient descent algorithm for embedding large networks

    Issue: The current implementation, based on matrix factorization, is memory-demanding, especially for large networks. The memory consumption is marginal up to 1M nodes but considerable for larger networks, which prevents me from using residual2vec for some of my projects. This update addresses the issue with a stochastic gradient descent algorithm, which updates the embedding incrementally using small chunks of data that fit into memory.
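    The incremental update can be sketched as a skip-gram-style SGD step with negative sampling. Everything below (array names, batch size, learning rate, the logistic objective) is an illustrative assumption rather than the package's exact objective; the point is that each step only materializes batch-sized arrays, never a dense n-by-n matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim, lr = 1000, 32, 0.025
emb_in = rng.normal(scale=0.1, size=(n, dim))   # center-node embeddings
emb_out = rng.normal(scale=0.1, size=(n, dim))  # context-node embeddings

def sgd_step(centers, contexts, label):
    """One vectorized SGD step on a batch of pairs; label=1 positive, 0 negative."""
    score = np.einsum("ij,ij->i", emb_in[centers], emb_out[contexts])
    g = 1.0 / (1.0 + np.exp(-score)) - label     # gradient of the logistic loss
    grad_in = g[:, None] * emb_out[contexts]
    grad_out = g[:, None] * emb_in[centers]
    np.add.at(emb_in, centers, -lr * grad_in)    # scatter-add handles repeated nodes
    np.add.at(emb_out, contexts, -lr * grad_out)

centers = rng.integers(0, n, size=256)           # a small chunk of observed pairs
contexts = rng.integers(0, n, size=256)
sgd_step(centers, contexts, 1.0)                 # positive (observed) pairs
sgd_step(centers, rng.integers(0, n, 256), 0.0)  # negatives from a noise sampler
print(emb_in.shape)                              # (1000, 32)
```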

    opened by skojaku 0
  • Fix bugs that appear when simplifying code

    This repo contains a simplified version of my code in yy/residual-node2vec. Since I had some time, I ran each line of both implementations and compared them. I found that a bug had entered the code during the simplification. This patch fixes the bug, and I checked that the results are qualitatively reproduced.

    opened by skojaku 0