Residual2Vec: Debiasing graph embedding using random graphs

Overview

This repository contains the code for

  • S. Kojaku, J. Yoon, I. Constantino, and Y.-Y. Ahn, Residual2Vec: Debiasing graph embedding using random graphs. NeurIPS (2021). [link will be added when available]

  • Preprint (arXiv): https://arxiv.org/abs/2110.07654

  • BibTex entry:

@inproceedings{kojaku2021neurips,
  title     = {Residual2Vec: Debiasing graph embedding using random graphs},
  author    = {Sadamori Kojaku and Jisung Yoon and Isabel Constantino and Yong-Yeol Ahn},
  booktitle = {Advances in Neural Information Processing Systems},
  editor    = {},
  pages     = {},
  publisher = {Curran Associates, Inc.},
  volume    = {},
  year      = {2021}
}

Installation and Usage of residual2vec package

pip install residual2vec

The code and instructions for the residual2vec package sit in libs/residual2vec.
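As a quick orientation before diving into the package, here is a toy, self-contained numpy sketch of the general idea suggested by the title: compare observed random-walk transitions with a degree-based (configuration-model) expectation and factorize the log-ratio "residual". This is an illustration only, not the package's actual algorithm or API.

```python
import numpy as np

# Toy sketch of the "residual" idea: observed transition probabilities
# vs. a degree-based (configuration-model) expectation. Illustrative only.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)                     # adjacency matrix of a small graph

deg = A.sum(axis=1)
P_obs = A / deg[:, None]            # observed one-step transition probabilities
P_null = deg / deg.sum()            # null expectation: visit nodes in proportion to degree
R = np.log(np.maximum(P_obs, 1e-9)) - np.log(P_null)  # log-ratio "residual"

U, S, _ = np.linalg.svd(R)          # low-rank factorization -> embedding
emb = U[:, :2] * np.sqrt(S[:2])
print(emb.shape)                    # (4, 2)
```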

Reproducing the results

We set up a Snakemake workflow to reproduce our results. To this end, install Snakemake and run

snakemake --cores <# of cores available> all

which will produce all figures for the link prediction and community detection benchmarks. The results for the case study are not generated due to our data-sharing agreements.

Comments
  • Adding single context window

    Update

    • Implemented a single-sided context window that extends leftward or rightward of a center node. This feature may be useful for embedding directed networks. It is available only in the PyTorch version of residual2vec.

    How to specify the context window types

    residual2vec_sgd now takes a new argument, context_window_type, which specifies the type of context window. The default is "double":

    context_window_type = 'left' # {'left', 'right', 'double'} 
    rv.residual2vec_sgd(context_window_type=context_window_type)
    

    context_window_type="double" specifies a context window that extends both left and right of a focal node, while context_window_type="left" or context_window_type="right" specifies a window that extends only left or only right, respectively.
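    A minimal sketch of how these three options could select context nodes from a random walk. The function name and walk representation are illustrative assumptions; the package's internals may differ.

```python
# Sketch of the three context_window_type options around a focal node
# in a random walk (illustrative; not the package's implementation).
def context_nodes(walk, center, window, context_window_type="double"):
    """Return the context nodes for the node at position `center`."""
    left = walk[max(0, center - window):center]    # nodes preceding the center
    right = walk[center + 1:center + 1 + window]   # nodes following the center
    if context_window_type == "left":
        return left
    if context_window_type == "right":
        return right
    return left + right                            # "double": both sides

walk = [0, 3, 5, 2, 7, 1]
print(context_nodes(walk, 3, 2, "left"))    # [3, 5]
print(context_nodes(walk, 3, 2, "right"))   # [7, 1]
print(context_nodes(walk, 3, 2, "double"))  # [3, 5, 7, 1]
```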

    opened by skojaku 0
  • Remove clamp to prevent the gradient from becoming NaN

    Update

    • Stopped using .clamp in the computation of the sigmoid function. clamp had been used for the numerical stability of the sigmoid, but I later found that with clamp the gradient sometimes becomes NaN, which breaks the embedding. The PyTorch community is aware of this problem and recommends using the logsigmoid function instead, so this new version uses logsigmoid.
    • Added an experimental sampler, ConditionalContextSampler. This sampler takes the group memberships of nodes and samples a random context from the group that a given context belongs to. It may be useful for controlling the effect of group structure.
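    As a small numeric illustration of the first point, the sketch below (plain numpy rather than PyTorch, and only my reading of the issue) shows why log(sigmoid(x)) with a clamp misbehaves for large negative x while a stable log-sigmoid does not: once sigmoid(x) hits the clamp floor, the function goes flat in x, so its gradient vanishes there.

```python
import numpy as np

def logsigmoid(x):
    # Stable formulation: log sigmoid(x) = min(x, 0) - log1p(exp(-|x|))
    return np.minimum(x, 0.0) - np.log1p(np.exp(-np.abs(x)))

x = -800.0
with np.errstate(over="ignore", divide="ignore"):
    naive = np.log(1.0 / (1.0 + np.exp(-x)))                     # log(0) = -inf
    clamped = np.log(np.clip(1.0 / (1.0 + np.exp(-x)), 1e-15, None))

print(naive)          # -inf: sigmoid underflowed to 0 before the log
print(clamped)        # log(1e-15): finite but wrong, and flat in x
print(logsigmoid(x))  # -800.0: the correct value
```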
    opened by skojaku 0
  • residual2vec with a stochastic gradient descent algorithm for embedding large networks

    Issue: The current implementation, based on matrix factorization, is memory-demanding, especially for large networks. The memory consumption is marginal up to 1M nodes but considerable for larger networks, which prevents me from using residual2vec for some of my projects. This update addresses the issue with a stochastic gradient descent algorithm, which updates the embedding incrementally using small chunks of data that fit into memory.
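    The incremental update can be sketched as a skip-gram-style SGD step with negative sampling. Everything below (array names, batch size, learning rate, the logistic objective) is an illustrative assumption rather than the package's exact objective; the point is that each step only materializes batch-sized arrays, never a dense n-by-n matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim, lr = 1000, 32, 0.025
emb_in = rng.normal(scale=0.1, size=(n, dim))   # center-node embeddings
emb_out = rng.normal(scale=0.1, size=(n, dim))  # context-node embeddings

def sgd_step(centers, contexts, label):
    """One vectorized SGD step on a batch of pairs; label=1 positive, 0 negative."""
    score = np.einsum("ij,ij->i", emb_in[centers], emb_out[contexts])
    g = 1.0 / (1.0 + np.exp(-score)) - label     # gradient of the logistic loss
    grad_in = g[:, None] * emb_out[contexts]
    grad_out = g[:, None] * emb_in[centers]
    np.add.at(emb_in, centers, -lr * grad_in)    # scatter-add handles repeated nodes
    np.add.at(emb_out, contexts, -lr * grad_out)

centers = rng.integers(0, n, size=256)           # a small chunk of observed pairs
contexts = rng.integers(0, n, size=256)
sgd_step(centers, contexts, 1.0)                 # positive (observed) pairs
sgd_step(centers, rng.integers(0, n, 256), 0.0)  # negatives from a noise sampler
print(emb_in.shape)                              # (1000, 32)
```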

    opened by skojaku 0
  • Fix bugs that appear when simplifying code

    This repo contains a simplified version of my code in yy/residual-node2vec. Since I had some time, I ran each line of both implementations and compared them. I found that a bug had entered the code during the simplification. This patch fixes the bug, and I checked that the results are qualitatively reproduced.

    opened by skojaku 0