ECLARE: Extreme Classification with Label Graph Correlations

Overview


@InProceedings{Mittal21b,
    author    = "Mittal, A. and Sachdeva, N. and Agrawal, S. and Agarwal, S. and Kar, P. and Varma, M.",
    title     = "ECLARE: Extreme classification with label graph correlations",
    booktitle = "Proceedings of the ACM International World Wide Web Conference",
    month     = "April",
    year      = "2021",
}

SETUP WORKSPACE

mkdir -p ${HOME}/scratch/XC/data 
mkdir -p ${HOME}/scratch/XC/programs

SETUP ECLARE

cd ${HOME}/scratch/XC/programs
git clone https://github.com/Extreme-classification/ECLARE.git
conda env create -f ECLARE/eclare_env.yml
conda activate eclare
git clone https://github.com/kunaldahiya/pyxclib.git
cd pyxclib
python setup.py install
cd ../ECLARE

DOWNLOAD DATASET

cd ${HOME}/scratch/XC/data
gdown --id <dataset id>
unzip *.zip
dataset                     dataset id
LF-AmazonTitles-131K        1VlfcdJKJA99223fLEawRmrXhXpwjwJKn
LF-WikiSeeAlsoTitles-131K   1edWtizAFBbUzxo9Z2wipGSEA9bfy5mdX
LF-AmazonTitles-1.3M        1Davc6BIfoTIAS3mP1mUY5EGcGr2zN2pO
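
The table above maps each dataset to its Google Drive file ID. As a small convenience, the gdown command for a given dataset can be assembled programmatically; a sketch in Python (the IDs are copied from the table, the helper name is ours):

```python
# Dataset names -> Google Drive file IDs, copied from the table above.
DATASETS = {
    "LF-AmazonTitles-131K": "1VlfcdJKJA99223fLEawRmrXhXpwjwJKn",
    "LF-WikiSeeAlsoTitles-131K": "1edWtizAFBbUzxo9Z2wipGSEA9bfy5mdX",
    "LF-AmazonTitles-1.3M": "1Davc6BIfoTIAS3mP1mUY5EGcGr2zN2pO",
}

def gdown_command(dataset: str) -> str:
    """Build the gdown command line that downloads the given dataset."""
    return f"gdown --id {DATASETS[dataset]}"

print(gdown_command("LF-AmazonTitles-131K"))
```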

RUNNING ECLARE

cd ${HOME}/scratch/XC/programs/ECLARE
chmod +x run_ECLARE.sh
./run_ECLARE.sh <gpu_id> <ECLARE TYPE> <dataset> <folder name>
e.g.
./run_ECLARE.sh 0 ECLARE LF-AmazonTitles-131K ECLARE_RUN

Comments
  • How to obtain trn_X_Xf.txt, trn_X_Y.txt, tst_X_Xf.txt, tst_X_Y.txt

    Hi,

    To run run_ECLARE.sh, it seems four additional data files (trn_X_Xf.txt, trn_X_Y.txt, tst_X_Xf.txt, tst_X_Y.txt) are needed. However, I cannot find them in the provided download link for any of the three datasets (nor on the dataset webpage http://manikvarma.org/downloads/XC/XMLRepository.html). Could you please point me to where these data files are, or explain how I can generate the input data?

    Also, line 32 of run_ECLARE.sh calls json.load(open('${model_type}/$dataset.json')). May I ask where I can find this JSON file?

    Thanks a lot!

    opened by xieyujia000 4
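
For context on the question above: files such as trn_X_Y.txt and trn_X_Xf.txt typically follow the sparse text format used across the Extreme Classification Repository: a header line `num_rows num_cols`, then one line per row of space-separated `col:val` pairs. A minimal stdlib-only parser, as a sketch (pyxclib ships readers for this format, and the exact layout of a given file may differ):

```python
def read_sparse_file(path_or_lines):
    """Parse the XC sparse format: a 'num_rows num_cols' header line,
    then one line per row of space-separated 'col:val' entries.
    Accepts a file path or a list of lines (for testing)."""
    if isinstance(path_or_lines, list):
        lines = path_or_lines
    else:
        with open(path_or_lines) as f:
            lines = f.read().splitlines()
    n_rows, n_cols = map(int, lines[0].split())
    rows = []
    for line in lines[1:1 + n_rows]:
        row = {}
        for pair in line.split():       # empty line -> row with no entries
            col, val = pair.split(":")
            row[int(col)] = float(val)
        rows.append(row)
    return n_rows, n_cols, rows

# Tiny example: a label matrix for 2 documents over 5 labels.
n, m, rows = read_sparse_file(["2 5", "0:1 3:1", "2:1"])
```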
  • about the 1-vs-all label classifier Wl

    Hi! ECLARE is absolutely amazing work! After reading the paper, I was a little confused about the one-vs-all label classifier Wl. As you mention in the paper, Wl is a vector generated from z1, z2, and the refinement vector z3. So I was wondering how it can accomplish the classification task (i.e., take a sentence as input and judge whether it belongs to a specific label) when it is just a label's feature vector? Thank you so much! :D

    opened by wangchichi1999 2
  • Application to long document XC

    Thanks for this amazing work.

    I am curious to hear your thoughts on whether and how this method can be applied to long-document classification (>10K tokens, e.g. patent applications) instead of short text. At first sight, it seems the simple composition of token embeddings in the document embedding module may be problematic and may require at least something like a convolutional layer. Are there other components in the model architecture that you suspect would be a problem? Have you tried the method on documents longer than product titles? I would be grateful for any insights you can share.

    Cheers!

    opened by trpstra 2
  • Input data format

    Hi,

    I am trying to apply your method to my dataset, and attempting to figure out the format of the input data from reading the code and looking at the example datasets (LF-AmazonTitles-131K). Some documentation for this would be greatly appreciated, but I can mostly figure it out. Only the filter_labels_train.txt and filter_labels_test.txt matrices are not clear to me. If you could briefly explain what these matrices are and why they are needed, that would be great.

    Thanks in advance!

    opened by trpstra 1
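
As a pointer on the filter files asked about above: in the LF-* benchmarks, filter_labels_train.txt and filter_labels_test.txt are typically lists of (row, column) index pairs marking trivial document–label matches (e.g., a page and its own title appearing as a label) that should be excluded when computing metrics. A hedged sketch of how such a filter might be applied to a score matrix before ranking (the exact file layout in a given release may differ):

```python
def apply_filter(scores, filter_pairs, neg=-1e9):
    """Mask out filtered document-label pairs so they never rank in top-k.
    `scores` is a list of per-document score lists; `filter_pairs` is a
    list of (row, col) index pairs read from a filter_labels_*.txt file."""
    for r, c in filter_pairs:
        scores[r][c] = neg
    return scores

scores = [[0.9, 0.1, 0.5],
          [0.2, 0.8, 0.3]]
# Suppose the filter file contained the pairs (0, 0) and (1, 1).
filtered = apply_filter(scores, [(0, 0), (1, 1)])
```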
  • What if #labels in training set is different than #labels in test set?

    Hi,

    I noticed that the datasets used for XMC models have the same number of labels in the training and test sets, and I'm wondering what happens when they differ. I tried to run ECLARE on a custom dataset where #labels_train != #labels_test, and a dimension-mismatch issue crops up.

    Why must #labels_train and #labels_test be equal? Do I need to tweak the code to resolve this? Many thanks,

    opened by hdm30 1
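
On the question above: XC pipelines generally assume one fixed label vocabulary shared by both splits, since the label-matrix columns and the classifier's output dimension both index that vocabulary. For a custom dataset, one workaround is to build a single label map over the union of train and test labels and re-index both splits against it; a stdlib-only sketch (the label names and helper are hypothetical, not ECLARE code):

```python
def unify_label_space(train_rows, test_rows):
    """Map both splits onto one shared label vocabulary.
    Each row is a list of label names; returns both splits re-indexed
    as integer IDs, plus the name -> ID map (its size is #labels)."""
    label_map = {}
    for row in list(train_rows) + list(test_rows):
        for lbl in row:
            label_map.setdefault(lbl, len(label_map))
    reindex = lambda rows: [[label_map[l] for l in row] for row in rows]
    return reindex(train_rows), reindex(test_rows), label_map

# Train mentions {cat, dog}; test additionally mentions {bird}.
trn, tst, lmap = unify_label_space([["cat"], ["dog", "cat"]], [["bird"]])
```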
  • Error Dimensions

    Hello,

    I tried ECLARE model on custom dataset for multilabel classification task. I encountered the following error after running the code:

    Traceback (most recent call last):
      File "/root/scratch/XC/programs/ECLARE/ECLARE/main.py", line 213, in <module>
        main(args.params)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/main.py", line 194, in main
        train(model, params)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/main.py", line 58, in train
        model.fit(
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 373, in fit
        self._fit(train_dataset, valid, model_dir, result_dir, validate_after)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 305, in _fit
        self._train_depth(train_ds, valid_ds, model_dir,
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 288, in _train_depth
        tr_avg_loss = self._step(train_dl)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 196, in _step
        loss = self._compute_loss(out_ans, batch_data)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 183, in _compute_loss
        return self.criterion(out_ans, _true).to(device)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/loss.py", line 713, in forward
        return F.binary_cross_entropy_with_logits(input, target,
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 3130, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))

    I have no idea why the target size doesn't match the input size. Many thanks in advance.

    opened by hdm30 6
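
The traceback above is easy to reproduce in isolation: binary_cross_entropy_with_logits requires logits and targets of identical shape, so if the output layer has #labels_train columns but the targets were built over a different label count (as in the previous comment), the loss raises exactly this ValueError. A dependency-free sketch that mirrors PyTorch's shape check (not ECLARE's code):

```python
import math

def bce_with_logits(logits, targets):
    """Mean BCE-with-logits over equal-shaped 2-D lists, with the same
    shape check that PyTorch performs before computing the loss."""
    in_shape = (len(logits), len(logits[0]))
    tgt_shape = (len(targets), len(targets[0]))
    if in_shape != tgt_shape:
        raise ValueError(
            f"Target size ({tgt_shape}) must be the same as input size ({in_shape})")
    total, count = 0.0, 0
    for lrow, trow in zip(logits, targets):
        for x, y in zip(lrow, trow):
            p = 1.0 / (1.0 + math.exp(-x))          # sigmoid of the logit
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count

# A model head with 4 output labels but targets built over 3 labels trips the check:
try:
    bce_with_logits([[0.0] * 4], [[1.0] * 3])
except ValueError as e:
    msg = str(e)
```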
Owner
Extreme Classification