ECLARE: Extreme Classification with Label Graph Correlations

Overview


@InProceedings{Mittal21b,
    author    = "Mittal, A. and Sachdeva, N. and Agrawal, S. and Agarwal, S. and Kar, P. and Varma, M.",
    title     = "ECLARE: Extreme classification with label graph correlations",
    booktitle = "Proceedings of the ACM International World Wide Web Conference",
    month     = "April",
    year      = "2021",
}

SETUP WORKSPACE

mkdir -p ${HOME}/scratch/XC/data 
mkdir -p ${HOME}/scratch/XC/programs

SETUP ECLARE

cd ${HOME}/scratch/XC/programs
git clone https://github.com/Extreme-classification/ECLARE.git
conda env create -f ECLARE/eclare_env.yml
conda activate eclare
git clone https://github.com/kunaldahiya/pyxclib.git
cd pyxclib
python setup.py install
cd ../ECLARE

DOWNLOAD DATASET

cd ${HOME}/scratch/XC/data
gdown --id <dataset id>
unzip *.zip
dataset                     dataset id
LF-AmazonTitles-131K        1VlfcdJKJA99223fLEawRmrXhXpwjwJKn
LF-WikiSeeAlsoTitles-131K   1edWtizAFBbUzxo9Z2wipGSEA9bfy5mdX
LF-AmazonTitles-1.3M        1Davc6BIfoTIAS3mP1mUY5EGcGr2zN2pO
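
The table above maps each dataset to its Google Drive file ID. As a small convenience, the gdown command for a given dataset can be assembled programmatically; a sketch in Python (the IDs are copied from the table, the helper name is ours):

```python
# Dataset names -> Google Drive file IDs, copied from the table above.
DATASETS = {
    "LF-AmazonTitles-131K": "1VlfcdJKJA99223fLEawRmrXhXpwjwJKn",
    "LF-WikiSeeAlsoTitles-131K": "1edWtizAFBbUzxo9Z2wipGSEA9bfy5mdX",
    "LF-AmazonTitles-1.3M": "1Davc6BIfoTIAS3mP1mUY5EGcGr2zN2pO",
}

def gdown_command(dataset: str) -> str:
    """Build the gdown command line that downloads the given dataset."""
    return f"gdown --id {DATASETS[dataset]}"

print(gdown_command("LF-AmazonTitles-131K"))
```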

RUNNING ECLARE

cd ${HOME}/scratch/XC/programs/ECLARE
chmod +x run_ECLARE.sh
./run_ECLARE.sh <gpu_id> <ECLARE TYPE> <dataset> <folder name>
e.g.
./run_ECLARE.sh 0 ECLARE LF-AmazonTitles-131K ECLARE_RUN

Comments
  • How to obtain trn_X_Xf.txt, trn_X_Y.txt, tst_X_Xf.txt, tst_X_Y.txt

    Hi,

    To run run_ECLARE.sh, it seems four additional data files (trn_X_Xf.txt, trn_X_Y.txt, tst_X_Xf.txt, tst_X_Y.txt) are needed. However, I cannot find them in the provided download link for any of the three datasets (nor on the dataset webpage http://manikvarma.org/downloads/XC/XMLRepository.html). Could you please point me to where these data files are, or explain how I can generate the input data?

    Also, line 32 of run_ECLARE.sh calls json.load(open('${model_type}/$dataset.json')). May I ask where I can find this JSON file?

    Thanks a lot!

    opened by xieyujia000 4
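
For context on the question above: files such as trn_X_Y.txt and trn_X_Xf.txt typically follow the sparse text format used across the Extreme Classification Repository: a header line `num_rows num_cols`, then one line per row of space-separated `col:val` pairs. A minimal stdlib-only parser, as a sketch (pyxclib ships readers for this format, and the exact layout of a given file may differ):

```python
def read_sparse_file(path_or_lines):
    """Parse the XC sparse format: a 'num_rows num_cols' header line,
    then one line per row of space-separated 'col:val' entries.
    Accepts a file path or a list of lines (for testing)."""
    if isinstance(path_or_lines, list):
        lines = path_or_lines
    else:
        with open(path_or_lines) as f:
            lines = f.read().splitlines()
    n_rows, n_cols = map(int, lines[0].split())
    rows = []
    for line in lines[1:1 + n_rows]:
        row = {}
        for pair in line.split():       # empty line -> row with no entries
            col, val = pair.split(":")
            row[int(col)] = float(val)
        rows.append(row)
    return n_rows, n_cols, rows

# Tiny example: a label matrix for 2 documents over 5 labels.
n, m, rows = read_sparse_file(["2 5", "0:1 3:1", "2:1"])
```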
  • about the 1-vs-all label classifier Wl

    Hi! ECLARE is absolutely amazing work! After reading the paper, I was a little confused about the one-vs-all label classifier Wl. As you mention in the paper, Wl is a vector generated from z1, z2, and the refinement vector z3. So I was wondering how it can accomplish the classification task (i.e., take a sentence as input and judge whether it belongs to a specific label) when it is just a label's feature vector? Thank you so much! :D

    opened by wangchichi1999 2
  • Application to long document XC

    Thanks for this amazing work.

    I am curious to hear your thoughts on whether and how this method can be applied to long-document classification (>10K tokens, e.g. patent applications) instead of short text. At first sight, it seems the simple composition of token embeddings in the document embedding module may be problematic and may require at least something like a convolutional layer. Are there other components in the model architecture that you suspect would be a problem? Have you tried the method on documents longer than product titles? I would be grateful for any insights you can share.

    Cheers!

    opened by trpstra 2
  • Input data format

    Hi,

    I am trying to apply your method to my dataset, and attempting to figure out the format of the input data from reading the code and looking at the example datasets (LF-AmazonTitles-131K). Some documentation for this would be greatly appreciated, but I can mostly figure it out. Only the filter_labels_train.txt and filter_labels_test.txt matrices are not clear to me. If you could briefly explain what these matrices are and why they are needed, that would be great.

    Thanks in advance!

    opened by trpstra 1
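
As a pointer on the filter files asked about above: in the LF-* benchmarks, filter_labels_train.txt and filter_labels_test.txt are typically lists of (row, column) index pairs marking trivial document–label matches (e.g., a page and its own title appearing as a label) that should be excluded when computing metrics. A hedged sketch of how such a filter might be applied to a score matrix before ranking (the exact file layout in a given release may differ):

```python
def apply_filter(scores, filter_pairs, neg=-1e9):
    """Mask out filtered document-label pairs so they never rank in top-k.
    `scores` is a list of per-document score lists; `filter_pairs` is a
    list of (row, col) index pairs read from a filter_labels_*.txt file."""
    for r, c in filter_pairs:
        scores[r][c] = neg
    return scores

scores = [[0.9, 0.1, 0.5],
          [0.2, 0.8, 0.3]]
# Suppose the filter file contained the pairs (0, 0) and (1, 1).
filtered = apply_filter(scores, [(0, 0), (1, 1)])
```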
  • What if #labels in training set is different than #labels in test set?

    Hi,

    I noticed that the datasets used for XMC models have the same number of labels in the training and test sets, and I'm wondering what happens when they differ. I tried to run ECLARE on a custom dataset where #labels_train != #labels_test, and a dimension-mismatch issue crops up.

    Why must #labels_train and #labels_test be equal? Do I need to tweak the code to resolve this? Many thanks,

    opened by hdm30 1
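
On the question above: XC pipelines generally assume one fixed label vocabulary shared by both splits, since the label-matrix columns and the classifier's output dimension both index that vocabulary. For a custom dataset, one workaround is to build a single label map over the union of train and test labels and re-index both splits against it; a stdlib-only sketch (the label names and helper are hypothetical, not ECLARE code):

```python
def unify_label_space(train_rows, test_rows):
    """Map both splits onto one shared label vocabulary.
    Each row is a list of label names; returns both splits re-indexed
    as integer IDs, plus the name -> ID map (its size is #labels)."""
    label_map = {}
    for row in list(train_rows) + list(test_rows):
        for lbl in row:
            label_map.setdefault(lbl, len(label_map))
    reindex = lambda rows: [[label_map[l] for l in row] for row in rows]
    return reindex(train_rows), reindex(test_rows), label_map

# Train mentions {cat, dog}; test additionally mentions {bird}.
trn, tst, lmap = unify_label_space([["cat"], ["dog", "cat"]], [["bird"]])
```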
  • Error Dimensions

    Hello,

    I tried ECLARE model on custom dataset for multilabel classification task. I encountered the following error after running the code:

    Traceback (most recent call last):
      File "/root/scratch/XC/programs/ECLARE/ECLARE/main.py", line 213, in <module>
        main(args.params)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/main.py", line 194, in main
        train(model, params)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/main.py", line 58, in train
        model.fit(
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 373, in fit
        self._fit(train_dataset, valid, model_dir, result_dir, validate_after)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 305, in _fit
        self._train_depth(train_ds, valid_ds, model_dir,
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 288, in _train_depth
        tr_avg_loss = self._step(train_dl)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 196, in _step
        loss = self._compute_loss(out_ans, batch_data)
      File "/root/scratch/XC/programs/ECLARE/ECLARE/libs/model_base.py", line 183, in _compute_loss
        return self.criterion(out_ans, _true).to(device)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/loss.py", line 713, in forward
        return F.binary_cross_entropy_with_logits(input, target,
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 3130, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))

    I have no idea why the target size doesn't match the input size. Many thanks in advance.

    opened by hdm30 6
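
The traceback above is easy to reproduce in isolation: binary_cross_entropy_with_logits requires logits and targets of identical shape, so if the output layer has #labels_train columns but the targets were built over a different label count (as in the previous comment), the loss raises exactly this ValueError. A dependency-free sketch that mirrors PyTorch's shape check (not ECLARE's code):

```python
import math

def bce_with_logits(logits, targets):
    """Mean BCE-with-logits over equal-shaped 2-D lists, with the same
    shape check that PyTorch performs before computing the loss."""
    in_shape = (len(logits), len(logits[0]))
    tgt_shape = (len(targets), len(targets[0]))
    if in_shape != tgt_shape:
        raise ValueError(
            f"Target size ({tgt_shape}) must be the same as input size ({in_shape})")
    total, count = 0.0, 0
    for lrow, trow in zip(logits, targets):
        for x, y in zip(lrow, trow):
            p = 1.0 / (1.0 + math.exp(-x))          # sigmoid of the logit
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count

# A model head with 4 output labels but targets built over 3 labels trips the check:
try:
    bce_with_logits([[0.0] * 4], [[1.0] * 3])
except ValueError as e:
    msg = str(e)
```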
Owner
Extreme Classification