DeepXML

Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents

Architectures and algorithms

DeepXML supports multiple feature architectures such as Bag-of-embedding/Astec, RNN, and CNN. The code uses a JSON file to construct the feature architecture. Features can be computed using the following encoders (a minimal sketch of the bag-of-embeddings variant follows the list):

  • Bag-of-embedding/Astec: As used in the DeepXML paper [1].
  • RNN: RNN-based sequential models, with support for vanilla RNN, GRU, and LSTM cells.
  • XML-CNN: CNN architecture as proposed in the XML-CNN paper [4].
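
For intuition, here is a minimal sketch of a bag-of-embeddings encoder in PyTorch. The module and the residual transform are illustrative assumptions, not the exact Astec architecture; refer to the paper [1] and the provided JSON configs for the real thing.

import torch
import torch.nn as nn

class BagOfEmbeddings(nn.Module):
    # Illustrative sketch: a document is the mean of its token embeddings,
    # refined by a small residual non-linearity. Not the exact Astec module.
    def __init__(self, vocab_size, embedding_dim=300):
        super().__init__()
        # mode='mean' averages the embeddings of all tokens in a document
        self.embed = nn.EmbeddingBag(vocab_size, embedding_dim, mode='mean')
        self.transform = nn.Sequential(
            nn.Linear(embedding_dim, embedding_dim),
            nn.ReLU(),
        )

    def forward(self, tokens, offsets):
        doc = self.embed(tokens, offsets)   # (batch, embedding_dim)
        return doc + self.transform(doc)    # residual connection

# Two documents packed into one flat tensor: tokens [0, 4, 7] and [2, 9]
enc = BagOfEmbeddings(vocab_size=10)
out = enc(torch.tensor([0, 4, 7, 2, 9]), torch.tensor([0, 3]))
print(out.shape)  # torch.Size([2, 300])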

Best practices for feature creation


  • Adding sub-words on top of unigrams to the vocabulary can help in training more accurate embeddings and classifiers (see the sketch below).
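
As a concrete illustration, the sketch below extends a unigram vocabulary with character n-gram sub-words in the style of fastText [6]; the helper names and the 3-5 n-gram range are hypothetical choices, not the exact pipeline behind the published results.

def char_ngrams(word, n_min=3, n_max=5):
    # Character n-grams with boundary markers, as in fastText [6]
    marked = '<' + word + '>'
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            yield marked[i:i + n]

def build_vocab(unigrams):
    # Unigrams keep their original ids; sub-words are appended after them
    vocab = {w: i for i, w in enumerate(unigrams)}
    for word in unigrams:
        for sub in char_ngrams(word):
            if sub not in vocab:
                vocab[sub] = len(vocab)
    return vocab

# A token now activates its unigram id plus its sub-word ids, so rare
# words share statistical strength with morphologically similar ones.
vocab = build_vocab(['label', 'labels'])
print(len(vocab), vocab['label'], vocab['<lab'])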

Setting up


Expected directory structure

+-- <work_dir>
|  +-- programs
|  |  +-- deepxml
|  |    +-- deepxml
|  +-- data
|    +-- <dataset>
|  +-- models
|  +-- results

Download data for Astec

* Download the (zipped) BoW features from the XML repository.
* Extract the zipped file into the data directory.
* The following files should be available in <work_dir>/data/<dataset> for new datasets (ignore the next step):
    - trn_X_Xf.txt
    - trn_X_Y.txt
    - tst_X_Xf.txt
    - tst_X_Y.txt
    - fasttextB_embeddings_300d.npy or fasttextB_embeddings_512d.npy
* The following files should be available in <work_dir>/data/<dataset> if the dataset is in the old format (please refer to the next step to convert the data to the new format):
    - train.txt
    - test.txt
    - fasttextB_embeddings_300d.npy or fasttextB_embeddings_512d.npy

Convert to new data format

# A perl script is provided (in deepxml/tools) to convert the data into the new format expected by Astec.
# Either set the $data_dir variable to the data directory of a particular dataset or replace it with the path.
perl convert_format.pl $data_dir/train.txt $data_dir/trn_X_Xf.txt $data_dir/trn_X_Y.txt
perl convert_format.pl $data_dir/test.txt $data_dir/tst_X_Xf.txt $data_dir/tst_X_Y.txt
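
A roughly equivalent conversion can also be sketched in Python with pyxclib [2]; this assumes the old single-file format that data_utils.read_data parses and is an illustration, not a drop-in replacement for the perl script.

# Sketch: convert the old-format train.txt into trn_X_Xf.txt / trn_X_Y.txt
# using pyxclib [2] (installable from its GitHub repository).
from xclib.data import data_utils

# read_data returns sparse features, sparse labels, and the dataset sizes
features, labels, num_samples, num_features, num_labels = data_utils.read_data(
    'train.txt')

# Write the two matrices in the sparse text format expected by Astec
data_utils.write_sparse_file(features, 'trn_X_Xf.txt')
data_utils.write_sparse_file(labels, 'trn_X_Y.txt')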

Example use cases


A single learner with DeepXML framework

The DeepXML framework can be used as follows. A JSON file is used to specify the architecture and other arguments; please refer to the full documentation below for more details.

./run_main.sh 0 DeepXML EURLex-4K 0 108

An ensemble of multiple learners with DeepXML framework

An ensemble can be trained as follows; again, a JSON file is used to specify the architecture and other arguments.

./run_main.sh 0 DeepXML EURLex-4K 0 108,666,786

Full Documentation

./run_main.sh <gpu_id> <framework> <dataset> <version> <seed>

* gpu_id: Run the program on this GPU.

* framework
  - DeepXML: Divides the XML problem into 4 modules as proposed in the paper.
  - DeepXML-OVA: Trains the architecture in a 1-vs-all fashion [4][5], i.e., the loss is computed for each label in each iteration.
  - DeepXML-ANNS: Trains the architecture using a label shortlist. Support is available for a fixed graph or periodic re-training of the ANNS graph.

* dataset
  - Name of the dataset.
  - Astec expects the following files in <work_dir>/data/<dataset>:
    - trn_X_Xf.txt
    - trn_X_Y.txt
    - tst_X_Xf.txt
    - tst_X_Y.txt
    - fasttextB_embeddings_300d.npy or fasttextB_embeddings_512d.npy
  - You can set 'embedding_dims' in the config file to switch between 300d and 512d embeddings.

* version
  - Different runs can be managed via version and seed.
  - Models and results are stored under this argument.

* seed
  - Seed value, as used by numpy and PyTorch (see the sketch below).
  - An ensemble is learned if multiple comma-separated values are passed.
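Since the seed is fed to both numpy and PyTorch, an ensemble over comma-separated seeds amounts to the following minimal sketch (the wrapper function is hypothetical; the actual wiring lives in the runner scripts):

import numpy as np
import torch

def train_one_learner(seed):
    # Hypothetical wrapper: each ensemble member is the same architecture
    # trained under a different random seed.
    np.random.seed(seed)     # e.g. shuffling, negative sampling
    torch.manual_seed(seed)  # e.g. parameter initialization, dropout
    # ... build and train the model here ...

# ./run_main.sh 0 DeepXML EURLex-4K 0 108,666,786 trains one learner per seed
for seed in (108, 666, 786):
    train_one_learner(seed)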

Notes

* Other file formats such as npy, npz, and pickle are also supported.
* Initializing with token embeddings (computed from FastText) leads to a noticeable accuracy gain with Astec. Please ensure that the token embedding file is available in the data directory when 'init=token_embeddings'; otherwise an error is raised (see the sketch below for one way to build this file).
* Config files are made available in deepxml/configs/<framework>/<dataset>.json for datasets in the XC repository. You can use them as a starting point when trying out Astec/DeepXML on new datasets.
* We conducted our experiments on a 24-core Intel Xeon 2.6 GHz machine with 440GB RAM and a single Nvidia P40 GPU. 128GB of memory should suffice for most datasets.
* Astec makes use of the CPU (mainly for nmslib) as well as the GPU.
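
For instance, a token-embedding file can be assembled from pretrained fastText vectors roughly as follows. The 'vocabulary.txt' file and the use of the official fasttext package are assumptions; any source of per-token vectors aligned with your feature indices would do.

# Sketch: build fasttextB_embeddings_300d.npy for a custom vocabulary.
# Assumes `pip install fasttext` and a downloaded cc.en.300.bin model;
# 'vocabulary.txt' (one token per line, in feature-index order) is hypothetical.
import fasttext
import numpy as np

model = fasttext.load_model('cc.en.300.bin')
with open('vocabulary.txt') as f:
    tokens = [line.strip() for line in f]

# One row per token, in the same order as the feature indices
embeddings = np.stack([model.get_word_vector(t) for t in tokens])
np.save('fasttextB_embeddings_300d.npy', embeddings.astype(np.float32))
print(embeddings.shape)  # (vocab_size, 300)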

    
   

Cite as

@InProceedings{Dahiya21,
    author = "Dahiya, K. and Saini, D. and Mittal, A. and Shaw, A. and Dave, K. and Soni, A. and Jain, H. and Agarwal, S. and Varma, M.",
    title = "DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents",
    booktitle = "Proceedings of the ACM International Conference on Web Search and Data Mining",
    month = "March",
    year = "2021"
}

References


[1] K. Dahiya, D. Saini, A. Mittal, A. Shaw, K. Dave, A. Soni, H. Jain, S. Agarwal, and M. Varma. DeepXML: A deep extreme multi-label learning framework applied to short text documents. In WSDM, 2021.

[2] pyxclib: https://github.com/kunaldahiya/pyxclib

[3] H. Jain, V. Balasubramanian, B. Chunduri, and M. Varma. Slice: Scalable linear extreme classifiers trained on 100 million labels for related searches. In WSDM, 2019.

[4] J. Liu, W.-C. Chang, Y. Wu, and Y. Yang. Deep learning for extreme multi-label text classification. In SIGIR, 2017.

[5] R. Babbar and B. Schölkopf. DiSMEC: Distributed sparse machines for extreme multi-label classification. In WSDM, 2017.

[6] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov. Enriching word vectors with subword information. In TACL, 2017.
