Pattern-Exploiting Training (PET)

This repository contains the code for Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference and It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners. The papers introduce pattern-exploiting training (PET), a semi-supervised training procedure that reformulates input examples as cloze-style phrases. In low-resource settings, PET and iPET significantly outperform regular supervised training, various semi-supervised baselines and even GPT-3 despite requiring 99.9% fewer parameters. The iterative variant of PET (iPET) trains multiple generations of models and can even be used without any training data.

#Examples | Training Mode | Yelp (Full) | AG's News | Yahoo Questions | MNLI
0         | unsupervised  | 33.8        | 69.5      | 44.0            | 39.1
0         | iPET          | 56.7        | 87.5      | 70.7            | 53.6
100       | supervised    | 53.0        | 86.0      | 62.9            | 47.9
100       | PET           | 61.9        | 88.3      | 69.2            | 74.7
100       | iPET          | 62.9        | 89.6      | 71.2            | 78.4

Note: To exactly reproduce the above results, make sure to use v1.1.0 (--branch v1.1.0).

📑 Contents

🔧 Setup

💬 CLI Usage

💻 API Usage

🐶 Train your own PET

📕 Citation

🔧 Setup

All requirements for PET can be found in requirements.txt. You can install all required packages with pip install -r requirements.txt.
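
For example, a minimal setup could look like this (the repository URL is an assumption; any recent Python 3 environment should work):

git clone https://github.com/timoschick/pet.git
cd pet
pip install -r requirements.txt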

💬 CLI Usage

The command line interface cli.py in this repository currently supports three training modes (PET, iPET and supervised training), two additional evaluation methods (unsupervised and priming) and 13 different tasks. For Yelp Reviews, AG's News, Yahoo Questions, MNLI and X-Stance, see the original PET paper for further details. For the 8 SuperGLUE tasks, see the second paper (It's Not Just Size That Matters).

PET Training and Evaluation

To train and evaluate a PET model for one of the supported tasks, simply run the following command:

python3 cli.py \
--method pet \
--pattern_ids $PATTERN_IDS \
--data_dir $DATA_DIR \
--model_type $MODEL_TYPE \
--model_name_or_path $MODEL_NAME_OR_PATH \
--task_name $TASK \
--output_dir $OUTPUT_DIR \
--do_train \
--do_eval

where

  • $PATTERN_IDS specifies the PVPs to use. For example, to use all patterns, set PATTERN_IDS to 0 1 2 3 4 for AG's News and Yahoo Questions, or to 0 1 2 3 for Yelp Reviews and MNLI.
  • $DATA_DIR is the directory containing the train and test files (check tasks.py to see how these files should be named and formatted for each task).
  • $MODEL_TYPE is the name of the model being used, e.g. albert, bert or roberta.
  • $MODEL_NAME_OR_PATH is the name of a pretrained model (e.g., roberta-large or albert-xxlarge-v2) or the path to a pretrained model.
  • $TASK is the name of the task to train and evaluate on.
  • $OUTPUT_DIR is the name of the directory in which the trained model and evaluation results are saved.
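
For instance, a fully filled-in invocation for AG's News with roberta-large might look as follows (the data and output directories are illustrative placeholders):

python3 cli.py \
--method pet \
--pattern_ids 0 1 2 3 4 \
--data_dir ./data/agnews \
--model_type roberta \
--model_name_or_path roberta-large \
--task_name agnews \
--output_dir ./output/agnews-pet \
--do_train \
--do_eval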

You can additionally specify various training parameters for both the ensemble of PET models corresponding to individual PVPs (prefix --pet_) and for the final sequence classification model (prefix --sc_). For example, the default parameters used for our SuperGLUE evaluation are:

--pet_per_gpu_eval_batch_size 8 \
--pet_per_gpu_train_batch_size 2 \
--pet_gradient_accumulation_steps 8 \
--pet_max_steps 250 \
--pet_max_seq_length 256 \
--pet_repetitions 3 \
--sc_per_gpu_train_batch_size 2 \
--sc_per_gpu_unlabeled_batch_size 2 \
--sc_gradient_accumulation_steps 8 \
--sc_max_steps 5000 \
--sc_max_seq_length 256 \
--sc_repetitions 1

For each pattern $P and repetition $I, running the above command creates a directory $OUTPUT_DIR/p$P-i$I that contains the following files:

  • pytorch_model.bin: the finetuned model, possibly along with some model-specific files (e.g., spiece.model, special_tokens_map.json)
  • wrapper_config.json: the configuration of the model being used
  • train_config.json: the configuration used for training
  • eval_config.json: the configuration used for evaluation
  • logits.txt: the model's predictions on the unlabeled data
  • eval_logits.txt: the model's predictions on the evaluation data
  • results.json: a json file containing results such as the model's final accuracy
  • predictions.jsonl: a prediction file for the evaluation set in the SuperGLUE format

The final (distilled) model for each repetition $I can be found in $OUTPUT_DIR/final/p0-i$I, which contains the same files as described above.

🚨 If your GPU runs out of memory during training, you can try decreasing both --pet_per_gpu_train_batch_size and --sc_per_gpu_unlabeled_batch_size while increasing both --pet_gradient_accumulation_steps and --sc_gradient_accumulation_steps.
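
For example, relative to the SuperGLUE defaults above, the following illustrative values halve the two batch sizes and double the two accumulation steps, leaving the effective batch size unchanged:

--pet_per_gpu_train_batch_size 1 \
--pet_gradient_accumulation_steps 16 \
--sc_per_gpu_unlabeled_batch_size 1 \
--sc_gradient_accumulation_steps 16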

iPET Training and Evaluation

To train and evaluate an iPET model for one of the supported tasks, simply run the same command as above, but replace --method pet with --method ipet. There are various additional iPET parameters that you can modify; all of them are prefixed with --ipet_.

For each generation $G, pattern $P and iteration $I, this creates a directory $OUTPUT_DIR/g$G/p$P-i$I that is structured as for regular PET. The final (distilled) model can again be found in $OUTPUT_DIR/final/p0-i$I.

🚨 If you use iPET with zero training examples, you need to specify how many examples for each label should be chosen in the first generation and you need to change the reduction strategy to mean: --ipet_n_most_likely 100 --reduction mean.
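
A sketch of such a zero-shot iPET invocation, reusing the placeholder variables from the PET command above:

python3 cli.py \
--method ipet \
--pattern_ids $PATTERN_IDS \
--data_dir $DATA_DIR \
--model_type $MODEL_TYPE \
--model_name_or_path $MODEL_NAME_OR_PATH \
--task_name $TASK \
--output_dir $OUTPUT_DIR \
--do_train \
--do_eval \
--ipet_n_most_likely 100 \
--reduction mean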

Supervised Training and Evaluation

To train and evaluate a regular sequence classifier in a supervised fashion, simply run the same command as above, but replace --method pet with --method sequence_classifier. There are various additional parameters for the sequence classifier that you can modify; all of them are prefixed with --sc_.

Unsupervised Evaluation

To evaluate a pretrained language model with the default PET patterns and verbalizers, but without fine-tuning, remove the argument --do_train and add --no_distillation so that no final distillation is performed.

Priming

If you want to use priming, remove the argument --do_train and add the arguments --priming --no_distillation so that all training examples are used for priming and no final distillation is performed.

🚨 Remember that you may need to increase the maximum sequence length to a much larger value, e.g. --pet_max_seq_length 5000. This only works with language models that support such long sequences, e.g. XLNet. To use XLNet, specify --model_type xlnet --model_name_or_path xlnet-large-cased --wrapper_type plm.
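
Putting these pieces together, a priming run with XLNet might look like the following sketch (placeholders as above):

python3 cli.py \
--method pet \
--pattern_ids $PATTERN_IDS \
--data_dir $DATA_DIR \
--model_type xlnet \
--model_name_or_path xlnet-large-cased \
--wrapper_type plm \
--task_name $TASK \
--output_dir $OUTPUT_DIR \
--do_eval \
--priming \
--no_distillation \
--pet_max_seq_length 5000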

💻 API Usage

Instead of using the command line interface, you can also directly use the PET API, most of which is defined in pet.modeling. By including import pet, you can access methods such as train_pet, train_ipet and train_classifier. Check out their documentation for more information.
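
If you just want to discover these entry points interactively, a minimal sketch (assuming the package from this repository is importable) is:

import pet

# Inspect the documented entry points; their docstrings describe the
# expected configuration arguments.
help(pet.train_pet)
help(pet.train_ipet)
help(pet.train_classifier)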

🐶 Train your own PET

To use PET for custom tasks, you need to define two things:

  • a DataProcessor, responsible for loading training and test data. See examples/custom_task_processor.py for an example.
  • a PVP, responsible for applying patterns to inputs and mapping labels to natural language verbalizations. See examples/custom_task_pvp.py for an example.

After having implemented the DataProcessor and the PVP, you can train a PET model using the command line as described above. Below, you can find additional information on how to define the two components of a PVP, verbalizers and patterns.

Verbalizers

Verbalizers are used to map task labels to words in natural language. For example, in a binary sentiment classification task, you could map the positive label (+1) to the word good and the negative label (-1) to the word bad. Verbalizers are realized through a PVP's verbalize() method. The simplest way of defining a verbalizer is to use a dictionary:

VERBALIZER = {"+1": ["good"], "-1": ["bad"]}
    
def verbalize(self, label) -> List[str]:
    return self.VERBALIZER[label]       

Importantly, in PET's current version, verbalizers are by default restricted to single tokens in the underlying LM's vocabulary (for using more than one token, see below). Given a language model's tokenizer, you can easily check whether a word corresponds to a single token by verifying that len(tokenizer.tokenize(word)) == 1.
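
A quick way to run this check, assuming a Hugging Face tokenizer for the model you plan to use (roberta-large is only an example):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
for word in ["good", "bad"]:
    # A verbalization should correspond to exactly one token.
    print(word, len(tokenizer.tokenize(word)) == 1)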

You can also define multiple verbalizations for a single label. For example, if you are unsure which words best represent the labels in a binary sentiment classification task, you could define your verbalizer as follows:

VERBALIZER = {"+1": ["great", "good", "wonderful", "perfect"], "-1": ["bad", "terrible", "horrible"]}

Patterns

Patterns are used to make the language model understand a given task; they must contain exactly one <MASK> token which is to be filled using the verbalizer. For binary sentiment classification based on a review's summary (<A>) and body (<B>), a suitable pattern may be <A>. <B>. Overall, it was <MASK>. Patterns are realized through a PVP's get_parts() method, which returns a pair of text sequences (where each sequence is represented by a list of strings):

def get_parts(self, example: InputExample):
    return [example.text_a, '.', example.text_b, '.'], ['Overall, it was ', self.mask]

If you do not want to use a pair of sequences, you can simply leave the second sequence empty:

def get_parts(self, example: InputExample):
    return [example.text_a, '.', example.text_b, '. Overall, it was ', self.mask], []

If you want to define several patterns, simply use the PVP's pattern_id attribute:

def get_parts(self, example: InputExample):
    if self.pattern_id == 1:
        return [example.text_a, '.', example.text_b, '.'], ['Overall, it was ', self.mask]
    elif self.pattern_id == 2:
        return ['It was just ', self.mask, '!', example.text_a, '.', example.text_b, '.'], []

When training the model using the command line, specify all patterns to be used (e.g., --pattern_ids 1 2).

Importantly, if a sequence is longer than the specified maximum sequence length of the underlying LM, PET must know which parts of the input can be shortened and which ones cannot (for example, the mask token must always be there). Therefore, PVP provides a shortenable() method to indicate that a piece of text can be shortened:

def get_parts(self, example: InputExample):
    text_a = self.shortenable(example.text_a)
    text_b = self.shortenable(example.text_b)
    return [text_a, '.', text_b, '. Overall, it was ', self.mask], []

PET with Multiple Masks

By default, the current implementation of PET and iPET only supports a fixed set of labels that is shared across all examples and verbalizers that correspond to a single token. However, for some tasks it may be necessary to use verbalizers that correspond to multiple tokens (as described in It's Not Just Size That Matters). To do so, you only need the following two modifications:

  1. Add the following lines in your task's DataProcessor (see examples/custom_task_processor.py):

    from pet.tasks import TASK_HELPERS
    from pet.task_helpers import MultiMaskTaskHelper
    TASK_HELPERS['my_task'] = MultiMaskTaskHelper

    where 'my_task' is the name of your task.

  2. In your PVP, make sure that the get_parts() method always inserts the maximum number of mask tokens required for any verbalization. For example, if your verbalizer maps +1 to "really awesome" and -1 to "terrible" and if those are tokenized as ["really", "awe", "##some"] and ["terrible"], respectively, your get_parts() method should always return a sequence that contains exactly 3 mask tokens (see the sketch below).
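
A hypothetical sketch for this example, assuming the pattern can simply repeat self.mask to reserve the three positions (adapt this to however your PVP builds its parts):

def get_parts(self, example: InputExample):
    text_a = self.shortenable(example.text_a)
    # Reserve three mask positions: the longest verbalization
    # ("really awesome" -> ["really", "awe", "##some"]) needs three tokens.
    return [text_a, '. It was ', self.mask, self.mask, self.mask, '.'], []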

With this modification, you can now use verbalizers consisting of multiple tokens:

VERBALIZER = {"+1": ["really good"], "-1": ["just bad"]}

However, there are several limitations to consider:

  • When using a MultiMaskTaskHelper, the maximum batch size for evaluation is 1.
  • As using multiple masks requires multiple forward passes during evaluation, the time required for evaluation scales about linearly with the length of the longest verbalizer. If you require verbalizers that consist of 10 or more tokens, using a generative LM might be a better approach.
  • The MultiMaskTaskHelper class is an experimental feature that is not thoroughly tested. In particular, this feature has only been tested for PET and not for iPET. If you observe something strange, please raise an issue.

For more flexibility, you can also write a custom TaskHelper. As a starting point, you can check out the classes CopaTaskHelper, WscTaskHelper and RecordTaskHelper in pet/task_helpers.py.

📕 Citation

If you make use of the code in this repository, please cite the following papers:

@article{schick2020exploiting,
  title={Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference},
  author={Timo Schick and Hinrich Schütze},
  journal={Computing Research Repository},
  volume={arXiv:2001.07676},
  url={http://arxiv.org/abs/2001.07676},
  year={2020}
}

@article{schick2020small,
  title={It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners},
  author={Timo Schick and Hinrich Schütze},
  journal={Computing Research Repository},
  volume={arXiv:2009.07118},
  url={http://arxiv.org/abs/2009.07118},
  year={2020}
}
Comments
  • Training PET on a new task

    Hi. I want to train PET on a new task for which I prepared custom_task_processor.py and custom_task_pvp.py. My question is how should we run/tell the program to read our customized files (instead of the main files) and run the registered new task? It seems that just running the commands under the PET Training and Evaluation section does not do the task.

    opened by Mahhos 11
  • Code paper "Few-Shot Text Generation with Pattern-Exploiting Training"

    Dear @timoschick

    I have read your paper "Few-Shot Text Generation with Pattern-Exploiting Training" and found it really interesting. Could you share the code you used to conduct the experiments in the paper?

    opened by nguyentthong 10
  • Training PET on data which is too large to fit in RAM

    I am training a PET model on 500 GB of text. I have properly processed the data, but I can't load it all into a single variable since I don't have nearly enough RAM.

    opened by ghost 8
  • Replicating SuperGLUE benchmark

    Hello! I'm trying to replicate your work, and I'm currently comparing the performance of my replication to your implementation. Just to be sure, could you please provide me the exact commands you used for training the SuperGLUE tasks? I would be very grateful. Thank you!

    opened by Shamdan17 8
  • Example of usage

    Would you mind sharing exactly what to do to download the training data and do training and evaluation for one example? I'm having trouble figuring out exactly how things should be arranged and what to put for the command line parameters. Any example would do, as long as I can just follow the instructions.

    opened by summerstay 7
  • Roberta-large using BPE tokenizer generates multi tokens.

    RoBERTa-large uses byte-level Byte-Pair Encoding, which conflicts with the standard PET training setup.

    For example, the verbalization "Society" does not correspond to a single token; it is tokenized as ['Soc', 'iety'].

    For now, I have simply commented out the assertion len(ids) == 1 in utils.py to force using the first token.

    But I don't know whether it will affect the accuracy. So is there any alternative since PET uses Roberta-large by default?

    Thanks~

    opened by caidongqi 6
  • how to run sequence_classifier method with pre-trained logits

    Hi,

    During my PET training, I found it helpful to tune the final sequence classifier stage individually. What I plan to do is: 1) run the PET method as normal; 2) in a separate program, load the merged unlabeled_logits.txt and tune the sequence classifier parameters. I'd like to do step 2 on its own without rerunning the full PET pipeline.

    I hit an issue when I tried to do the above step 2. While the expected accuracy for my dataset is ~50%, I only got 1.5%. I used the same parameters for the sequence classifier as I did for PET.

    The following is what I did:

    1. change this cli.py line to use_logits=True;
    2. add an option in modeling.py train_classifier() so that it will load given logits file (unlabeled_logits.txt) like this line, then assign the logits to unlabeled data.

    My questions are: what am I missing? What's the proper way to run sequence classifier with pre-trained/merged logits file?

    Thanks.

    opened by LiweiPeng 6
  • How to choose unlabelled data

    Hi @timoschick, thanks very much for your work. I have a question about how you select the unlabelled data for each task.

    In the paper you say

    Similarly, we construct the set D of unlabeled examples by selecting 10000 examples per label and removing all labels

    Taking AG's News as an example, I assume this means you take 40,000 examples from the training set in total (it has 4 classes), with 10,000 examples for each class. However, in your code it seems that you do not enforce 10,000 examples per label; you just shuffle the data and pick the first 40,000 examples.

    I am a little bit confused about this; any clarification would be helpful.

    opened by Punchwes 6
  • Problem using personalized task

    Hello, @timoschick. First of all, I want to compliment you on the great work you did with PET. I think this is amazing and I can't wait to try it on some of my data problems. I am quite new to transformers, so I'm probably doing something terribly naive. First of all, I created my personalized task:

    class MyTaskDataProcessor(DataProcessor):
        """ Example for a data processor. """

    # Set this to the name of the task
    TASK_NAME = "illlegal-detection"
    
    # Set this to the name of the file containing the train examples
    TRAIN_FILE_NAME = "train.csv"
    
    # Set this to the name of the file containing the dev examples
    DEV_FILE_NAME = "dev.csv"
    
    # Set this to the name of the file containing the test examples
    TEST_FILE_NAME = "test.csv"
    
    # Set this to the name of the file containing the unlabeled examples
    UNLABELED_FILE_NAME = "unlabeled.csv"
    
    # Set this to a list of all labels in the train + test data
    LABELS = ["0", "1"]
    
    # Set this to the column of the train/test csv files containing the input's text a
    TEXT_A_COLUMN = 0
    
    # Set this to the column of the train/test csv files containing the input's text b or to -1 if there is no text b
    TEXT_B_COLUMN = -1
    
    # Set this to the column of the train/test csv files containing the input's gold label
    LABEL_COLUMN = 1
    
    def get_train_examples(self, data_dir: str) -> List[InputExample]:
        """
        This method loads train examples from a file with name `TRAIN_FILE_NAME` in the given directory.
        :param data_dir: the directory in which the training data can be found
        :return: a list of train examples
        """
        return self._create_examples(os.path.join(data_dir, MyTaskDataProcessor.TRAIN_FILE_NAME), "train")
    
    def get_dev_examples(self, data_dir: str) -> List[InputExample]:
        """
        This method loads dev examples from a file with name `DEV_FILE_NAME` in the given directory.
        :param data_dir: the directory in which the dev data can be found
        :return: a list of dev examples
        """
        return self._create_examples(os.path.join(data_dir, MyTaskDataProcessor.DEV_FILE_NAME), "dev")
    
    def get_test_examples(self, data_dir) -> List[InputExample]:
        """
        This method loads test examples from a file with name `TEST_FILE_NAME` in the given directory.
        :param data_dir: the directory in which the test data can be found
        :return: a list of test examples
        """
        return self._create_examples(os.path.join(data_dir, MyTaskDataProcessor.TEST_FILE_NAME), "test")
    
    def get_unlabeled_examples(self, data_dir) -> List[InputExample]:
        """
        This method loads unlabeled examples from a file with name `UNLABELED_FILE_NAME` in the given directory.
        :param data_dir: the directory in which the unlabeled data can be found
        :return: a list of unlabeled examples
        """
        return self._create_examples(os.path.join(data_dir, MyTaskDataProcessor.UNLABELED_FILE_NAME), "unlabeled")
    
    def get_labels(self) -> List[str]:
        """This method returns all possible labels for the task."""
        return MyTaskDataProcessor.LABELS
    
    def _create_examples(self, path, set_type, max_examples=-1, skip_first=0):
        """Creates examples for the training and dev sets."""
        examples = []
    
        with open(path) as f:
            reader = csv.reader(f, delimiter=',')
            for idx, row in enumerate(reader):
                guid = "%s-%s" % (set_type, idx)
                label = row[MyTaskDataProcessor.LABEL_COLUMN]
                text_a = row[MyTaskDataProcessor.TEXT_A_COLUMN]
                text_b = row[MyTaskDataProcessor.TEXT_B_COLUMN] if MyTaskDataProcessor.TEXT_B_COLUMN >= 0 else None
                example = InputExample(guid=guid, text_a=text_a, text_b=text_b, label=label)
                examples.append(example)
    
        return examples
    

    PROCESSORS[MyTaskDataProcessor.TASK_NAME] = MyTaskDataProcessor

    Ok, now I want to use it to solve my problem, so I'm running this command line: cmd = """python cli.py --method pet --data_dir .../comments_class/code/pet-master/pet --model_type roberta --model_name_or_path roberta --task_name --output_dir ...comments_class/data/output -- pattern_ids 0 1 --do_train --do_eval """. The problem is: how do I tell cli.py about the new task I've created? Sorry if I'm being too naive. I think this is an easy one and maybe it will be useful for future noobs too. Thanks again for your work!

    opened by JohnPFL 6
  • Annotating an unlabeled set

    Hi. Thanks for the great repo. I have got a question regarding the PET training and annotating an unlabeled set (as mentioned in the paper examples from D). I assume that it would be done using the command in the PET Training and Evaluation section in the repo. However, I am not sure where to put the unlabeled set and where to get the predicted labels? Would you please let me know how we should get the predicted labels for the unlabeled set? Thank you.

    opened by Mahhos 6
  •  train ipet with zero training examples

    Hi, I am training iPET with zero training examples and I ran the following command:

    python3 cli.py --method ipet --pattern_ids 0 1 2 3 4 --data_dir /share/home/zqzeng/wmni/data/ag_news_csv/ag_news_csv --model_type roberta --model_name_or_path /share/home/zqzeng/transformers/roberta-large --task_name agnews --output_dir /share/home/zqzeng/wmni/data/output/unsupervised-ipet --do_train --do_eval --pet_repetitions 1 --ipet_n_most_likely 100 --reduction mean --train_examples 0

    I got the following result:

    2021-11-09 20:22:31,904 - INFO - tasks - Creating features from dataset file at ag_news_csv/ (num_examples=0, set_type=train)
    2021-11-09 20:22:34,978 - INFO - tasks - Returning 120000 train examples with label dist.: [('3', 30000), ('4', 30000), ('2', 30000), ('1', 30000)]

    I followed the flow of the program and found that all 120,000 train examples were used to train each individual model. When I used --train_examples 10, it behaved normally, as shown below:

    2021-11-09 20:19:13,402 - INFO - tasks - Creating features from dataset file at ag_news_csv/ (num_examples=10, set_type=train)
    2021-11-09 20:19:16,127 - INFO - tasks - Returning 10 train examples with label dist.: [('1', 3), ('4', 4), ('2', 2), ('3', 1)]

    Does training with zero examples not work? I would be grateful for a prompt reply.

    opened by EneruMin 5
  • Clarification on how to interpret PET's results

    When running PET on my task, I obtain two "results_text" files with different metric values: one is in the main directory, while the other is in the "final" directory. I assumed that the one in the main directory is averaged over all epochs, while the one in "final" refers to the last epoch. Can you please confirm this or explain how to interpret the two files?

    opened by aliromagnoli 0
  • How to reproduce results of the paper?

    I read the paper and downloaded the AG's News dataset, then tested the PET model on it, but there is a large gap between my results and the authors' results. I set the parameters as below. Specifically, I used just 10 examples for training (10-shot), the model type is roberta, model_name_or_path is roberta-large, and I used all patterns for AG's News. I did not change any other parameters. Here are my results:

    ==============my results===============
    acc-p0: 0.6450877192982456 +- 0.053859095516898825
    acc-p1: 0.7874561403508772 +- 0.01841603941808791
    acc-p2: 0.5642543859649123 +- 0.06912621607498706
    acc-p3: 0.6119298245614034 +- 0.09528808314997761
    acc-p4: 0.7537719298245614 +- 0.07473549078343446
    acc-all-p: 0.6725 +- 0.10462149351553651

    ===============parameters setting===========
    parser.add_argument("--train_examples", default=10, type=int, help="The total number of train examples to use, where -1 equals all examples.")
    parser.add_argument("--method", required=False, default='pet', choices=['pet', 'ipet', 'sequence_classifier'], help="The training method to use. Either regular sequence classification, PET or iPET.")
    parser.add_argument("--data_dir", default="./agnews/", type=str, required=False, help="The input data dir. Should contain the data files for the task.")
    parser.add_argument("--model_type", default="roberta", type=str, required=False, choices=MODEL_CLASSES.keys(), help="The type of the pretrained language model to use")
    parser.add_argument("--model_name_or_path", default="roberta-large", type=str, required=False, help="Path to the pre-trained model or shortcut name")
    parser.add_argument("--task_name", default="agnews", type=str, required=False, choices=PROCESSORS.keys(), help="The name of the task to train/evaluate on")
    parser.add_argument("--output_dir", default="./output/", type=str, required=False, help="The output directory where the model predictions and checkpoints will be written")

    # PET-specific optional parameters
    parser.add_argument("--wrapper_type", default="mlm", choices=WRAPPER_TYPES,
                        help="The wrapper type. Set this to 'mlm' for a masked language model like BERT or to 'plm' "
                             "for a permuted language model like XLNet (only for PET)")
    parser.add_argument("--pattern_ids", default=[0,1,2,3,4], type=int, nargs='+',
                        help="The ids of the PVPs to be used (only for PET)")
    
    opened by Devil-Ideal 1
  • Random seed parameter for iterations

    Hi @timoschick ,

    thank you very much for publishing your amazing PET tool in this repo. I am currently trying to investigate how the random seed works across the different iterations per pattern.

    If I understand correctly, PET trains and evaluates n=repetitions independent models for each pattern (stored in the folders final/p0-iY). This is done to later calculate the standard deviation of the results.

    You are describing that in your paper, too: "each model is trained three times using different seeds and average results are reported" https://arxiv.org/pdf/2001.07676v3.pdf

    This would happen in pet/modeling.py in lines 326 and 327:

        set_seed(seed)
    
        for pattern_id in pattern_ids:
            for iteration in range(repetitions):
    

    Do I interpret this correctly? I am currently struggling to understand how the different seeds for the model initialization per iteration are handled. We are setting a single seed in line 324. Are the models initialized with different weights per iteration? Are there other seeds set?

    Thank you so much in advance for your help!

    opened by MaviccPRP 0
  • Bump numpy from 1.19 to 1.22.0

    Bumps numpy from 1.19 to 1.22.0.

    dependencies 
    opened by dependabot[bot] 0