Few-shot NLP benchmark for unified, rigorous eval

AI2

Last update: Dec 3, 2022

Overview

FLEX

FLEX is a benchmark and framework for unified, rigorous few-shot NLP evaluation. FLEX enables:

First-class NLP support
Support for meta-training
Reproducible few-shot evaluations
Extensible benchmark creation (benchmarks defined using HuggingFace Datasets)
Advanced sampling functions for creating episodes with class imbalance, etc.

For more context, see our arXiv preprint.

Together with FLEX, we also released a simple yet strong few-shot model called UniFew. For more details, see our preprint.

Leaderboards

These instructions are geared towards users of the first benchmark created with this framework. The benchmark has two leaderboards, for the Pretraining-Only and Meta-Trained protocols described in Section 4.2 of our paper:

FLEX (Pretraining-Only): for models that do not use meta-training data related to the test tasks (do not follow the Model Training section below).
FLEX-META (Meta-Trained): for models that use only the provided meta-training and meta-validation data (please do see the Model Training section below).

Installation

Clone the repository: git clone git@github.com:allenai/flex.git
Create a Python 3 environment (3.7 or greater), e.g., using conda create --name flex python=3.9
Activate the environment: conda activate flex
Install the package locally with pip install -e .

Data Preparation

Creating the data for the flex challenge for the first time takes about 10 minutes (on a recent MacBook Pro with a broadband connection) and requires 3GB of disk space. Start the process by running:

python -c "import fewshot; fewshot.make_challenge('flex');"

You can control the location of the cached data by setting the environment variable HF_DATASETS_CACHE. If you have not set this variable, the location should default to ~/.cache/huggingface/datasets/. See the HuggingFace docs for more details.
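As a rough illustration of the rule above, the following sketch resolves the directory where the prepared data would be expected. The helper name resolve_datasets_cache is hypothetical and is not part of fewshot or HuggingFace Datasets; it only models the behavior described in this section (the variable wins if set, otherwise the default path), and the library's own resolution may consider other settings as well.

```python
import os
from pathlib import Path


def resolve_datasets_cache() -> Path:
    """Return the directory where prepared challenge data is expected to be cached.

    Hypothetical helper: it models only the rule described in this README.
    """
    # HF_DATASETS_CACHE takes precedence when it is set and non-empty.
    override = os.environ.get("HF_DATASETS_CACHE")
    if override:
        return Path(override).expanduser()
    # Otherwise fall back to the default location named in this section.
    return Path.home() / ".cache" / "huggingface" / "datasets"


if __name__ == "__main__":
    print(resolve_datasets_cache())
```

If you set the variable from Python rather than from the shell, it is safest to set it before importing fewshot, so that the value is already in the environment when the data libraries are loaded.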

Model Evaluation

"Challenges" are datasets of sampled tasks for evaluation. They are defined in fewshot/challenges/__init__.py.

To evaluate a model on challenge flex (our first challenge), you should write a program that produces a predictions.json, for example:

#!/usr/bin/env python3
import random
from typing import Iterable, Dict, Any, Sequence
import fewshot


class YourModel(fewshot.Model):
    def fit_and_predict(
        self,
        support_x: Iterable[Dict[str, Any]],
        support_y: Iterable[str],
        target_x: Iterable[Dict[str, Any]],
        metadata: Dict[str, Any]
    ) -> Sequence[str]:
        """Return random label predictions for a fewshot task."""
        train_x = [d['txt'] for d in support_x]
        train_y = support_y
        test_x = [d['txt'] for d in target_x]
        test_y = [random.choice(metadata['labels']) for _ in test_x]
        # >>> print(test_y)
        # ['some', 'list', 'of', 'label', 'predictions']
        return test_y


if __name__ == '__main__':
    evaluator = fewshot.make_challenge("flex")
    model = YourModel()
    evaluator.save_model_predictions(model=model, save_path='/path/to/predictions.json')

Warning: Calling fewshot.make_challenge("flex") above takes some time on the first run, because it prepares all the necessary data (see the "Data Preparation" section).

Running the above script produces /path/to/predictions.json with contents formatted as:

{
    "[QUESTION_ID]": {
        "label": "[CLASS_LABEL]",  # Currently an integer converted to a string
        "score": float  # Only used for ranking tasks
    },
    ...
}

Each [QUESTION_ID] is an ID for a test example in a few-shot problem.
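Before running the full validation command, a quick local sanity check of the file's shape can catch obvious mistakes. The sketch below is not the fewshot validate implementation: the function name check_predictions is hypothetical, and treating score as optional is an assumption based on the note that it is only used for ranking tasks.

```python
import json
from typing import Any, Dict, List


def check_predictions(predictions: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a predictions mapping."""
    problems = []
    for question_id, entry in predictions.items():
        if not isinstance(entry, dict):
            problems.append(f"{question_id}: entry is not an object")
            continue
        # Labels are strings (currently integers converted to strings).
        if not isinstance(entry.get("label"), str):
            problems.append(f"{question_id}: 'label' must be a string")
        # Assumption: 'score' may be omitted, since only ranking tasks use it.
        score = entry.get("score")
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if score is not None and not is_number:
            problems.append(f"{question_id}: 'score' must be a number")
    return problems


if __name__ == "__main__":
    sample = json.loads('{"q1": {"label": "0", "score": 0.5}, "q2": {"label": 1}}')
    for problem in check_predictions(sample):
        print(problem)
```

This check only looks at structure. It does not know which question IDs the challenge expects, so it complements rather than replaces the validation step described later.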

[Optional] Parallelizing Evaluation

Two options are available for parallelizing evaluation.

First, one can restrict evaluation to a subset of tasks with indices from [START] to [STOP] (exclusive) via

evaluator.save_model_predictions(model=model, start_task_index=[START], stop_task_index=[STOP])

Notes:

You may use stop_task_index=None (or omit it) to avoid specifying an end.
You can find the total number of tasks in the challenge with fewshot.get_challenge_spec([CHALLENGE]).num_tasks.
To merge partial evaluation outputs into a complete predictions.json file, use fewshot merge partial1.json partial2.json ... predictions.json.
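To spread the tasks over several workers, you need one [START], [STOP) pair per worker. The sketch below shows one way to compute contiguous, non-overlapping ranges; shard_task_ranges is a hypothetical helper, not part of fewshot, and the task count in the example is made up for illustration.

```python
from typing import List, Tuple


def shard_task_ranges(num_tasks: int, num_workers: int) -> List[Tuple[int, int]]:
    """Split range(num_tasks) into contiguous [start, stop) ranges, one per worker."""
    if num_tasks < 0 or num_workers < 1:
        raise ValueError("num_tasks must be >= 0 and num_workers must be >= 1")
    base, extra = divmod(num_tasks, num_workers)
    ranges = []
    start = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional task each.
        stop = start + base + (1 if worker < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


if __name__ == "__main__":
    # Illustrative numbers only; use the challenge's num_tasks in practice.
    for start, stop in shard_task_ranges(num_tasks=10, num_workers=3):
        print(start, stop)
```

Each pair can then be passed as start_task_index and stop_task_index in a separate process, with each process writing its own partial file for the merge step.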

The second option will call your model's .fit_and_predict() method with batches of [BATCH_SIZE] tasks, via

evaluator.save_model_predictions(model=model, batched=True, batch_size=[BATCH_SIZE])

Result Validation and Scoring

To validate the contents of your predictions, run:

fewshot validate --challenge_name flex --predictions /path/to/predictions.json

This validates all the inputs and takes some time. To validate predictions for a different challenge, substitute its name for flex.

(There is also a score CLI command, which should not be used on the final challenge except when reporting final results.)

Model Training

For the meta-training protocol (e.g., the FLEX-META leaderboard), challenges come with a set of related training and validation data. This data is most easily accessible in one of two formats:

Iterable from sampled episodes. fewshot.get_challenge_spec('flex').get_sampler(split='[SPLIT]') returns an iterable that samples datasets and episodes from meta-training or meta-validation datasets, via [SPLIT]='train' or [SPLIT]='val', respectively. The sampler defaults to the fewshot.samplers.Sample2WayMax8ShotCfg sampler configuration (for the fewshot.samplers.sample.Sampler class), but can be reconfigured.
Raw dataset stores. This option is for directly accessing the raw data. fewshot.get_challenge_spec('flex').get_stores(split='[SPLIT]') returns a mapping from dataset names to fewshot.datasets.store.Store instances. Each Store instance has a Store.store attribute containing a raw HuggingFace Dataset instance and a Store.label attribute holding the Dataset key for the target label (e.g., accessed via Store.store[Store.label]). The FLEX-formatted text is available at the flex.txt key (e.g., via Store.store['flex.txt']).
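To make the idea of a sampled episode concrete, here is a toy sampler over plain Python lists. It is not the fewshot.samplers implementation: the function sample_episode and its parameters are hypothetical, and reading "2-way, max 8-shot" as "two classes with at most eight support examples each" is an assumption drawn from the configuration name only.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def sample_episode(
    texts: Sequence[str],
    labels: Sequence[str],
    rng: random.Random,
    n_way: int = 2,
    max_shots: int = 8,
) -> Tuple[List[str], List[str]]:
    """Sample a toy support set: n_way classes with 1..max_shots examples each."""
    by_label: Dict[str, List[str]] = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    if len(by_label) < n_way:
        raise ValueError("not enough distinct labels for the requested n_way")
    support_x: List[str] = []
    support_y: List[str] = []
    for label in rng.sample(sorted(by_label), n_way):
        pool = by_label[label]
        # Draw a per-class shot count, capped by max_shots and the pool size.
        shots = rng.randint(1, min(max_shots, len(pool)))
        for text in rng.sample(pool, shots):
            support_x.append(text)
            support_y.append(label)
    return support_x, support_y


if __name__ == "__main__":
    toy_texts = [f"example {i}" for i in range(30)]
    toy_labels = [str(i % 3) for i in range(30)]
    xs, ys = sample_episode(toy_texts, toy_labels, random.Random(0))
    print(len(xs), sorted(set(ys)))
```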

Two examples of these respective approaches are available at:

The UniFew model repository. For more details on UniFew, see also the FLEX arXiv paper.
The baselines/bao/ directory, for training and evaluating the approach described in the following paper:

Yujia Bao*, Menghua Wu*, Shiyu Chang, and Regina Barzilay. Few-shot Text Classification with Distributional Signatures. In International Conference on Learning Representations, 2020.

Benchmark Construction and Optimization

To add a new benchmark (challenge) named [NEW_CHALLENGE], you must edit fewshot/challenges/__init__.py or otherwise add it to the registry. In the usage instructions above, substitute [NEW_CHALLENGE] for flex when calling fewshot.get_challenge_spec('[NEW_CHALLENGE]') and fewshot.make_challenge('[NEW_CHALLENGE]').

For an example of how to optimize the sample size of the challenge, see scripts/README-sample-size.md.

Attribution

If you make use of our framework, benchmark, or model, please cite our preprint:

@misc{bragg2021flex,
      title={FLEX: Unifying Evaluation for Few-Shot NLP},
      author={Jonathan Bragg and Arman Cohan and Kyle Lo and Iz Beltagy},
      year={2021},
      eprint={2107.07170},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Comments

Error downloading News-Category-Dataset-v2.json

Error: datasets/builder.py fails to reach https://www.researchgate.net/profile/Rishabh-Misra/publication/332141218_News_Category_Dataset/data/5ca2da43a6fdccab2f67c89b/News-Category-Dataset-v2.json, causing dataset preparation to fail.

Details:
OS: macOS Monterey 12.0
Env: Anaconda (conda 4.10.1)
Python: 3.9.12
Error message:

HF_DATASETS_CACHE=../data/flex python -c "import fewshot; fewshot.make_challenge('flex');"

Using custom data configuration flex-ac8d318a269483f2
Downloading and preparing dataset flex_challenge/flex (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to ../data/flex/flex_challenge/flex-ac8d318a269483f2/0.0.1/e5706b643506c30eaea9d75cf6d7cccccd7b2a87583e02892efab5b10291f493...
0 examples [00:00, ? examples/s]Downloading and preparing dataset flex/newsgroupbao (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to ../data/flex/flex/newsgroupbao/0.0.1/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5...
Downloading: 5.58kB [00:00, 2.19MB/s]
Downloading: 88.5kB [00:00, 57.8MB/s]
Downloading and preparing dataset newsgroups/18828_sci.space (download: 13.99 MiB, generated: 1.73 MiB, post-processed: Unknown size, total: 15.72 MiB) to ../data/flex/newsgroups/18828_sci.space/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...                    | 0.00/2.13k [00:00<?, ?B/s]
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14.7M/14.7M [00:06<00:00, 2.23MB/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_sci.space/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
Downloading: 100%|██████████████████Downloading and preparing dataset newsgroups/18828_sci.crypt (download: 13.99 MiB, generated: 1.96 MiB, post-processed: Unknown size, total: 15.94 MiB) to ../data/flex/newsgroups/18828_sci.crypt/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...0, 1.56MB/s]
1 examples [00:10, 10.83s/ exampDataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_sci.crypt/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                      Downloading and preparing dataset newsgroups/18828_rec.autos (download: 13.99 MiB, generated: 1.24 MiB, post-processed: Unknown size, total: 15.22 MiB) to ../data/flex/newsgroups/18828_rec.autos/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...
988 examples [00:11,  7.58s/ exaDataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_rec.autos/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_sci.med (download: 13.99 MiB, generated: 1.80 MiB, post-processed: Unknown size, total: 15.79 MiB) to ../data/flex/newsgroups/18828_sci.med/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...
1979 examples [00:12,  5.31s/ exDataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_sci.med/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_sci.electronics (download: 13.99 MiB, generated: 1.18 MiB, post-processed: Unknown size, total: 15.17 MiB) to ../data/flex/newsgroups/18828_sci.electronics/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...9 examples [00:14,  3.72s/ examples]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_sci.electronics/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_rec.sport.hockey (download: 13.99 MiB, generated: 1.68 MiB, post-processed: Unknown size, total: 15.66 MiB) to ../data/flex/newsgroups/18828_rec.sport.hockey/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...examples [00:15,  2.60s/ examples]
                                        Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_rec.sport.hockey/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_rec.motorcycles (download: 13.99 MiB, generated: 1.15 MiB, post-processed: Unknown size, total: 15.14 MiB) to ../data/flex/newsgroups/18828_rec.motorcycles/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...0 examples [00:16,  1.82s/ examples]
                                        Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_rec.motorcycles/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_rec.sport.baseball (download: 13.99 MiB, generated: 1.31 MiB, post-processed: Unknown size, total: 15.29 MiB) to ../data/flex/newsgroups/18828_rec.sport.baseball/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...ples [00:17,  1.28s/ examples]
                                        Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_rec.sport.baseball/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                Downloading and preparing dataset newsgroups/18828_comp.graphics (download: 13.99 MiB, generated: 1.58 MiB, post-processed: Unknown size, total: 15.57 MiB) to ../data/flex/newsgroups/18828_comp.graphics/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...
0 examples [00:00, ? examples/s]Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_comp.graphics/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                    Downloading and preparing dataset newsgroups/18828_comp.windows.x (download: 13.99 MiB, generated: 1.79 MiB, post-processed: Unknown size, total: 15.78 MiB) to ../data/flex/newsgroups/18828_comp.windows.x/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...
1 examples [00:00,  1.02 exampleDataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_comp.windows.x/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                      Downloading and preparing dataset newsgroups/18828_comp.os.ms-windows.misc (download: 13.99 MiB, generated: 2.27 MiB, post-processed: Unknown size, total: 16.26 MiB) to ../data/flex/newsgroups/18828_comp.os.ms-windows.misc/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...2,  1.46 examples/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_comp.os.ms-windows.misc/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_comp.sys.mac.hardware (download: 13.99 MiB, generated: 1.01 MiB, post-processed: Unknown size, total: 15.00 MiB) to ../data/flex/newsgroups/18828_comp.sys.mac.hardware/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...00:03,  2.08 examples/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_comp.sys.mac.hardware/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_comp.sys.ibm.pc.hardware (download: 13.99 MiB, generated: 1.13 MiB, post-processed: Unknown size, total: 15.12 MiB) to ../data/flex/newsgroups/18828_comp.sys.ibm.pc.hardware/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...  2.97 examples/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_comp.sys.ibm.pc.hardware/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                Downloading and preparing dataset newsgroups/18828_talk.politics.mideast (download: 13.99 MiB, generated: 2.78 MiB, post-processed: Unknown size, total: 16.76 MiB) to ../data/flex/newsgroups/18828_talk.politics.mideast/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...es [00:00, ? examples/s]
                                        Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_talk.politics.mideast/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                    Downloading and preparing dataset newsgroups/18828_misc.forsale (download: 13.99 MiB, generated: 903.68 KiB, post-processed: Unknown size, total: 14.87 MiB) to ../data/flex/newsgroups/18828_misc.forsale/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...
1 examples [00:01,  1.46s/ exampDataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_misc.forsale/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                      Downloading and preparing dataset newsgroups/18828_talk.politics.misc (download: 13.99 MiB, generated: 2.01 MiB, post-processed: Unknown size, total: 15.99 MiB) to ../data/flex/newsgroups/18828_talk.politics.misc/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...ples [00:02,  1.02s/ examples]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_talk.politics.misc/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_talk.politics.guns (download: 13.99 MiB, generated: 1.83 MiB, post-processed: Unknown size, total: 15.82 MiB) to ../data/flex/newsgroups/18828_talk.politics.guns/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...ples [00:03,  1.40 examples/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_talk.politics.guns/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_talk.religion.misc (download: 13.99 MiB, generated: 1.31 MiB, post-processed: Unknown size, total: 15.30 MiB) to ../data/flex/newsgroups/18828_talk.religion.misc/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...ples [00:04,  2.00 examples/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_talk.religion.misc/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_alt.atheism (download: 13.99 MiB, generated: 1.59 MiB, post-processed: Unknown size, total: 15.58 MiB) to ../data/flex/newsgroups/18828_alt.atheism/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...
3598 examples [00:05,  2.85 examDataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_alt.atheism/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Downloading and preparing dataset newsgroups/18828_soc.religion.christian (download: 13.99 MiB, generated: 2.20 MiB, post-processed: Unknown size, total: 16.19 MiB) to ../data/flex/newsgroups/18828_soc.religion.christian/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1...:06,  4.06 examples/s]
                                Dataset newsgroups downloaded and prepared to ../data/flex/newsgroups/18828_soc.religion.christian/3.0.0/e6e5083c29aede4dcb47b3eb525f4cb2b34be7c1d24579e8d3c7921c275d04f1. Subsequent calls will reuse this data.
                                       Dataset flex downloaded and prepared to ../data/flex/flex/newsgroupbao/0.0.1/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6021/6021 [00:00<00:00, 34493.84ex/s]
Downloading and preparing dataset flex/reutersbao (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to ../data/flex/flex/reutersbao/0.0.1/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5...                     | 3481/6021 [00:00<00:00, 34808.62ex/s]
Downloading: 16.7kB [00:00, 3.04MB/s]
Downloading: 19.9kB [00:00, 7.84MB/s]
Downloading and preparing dataset reuters21578/ModApte (download: 7.77 MiB, generated: 12.48 MiB, post-processed: Unknown size, total: 20.25 MiB) to ../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072...                                | 0.00/4.18k [00:00<?, ?B/s]
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8.15M/8.15M [00:03<00:00, 2.14MB/s]
                                      Dataset reuters21578 downloaded and prepared to ../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072. Subsequent calls will reuse this data.
Downloading: 100%|█████████████████████Reusing dataset reuters21578 (../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072)████████████████████████████████████████████████████████████████████████████████████████████████████████▋| 8.14M/8.15M [00:03<00:00, 1.96MB/s]
5052 examples [00:08,  1.49 examReusing dataset reuters21578 (../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072)
0 examples [00:00, ? examples/s]      Reusing dataset reuters21578 (../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072)
352 examples [00:01, 11.35 exampReusing dataset reuters21578 (../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072)
0 examples [00:00, ? examples/s]      Reusing dataset reuters21578 (../data/flex/reuters21578/ModApte/1.0.0/db07a538280c8bed1b46d585df036b84c9293e3fe69a423355c77929cd4c8072)
568 examples [00:01, 10.69 examples/s]Dataset flex downloaded and prepared to ../data/flex/flex/reutersbao/0.0.1/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5. Subsequent calls will reuse this data.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 835/835 [00:00<00:00, 32325.82ex/s]
Using custom data configuration huffpostbao-814d632a0c092409                                                                                                                                                                                                                                        | 0/835 [00:00<?, ?ex/s]
Downloading and preparing dataset flex/huffpostbao (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to ../data/flex/flex/huffpostbao-814d632a0c092409/0.0.1/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5...
                                Using custom data configuration default
Downloading and preparing dataset huff_post/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to ../data/flex/huff_post/default/0.0.2/4d0d3813bfdbad5ed9ec7da463465833b8f1692cb723a71a1a203433f571b3da...
Traceback (most recent call last):
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 652, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 986, in _prepare_split
    for key, record in utils.tqdm(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/tqdm/std.py", line 1133, in __iter__
    for obj in iterable:
  File "/Users/hyperbolicjb/.cache/huggingface/modules/datasets_modules/datasets/flex/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5/flex.py", line 171, in _generate_examples
    dataset = datasets.load_dataset(**load_dataset_kwargs)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 742, in load_dataset
    builder_instance.download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 574, in download_and_prepare
    self._download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 630, in _download_and_prepare
    split_generators = self._split_generators(dl_manager, **split_generators_kwargs)
  File "/Users/hyperbolicjb/.cache/huggingface/modules/datasets_modules/datasets/huffpost/4d0d3813bfdbad5ed9ec7da463465833b8f1692cb723a71a1a203433f571b3da/huffpost.py", line 26, in _split_generators
    path = dl_manager.download(_URL)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/utils/download_manager.py", line 195, in download
    downloaded_path_or_paths = map_nested(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/utils/py_utils.py", line 195, in map_nested
    return function(data_struct)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/utils/download_manager.py", line 218, in _download
    return cached_path(url_or_filename, download_config=download_config)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/utils/file_utils.py", line 281, in cached_path
    output_path = get_from_cache(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/utils/file_utils.py", line 623, in get_from_cache
    raise ConnectionError("Couldn't reach {}".format(url))
ConnectionError: Couldn't reach https://www.researchgate.net/profile/Rishabh-Misra/publication/332141218_News_Category_Dataset/data/5ca2da43a6fdccab2f67c89b/News-Category-Dataset-v2.json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 62, in _call_target
    return target(*args, **kwargs)
  File "/Users/hyperbolicjb/Projects/flex/fewshot/datasets/store.py", line 73, in __init__
    self.store = load_dataset(**hf_load_kwargs)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 742, in load_dataset
    builder_instance.download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 574, in download_and_prepare
    self._download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 654, in _download_and_prepare
    raise OSError(
OSError: Cannot find data file.
Original error:
Couldn't reach https://www.researchgate.net/profile/Rishabh-Misra/publication/332141218_News_Category_Dataset/data/5ca2da43a6fdccab2f67c89b/News-Category-Dataset-v2.json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 652, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 986, in _prepare_split
    for key, record in utils.tqdm(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/tqdm/std.py", line 1133, in __iter__
    for obj in iterable:
  File "/Users/hyperbolicjb/.cache/huggingface/modules/datasets_modules/datasets/challenge/e5706b643506c30eaea9d75cf6d7cccccd7b2a87583e02892efab5b10291f493/challenge.py", line 109, in _generate_examples
    sampler = instantiate(challenge.metadatasampler)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 180, in instantiate
    return instantiate_node(config, *args, recursive=_recursive_, convert=_convert_)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 245, in instantiate_node
    value = instantiate_node(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 223, in instantiate_node
    items = [
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 224, in <listcomp>
    instantiate_node(item, convert=convert, recursive=recursive)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 245, in instantiate_node
    value = instantiate_node(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 249, in instantiate_node
    return _call_target(target, *args, **kwargs)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 64, in _call_target
    raise type(e)(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 62, in _call_target
    return target(*args, **kwargs)
  File "/Users/hyperbolicjb/Projects/flex/fewshot/datasets/store.py", line 73, in __init__
    self.store = load_dataset(**hf_load_kwargs)
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 742, in load_dataset
    builder_instance.download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 574, in download_and_prepare
    self._download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 654, in _download_and_prepare
    raise OSError(
OSError: Error instantiating 'fewshot.datasets.store.Store' : Cannot find data file.
Original error:
Couldn't reach https://www.researchgate.net/profile/Rishabh-Misra/publication/332141218_News_Category_Dataset/data/5ca2da43a6fdccab2f67c89b/News-Category-Dataset-v2.json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/hyperbolicjb/Projects/flex/fewshot/challenges/registration.py", line 13, in make
    return registry.make(id, **evaluator_kwargs)
  File "/Users/hyperbolicjb/Projects/flex/fewshot/challenges/registration.py", line 108, in make
    return self.get_spec(id).make(**evaluator_kwargs)
  File "/Users/hyperbolicjb/Projects/flex/fewshot/challenges/registration.py", line 29, in make
    return Evaluator(config_name=self.id, hash=self.hash, **evaluator_kwargs)
  File "/Users/hyperbolicjb/Projects/flex/fewshot/challenges/eval.py", line 89, in __init__
    self.dataset = load_dataset(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 742, in load_dataset
    builder_instance.download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 574, in download_and_prepare
    self._download_and_prepare(
  File "/Users/hyperbolicjb/opt/anaconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 654, in _download_and_prepare
    raise OSError(
OSError: Cannot find data file.
Original error:
Error instantiating 'fewshot.datasets.store.Store' : Cannot find data file.
Original error:
Couldn't reach https://www.researchgate.net/profile/Rishabh-Misra/publication/332141218_News_Category_Dataset/data/5ca2da43a6fdccab2f67c89b/News-Category-Dataset-v2.json

I installed the requirements by first running pip install -r requirements.txt and then pip install -r ., as suggested in the previously resolved issue. I have no connection problem when accessing https://www.researchgate.net/profile/Rishabh-Misra/publication/332141218_News_Category_Dataset/data/5ca2da43a6fdccab2f67c89b/News-Category-Dataset-v2.json in the browser.
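Since the URL opens in a browser but not from the datasets download step, it can help to test the same URL from inside the Python environment. The following is a minimal sketch using only the standard library; the helper names extract_unreachable_url and probe are hypothetical and are not part of FLEX or datasets, and the browser-like User-Agent header is an assumption about what the server may require.

```python
import re
import urllib.error
import urllib.request


def extract_unreachable_url(message):
    """Return the URL from the last "Couldn't reach <url>" line, or None."""
    matches = re.findall(r"Couldn't reach (\S+)", message)
    return matches[-1] if matches else None


def probe(url, timeout=10.0):
    """Try to fetch `url` from this Python process and describe the outcome."""
    # Assumption: some hosts reject requests without a browser-like User-Agent.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:  # must come before URLError (its parent)
        return f"HTTP error {err.code}"
    except urllib.error.URLError as err:
        return f"unreachable: {err.reason}"
    except OSError as err:  # e.g. a read timeout
        return f"unreachable: {err}"


# Usage (needs network access):
#   url = extract_unreachable_url(error_text)  # error_text: the pasted traceback
#   print(probe(url))
```

If probe reports an HTTP error while the browser succeeds, the server may be treating scripted requests differently from browser requests; that would be a property of the host, not of the local connection.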

opened by junbohuang 5

Data Preparation Problem for conll2003

Hi authors, first of all, thank you for your great work. I installed the packages according to the requirements file and ran the data preparation command: python -c "import fewshot; fewshot.make_challenge('flex');"

However, an error occurs while downloading the conll2003 dataset:

Downloading and preparing dataset conll2003/conll2003 (download: 4.63 MiB, generated: 9.78 MiB, post-processed: Unknown size, total: 14.41 MiB) to /home/yisyuan/.cache/huggingface/datasets/conll2003/conll2003/1.0.0/40e7cb6bcc374f7c349c83acd1e9352a4f09474eb691f64f364ee62eb65d0ca6...
Traceback (most recent call last):
  File "<string>", line 1, in <module>    | 0/3 [00:00<?, ?it/s]
  File "/home/yisyuan/Workspace_2_250GB_SSD/researches/flex/fewshot/challenges/registration.py", line 13, in make
    return registry.make(id, **evaluator_kwargs)
  File "/home/yisyuan/Workspace_2_250GB_SSD/researches/flex/fewshot/challenges/registration.py", line 108, in make
    return self.get_spec(id).make(**evaluator_kwargs)
  File "/home/yisyuan/Workspace_2_250GB_SSD/researches/flex/fewshot/challenges/registration.py", line 29, in make
    return Evaluator(config_name=self.id, hash=self.hash, **evaluator_kwargs)
  File "/home/yisyuan/Workspace_2_250GB_SSD/researches/flex/fewshot/challenges/eval.py", line 93, in __init__
    split='test',
  File "/home/yisyuan/Venv/flex/lib/python3.7/site-packages/datasets/load.py", line 1707, in load_dataset
    use_auth_token=use_auth_token,
  File "/home/yisyuan/Venv/flex/lib/python3.7/site-packages/datasets/builder.py", line 595, in download_and_prepare
    dl_manager=dl_manager, verify_infos=verify_infos, **download_and_prepare_kwargs
  File "/home/yisyuan/Venv/flex/lib/python3.7/site-packages/datasets/builder.py", line 690, in _download_and_prepare
    ) from None
OSError: Cannot find data file.
Original error:
Error instantiating 'fewshot.datasets.store.Store' : Couldn't find file at https://github.com/davidsbatista/NER-datasets/raw/master/CONLL2003/train.txt

The error seems to come from Hugging Face datasets, since there is a related issue that has already been solved (https://github.com/huggingface/datasets/issues/3582). However, I upgraded datasets from 1.8.0 (the version in the requirements) to 1.18.3 (the current version), and the error still occurs. I also tried downloading the conll2003 dataset directly with version 1.18.3, and that works fine: datasets.load_dataset("conll2003"). So I'm not sure what is going wrong. I would really appreciate any suggestions. Thank you!
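When switching between datasets releases like this, it is worth confirming which version the active environment actually resolves, and comparing versions numerically, not as strings. The sketch below is illustrative only: installed_version and version_tuple are hypothetical helper names, and importlib.metadata requires Python 3.8 or newer (on 3.7, printing datasets.__version__ is an alternative).

```python
import re
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.18.3' into (1, 18, 3) so that versions compare numerically."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:  # stop at the first non-numeric component
            break
        parts.append(int(match.group()))
    return tuple(parts)


# Usage:
#   print(installed_version("datasets"))
#   print(version_tuple("1.18.3") > version_tuple("1.8.0"))
```

Comparing the raw strings would order "1.18.3" before "1.8.0", which is why the tuple conversion is used here.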

opened by YiSyuanChen 3

Error running `python -c "import fewshot; fewshot.make_challenge('flex');"`

Hello,

thank you very much for making the FLEX code available.

I installed it according to https://github.com/allenai/flex#installation (running pip install -e . in the flex dir) and tried to run python -c "import fewshot; fewshot.make_challenge('flex');" but I get the following error on Ubuntu and MacOSX:

Using custom data configuration flex-ead21b2c6fc7f994
Downloading and preparing dataset flex_challenge/flex to /home/mmp/.cache/huggingface/datasets/flex_challenge/flex-ead21b2c6fc7f994/0.0.1/e5706b643506c30eaea9d75cf6d7cccccd7b2a87583e02892efab5b10291f493...
Generating test split: 0 examples [00:00, ? examples/s]Using custom data configuration newsgroupbao-255d26ad5737e61d
Downloading and preparing dataset flex/newsgroupbao to .cache/huggingface/datasets/flex/newsgroupbao-255d26ad5737e61d/0.0.1/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5...
Traceback (most recent call last):
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 86, in _call_target
    return _target_(*args, **kwargs)
  File "flex/fewshot/datasets/store.py", line 73, in __init__
    self.store = load_dataset(**hf_load_kwargs)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 1691, in load_dataset
    builder_instance.download_and_prepare(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 605, in download_and_prepare
    self._download_and_prepare(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 1104, in _download_and_prepare
    super()._download_and_prepare(dl_manager, verify_infos, check_duplicate_keys=verify_infos)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 694, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 1087, in _prepare_split
    for key, record in logging.tqdm(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File ".cache/huggingface/modules/datasets_modules/datasets/flex/4d897199532a37859555f12a74cee1cabc88d46d86afafa9cecbedfd2cf992b5/flex.py", line 154, in _generate_examples
    dataset = datasets.load_dataset(**load_dataset_kwargs)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 1664, in load_dataset
    builder_instance = load_dataset_builder(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 1516, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 1031, in __init__
    super().__init__(*args, **kwargs)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 265, in __init__
    self.config, self.config_id = self._create_builder_config(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 375, in _create_builder_config
    raise ValueError(f"BuilderConfig {builder_config} doesn't have a '{key}' key.")
ValueError: BuilderConfig NewsgroupConfig(name='18828_sci.space', version=3.0.0, data_dir=None, data_files=None, description='does not include cross-posts and includes only the "From" and "Subject" headers.') doesn't have a 'script_version' key.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "flex/fewshot/challenges/registration.py", line 13, in make
    return registry.make(id, **evaluator_kwargs)
  File "flex/fewshot/challenges/registration.py", line 108, in make
    return self.get_spec(id).make(**evaluator_kwargs)
  File "flex/fewshot/challenges/registration.py", line 29, in make
    return Evaluator(config_name=self.id, hash=self.hash, **evaluator_kwargs)
  File "flex/fewshot/challenges/eval.py", line 89, in __init__
    self.dataset = load_dataset(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/load.py", line 1691, in load_dataset
    builder_instance.download_and_prepare(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 605, in download_and_prepare
    self._download_and_prepare(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 1104, in _download_and_prepare
    super()._download_and_prepare(dl_manager, verify_infos, check_duplicate_keys=verify_infos)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 694, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/datasets/builder.py", line 1087, in _prepare_split
    for key, record in logging.tqdm(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File ".cache/huggingface/modules/datasets_modules/datasets/challenge/e5706b643506c30eaea9d75cf6d7cccccd7b2a87583e02892efab5b10291f493/challenge.py", line 109, in _generate_examples
    sampler = instantiate(challenge.metadatasampler)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 218, in instantiate
    return instantiate_node(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 326, in instantiate_node
    value = instantiate_node(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 304, in instantiate_node
    items = [
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 305, in <listcomp>
    instantiate_node(item, convert=convert, recursive=recursive)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 326, in instantiate_node
    value = instantiate_node(
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 331, in instantiate_node
    return _call_target(_target_, partial, args, kwargs, full_key)
  File "miniconda3/envs/flex/lib/python3.9/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 91, in _call_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error in call to target 'fewshot.datasets.store.Store':
ValueError('BuilderConfig NewsgroupConfig(name=\'18828_sci.space\', version=3.0.0, data_dir=None, data_files=None, description=\'does not include cross-posts and includes only the "From" and "Subject" headers.\') doesn\'t have a \'script_version\' key.')
full_key: datasets[0].labeled_store

I'm trying to track down the error but any help is highly appreciated. :-)
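The ValueError indicates that a script_version keyword reached the dataset's BuilderConfig instead of being consumed by load_dataset. One possible explanation, offered here as an assumption and not confirmed by the FLEX authors, is that the installed datasets release no longer recognizes script_version and expects revision instead. Under that assumption, a workaround could rename the key before the call. The helper name rename_script_version and the example values are hypothetical.

```python
def rename_script_version(load_kwargs):
    """Return a copy of `load_kwargs` with 'script_version' renamed to 'revision'.

    Assumption: the installed `datasets` release accepts `revision` in place of
    the older `script_version` keyword. Check its documentation before relying
    on this.
    """
    updated = dict(load_kwargs)  # leave the caller's dict untouched
    if "script_version" in updated:
        value = updated.pop("script_version")
        # Keep an explicitly supplied `revision` if one is already present.
        updated.setdefault("revision", value)
    return updated


# Usage with hypothetical values:
#   kwargs = {"path": "newsgroup", "name": "18828_sci.space", "script_version": "1.8.0"}
#   datasets.load_dataset(**rename_script_version(kwargs))
```

The other route is to install the datasets version pinned in the repository's requirements, which avoids changing any keyword arguments.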