Few-NERD: Not Only a Few-shot NER Dataset

Overview

This is the source code of the ACL-IJCNLP 2021 paper: Few-NERD: A Few-shot Named Entity Recognition Dataset. Check out the website of Few-NERD.

Few-NERD is a large-scale, fine-grained, manually annotated named entity recognition dataset. It contains 8 coarse-grained types, 66 fine-grained types, 188,200 sentences, 491,711 entities and 4,601,223 tokens. Three benchmark tasks are built on it: one supervised, Few-NERD (SUP), and two few-shot, Few-NERD (INTRA) and Few-NERD (INTER).

The schema of Few-NERD is:

Few-NERD is manually annotated based on the context, for example, in the sentence "London is the fifth album by the British rock band…", the named entity London is labeled as Art-Music.

Requirements

Run the following script to install the remaining dependencies:

pip install -r requirements.txt

Few-NERD Dataset

Get the Data

  • Few-NERD contains 8 coarse-grained types, 66 fine-grained types, 188,200 sentences, 491,711 entities and 4,601,223 tokens.
  • We have split the data into three training modes: supervised for the supervised setting, and inter and intra for the two few-shot settings. Each mode contains three files: train.txt, dev.txt and test.txt. The supervised datasets are randomly split. The inter datasets are randomly split within each coarse type, i.e. each file contains all 8 coarse types but different fine-grained types. The intra datasets are randomly split by coarse type.
  • The split dataset is downloaded automatically once you run the model. If you want to download the data manually, run data/download.sh, and remember to add the parameter supervised/inter/intra to indicate the type of the dataset.

To obtain the three benchmark datasets of Few-NERD, simply run the bash file data/download.sh:

bash data/download.sh supervised

Data Format

The data are pre-processed into the typical NER data forms as below (token\tlabel).

Between	O
1789	O
and	O
1793	O
he	O
sat	O
on	O
a	O
committee	O
reviewing	O
the	O
administrative	MISC-law
constitution	MISC-law
of	MISC-law
Galicia	MISC-law
to	O
little	O
effect	O
.	O
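As an illustration of the format above, the following is a minimal sketch of a reader for token\tlabel files. It is not the loader in util/data_loader.py. It assumes that sentences are separated by blank lines and that adjacent tokens with the same non-O label belong to one entity (the format has no B-/I- prefixes); the function names parse_ner_lines and extract_spans are hypothetical.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]

def parse_ner_lines(lines: Iterable[str]) -> List[Sentence]:
    """Group token<TAB>label lines into sentences (assumed: a blank line ends a sentence)."""
    sentences, current = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            if current:
                sentences.append(current)
                current = []
            continue
        token, label = line.split("\t")
        current.append((token, label))
    if current:
        sentences.append(current)
    return sentences

def extract_spans(sentence: Sentence) -> List[Tuple[str, str]]:
    """Merge adjacent tokens that share a non-O label into (text, label) spans."""
    spans, tokens, label = [], [], "O"
    for token, tag in sentence + [("", "O")]:  # the sentinel flushes a trailing span
        if tag != label and label != "O":
            spans.append((" ".join(tokens), label))
        if tag != label:
            tokens = []
        label = tag
        if tag != "O":
            tokens.append(token)
    return spans

sample = [
    "the\tO\n", "administrative\tMISC-law\n", "constitution\tMISC-law\n",
    "of\tMISC-law\n", "Galicia\tMISC-law\n", "to\tO\n",
]
for sentence in parse_ner_lines(sample):
    print(extract_spans(sentence))
```

Note that under the adjacency assumption, two neighbouring entities of the same type would be merged into a single span.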

Structure

The structure of our project is:

--util
| -- framework.py
| -- data_loader.py
| -- viterbi.py             # viterbi decoder for structshot only
| -- word_encoder
| -- fewshotsampler.py

-- proto.py                 # prototypical model
-- nnshot.py                # nnshot model

-- train_demo.py            # main training script

Key Implementations

Sampler

As established in our paper, we design an N-way K~2K-shot sampling strategy; the implementation is at util/fewshotsampler.py.
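To give an intuition for N-way K~2K-shot sampling, here is a simplified greedy sketch. It is not the code in util/fewshotsampler.py. It assumes each sentence is summarized as a Counter of entity counts per class, that N target classes are drawn first, and that a sentence is accepted only if it stays inside those classes, keeps every class at or below 2K, and helps at least one class that is still below K. The names is_valid, sample_support and toy_sentences are hypothetical.

```python
import random
from collections import Counter

def is_valid(sentence_counts, support_counts, target_classes, k):
    """Return True if adding the sentence keeps the support set within N-way K~2K."""
    if not sentence_counts:
        return False  # sentences without entities add nothing
    useful = False
    for cls, n in sentence_counts.items():
        if cls not in target_classes:
            return False  # would introduce a class outside the N targets
        if support_counts[cls] + n > 2 * k:
            return False  # would push a class above 2K
        if support_counts[cls] < k:
            useful = True  # helps a class that still needs shots
    return useful

def sample_support(sentences, n_way, k, rng, max_tries=10000):
    """Greedily pick sentence indices until every target class has at least K entities."""
    classes = sorted({cls for counts in sentences for cls in counts})
    target = set(rng.sample(classes, n_way))
    support_counts, chosen, tries = Counter(), [], 0
    while any(support_counts[cls] < k for cls in target):
        if tries >= max_tries:
            raise RuntimeError("no valid support set found for these classes")
        tries += 1
        i = rng.randrange(len(sentences))
        if i not in chosen and is_valid(sentences[i], support_counts, target, k):
            chosen.append(i)
            support_counts.update(sentences[i])
    return target, chosen

# Toy data: each Counter holds the entity counts of one sentence.
toy_sentences = [Counter(c) for c in (
    {"a": 1}, {"a": 1, "b": 1}, {"b": 1}, {"b": 2}, {"c": 1},
    {"c": 1, "d": 1}, {"d": 1}, {"a": 1}, {"d": 2}, {"c": 2},
)]
target, chosen = sample_support(toy_sentences, n_way=2, k=2, rng=random.Random(0))
print(sorted(target), chosen)
```

Every class in the resulting support set ends up with between K and 2K entities, which is the property the strategy's name refers to.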

ProtoBERT

Prototypical nets with BERT is implemented in model/proto.py.
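The core idea of a prototypical network can be sketched without BERT: each class prototype is the mean of its support embeddings, and a query token is assigned to the class with the best score. The sketch below is not the code in model/proto.py. It uses toy 2-dimensional vectors in place of BERT embeddings, negative squared L2 distance as the default score (which ranks classes the same way as L2 distance), and a dot-product option mirroring the --dot argument; all function names are hypothetical.

```python
def build_prototypes(embeddings, labels):
    """Average the support embeddings of each class into one prototype vector."""
    sums, counts = {}, {}
    for vector, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def score(query, prototype, dot=False):
    """Higher is better: dot product, or negative squared L2 distance."""
    if dot:
        return sum(q * p for q, p in zip(query, prototype))
    return -sum((q - p) ** 2 for q, p in zip(query, prototype))

def classify(query, prototypes, dot=False):
    """Return the label whose prototype scores best for the query embedding."""
    return max(prototypes, key=lambda label: score(query, prototypes[label], dot))

# Toy token embeddings standing in for BERT outputs.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
support_labels = ["O", "O", "art-music"]
protos = build_prototypes(support, support_labels)
print(classify([3.0, 3.0], protos), classify([0.5, 0.5], protos))
```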

How to Run

Run train_demo.py. The arguments are presented below. The default parameters are for the proto model on the inter mode dataset.

--mode                  training mode, must be inter, intra, or supervised
--trainN                N in train
--N                     N in val and test
--K                     K shot
--Q                     Num of query per class
--batch_size            batch size
--train_iter            num of iters in training
--val_iter              num of iters in validation
--test_iter             num of iters in testing
--val_step              val after training how many iters
--model                 model name, must be proto, nnshot or structshot
--max_length            max length of tokenized sentence
--lr                    learning rate
--weight_decay          weight decay
--grad_iter             accumulate gradient every x iterations
--load_ckpt             path to load model
--save_ckpt             path to save model
--fp16                  use nvidia apex fp16
--only_test             no training process, only test
--ckpt_name             checkpoint name
--seed                  random seed
--pretrain_ckpt         bert pre-trained checkpoint
--dot                   use dot instead of L2 distance in distance calculation
--use_sgd_for_bert      use SGD instead of AdamW for BERT.
# only for structshot
--tau                   StructShot parameter to re-normalize the transition probabilities

  • For hyperparameter --tau in structshot, we use 0.32 in 1-shot setting, 0.318 for 5-way-5-shot setting, and 0.434 for 10-way-5-shot setting.

  • Take the structshot model on the inter dataset as an example; the experiments can be run as follows.

5-way-1~5-shot

python3 train_demo.py  --train data/mydata/train-inter.txt \
--val data/mydata/val-inter.txt --test data/mydata/test-inter.txt \
--lr 1e-3 --batch_size 2 --trainN 5 --N 5 --K 1 --Q 1 \
--train_iter 10000 --val_iter 500 --test_iter 5000 --val_step 1000 \
--max_length 60 --model structshot --tau 0.32

5-way-5~10-shot

python3 train_demo.py  --train data/mydata/train-inter.txt \
--val data/mydata/val-inter.txt --test data/mydata/test-inter.txt \
--lr 1e-3 --batch_size 2 --trainN 5 --N 5 --K 5 --Q 5 \
--train_iter 10000 --val_iter 500 --test_iter 5000 --val_step 1000 \
--max_length 60 --model structshot --tau 0.318

10-way-1~5-shot

python3 train_demo.py  --train data/mydata/train-inter.txt \
--val data/mydata/val-inter.txt --test data/mydata/test-inter.txt \
--lr 1e-3 --batch_size 2 --trainN 10 --N 10 --K 1 --Q 1 \
--train_iter 10000 --val_iter 500 --test_iter 5000 --val_step 1000 \
--max_length 60 --model structshot --tau 0.32

10-way-5~10-shot

python3 train_demo.py  --train data/mydata/train-inter.txt \
--val data/mydata/val-inter.txt --test data/mydata/test-inter.txt \
--lr 1e-3 --batch_size 2 --trainN 10 --N 10 --K 5 --Q 1 \
--train_iter 10000 --val_iter 500 --test_iter 5000 --val_step 1000 \
--max_length 60 --model structshot --tau 0.434

Citation

If you use Few-NERD in your work, please cite our paper:

@inproceedings{ding2021few,
title={Few-NERD: A Few-Shot Named Entity Recognition Dataset},
author={Ding, Ning and Xu, Guangwei and Chen, Yulin and Wang, Xiaobin and Han, Xu and Xie, Pengjun and Zheng, Hai-Tao and Liu, Zhiyuan},
booktitle={ACL-IJCNLP},
year={2021}
}

Connection

If you have any questions, feel free to contact

Comments
  • How to do inference on my custom data after training on FewNERD data?

    How do I use this Few-NERD model to run inference on my own dataset after training on the Few-NERD data? The idea is to see how it performs on my custom data.

    Step 1: I train the proto model on the Few-NERD data as follows:

    python3 train_demo.py --mode inter --lr 1e-4 --batch_size 8 --trainN 5 --N 5 --K 1 --Q 1 --train_iter 10000 --val_iter 500 --test_iter 5000 --val_step 1000 --max_length 64 --model proto --tau 0.32

    Once training is complete, I have a dataset in the format below, with the entity types ['dispute_amount', 'dispute_date', 'competitors'].

    My test data is in this format:

    
    {"word":
              [ 
                  ["Dispute", "Case", "ID", "MM-Z-*********", "the", "amount", "of", "$99.99", "should", "be", "AU", "dollars", "total", "is", "$86.85", "US", "dollars."], 
                   ["8:27am,", "I", "started", "a", "claim", "for", "which", "I", "was", "refunded", "for", "one", "item,", "but", "not", "for", "the", "other,", "from", "the", "same", "seller."] 
              ], 
    
    "label": [ 
    
                     ["O", "O", "O", "O", "O", "O", "O", "dispute_amount", "O", "O", "O", "O", "O", "O", "dispute_amount", "dispute_amount", "dispute_amount"], 
                     ["dispute_date", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"] 
                  ],  
    
    "types": [["dispute_amount"], ["dispute_date"]]
    
    }
    

    How can I test and print outputs on my test data?

    @cyl628 @ningding97

    opened by pratikchhapolika 13
  • Inference

    Hello, thanks for this project. I was able to correctly train a structshot using the train script. Could you show how to correctly run the inference for an input sequence? In my understanding, the loading would look like

    import os
    import torch
    
    from fewnerd.util.word_encoder import BERTWordEncoder
    from fewnerd.model.proto import Proto
    from fewnerd.model.nnshot import NNShot
    
    # cache dir
    cache_dir = os.getenv("cache_dir", "../../models")
    model_path = 'structshot-inter-5-5-seed0.pth.tar'
    model_name = 'structshot'
    pretrain_ckpt = 'bert-base-uncased'
    max_length = 100
    
    # BERT word encoder
    word_encoder = BERTWordEncoder(
            pretrain_ckpt,
            max_length)
    
    if model_name == 'proto':
        # use dot instead of L2 distance for proto
        model = Proto(word_encoder, dot=True)
    elif model_name == 'nnshot':
        model = NNShot(word_encoder, dot=False)
    elif model_name == 'structshot':
        model = NNShot(word_encoder, dot=False)
    
    model.load_state_dict(torch.load(os.path.join(cache_dir, model_path)))
    

    but I get some errors on the state dicts (RuntimeError: Error(s) in loading state_dict for NNShot:) in this way...

    Thank you in advance!

    opened by loretoparisi 9
  • What is the difference between episode-data model training and non-episode-data.

    Inside data/episode-data/inter/train_5_5.jsonl I see a lot of training, test and dev data, organized as support and query sets.

    In data/inter/train.txt I see a different form of training data, without support and query sets.

    What is the purpose of keeping these two data formats?

    Are both used to train the few-shot model?

    @cyl628

    opened by pratikchhapolika 5
  • Regarding Data-set in inter/intra folder

    Inside data/episode-data/inter/ I see a lot of training, test and dev data. I may be asking a few silly questions, so please pardon me.

    I was exploring train_5_5.jsonl. What does the name train_5_5.jsonl signify? Does it have anything to do with the support and query sets?

    Here is one example:

    I see the support set has 14 sentences and the query set has 15 sentences.

    So, is this one example, with its support and query sets, fed into the model as a single example? Is the support set used to train the model? If so, what is the query set used for? I am seeing this training data structure for the first time. Could you explain in layman's terms how training happens?

    {
      "support":
              {"word":
                      [
                        ["averostra", ",", "or", "``", "bird", "snouts", "''", ",", "is", "a", "clade", "that", "includes", "most", "theropod", "dinosaurs", "that", "have", "a", "promaxillary", "fenestra", "(", "``", "fenestra", "promaxillaris", "``", ")", ",", "an", "extra", "opening", "in", "the", "front", "outer", "side", "of", "the", "maxilla", ",", "the", "bone", "that", "makes", "up", "the", "upper", "jaw", "."],
                        ["since", "that", "time", ",", "the", "squadron", "made", "several", "extended", "indian", "ocean", ",", "mediterranean", "sea", ",", "and", "north", "atlantic", "deployments", "as", "part", "of", "cvw-1", "/", "cv-66", ",", "until", "the", "decommissioning", "of", "uss", "``", "america", "''", "in", "1996", "."],
                        ["the", "alpha-gal", "allergy", "is", "believed", "to", "result", "from", "tick", "bites", "."],
                        ["interaction", "was", "shown", "to", "occur", "with", "the", "dna", "-directed", "rna", "polymerase", "ii", "subunit", ",", "rpb1", ",", "of", "rna", "polymerase", "ii", "during", "both", "mitosis", "and", "interphase", "."],
                        ["he", "is", "also", "responsible", "for", "programming", "on", "diablo", "ii", ",", "the", "development", "of", "the", "battle.net", "game", "server", "network", ",", "and", "the", "quake", "2", "mod", "loki", "'s", "minions", "capture", "the", "flag", "."],
                        ["minix", "was", "first", "released", "in", "1987", ",", "with", "its", "complete", "source", "code", "made", "available", "to", "universities", "for", "study", "in", "courses", "and", "research", "."],
                        ["terminal", "island", "is", "a", "low", "snow-covered", "island", "off", "the", "north", "tip", "of", "alexander", "island", ",", "in", "the", "bellingshausen", "sea", "west", "of", "palmer", "land", ",", "antarctic", "peninsula", "."],
                        ["among", "these", "were", "net/one", ",", "3+", ",", "banyan", "vines", "and", "novell", "'s", "ipx", "/", "spx", "."],
                        ["in", "1933\u20131970", ",", "a", "summer", "camp", "on", "south", "bass", "island", "operated", "for", "episcopal", "and", "anglican", "choristers", "."],
                        ["she", "is", "also", "the", "only", "cam", "ship", "whose", "fighter", "pilot", "died", "in", "action", "after", "his", "aircraft", "was", "launched", "from", "the", "ship", "."],
                        ["the", "department", "of", "social", "welfare", "and", "development", "(", "dswd", ")", "has", "distributed", "relief", "goods", "to", "residents", "of", "boracay", "while", "the", "island", "is", "closed", "to", "tourists", "."],
                        ["``", "rainbow", "``", "was", "scrapped", "in", "1940", "."],
                        ["it", "is", "the", "leading", "firm", "for", "the", "charlotte", "douglas", "international", "airport", "airfield", "expansion", ",", "the", "new", "dallas", "fort", "worth", "international", "airport", "southwest", "end-around", "taxiway", ",", "and", "master", "plan", "updates", "at", "philadelphia", "international", "airport", "and", "san", "antonio", "international", "airport", "."],
                        ["the", "event", "held", "at", "solberg-hunterdon", "airport", "is", "the", "largest", "summertime", "hot", "air", "balloon", "festival", "in", "north", "america", "."]
                      ],
    
              "label":
                      [
                        ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "other-biologything", "other-biologything", "O", "O", "other-biologything", "other-biologything", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "other-biologything", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "product-ship", "O", "product-ship", "O", "O", "O", "O", "O", "product-ship", "product-ship", "product-ship", "product-ship", "O", "O", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "O", "other-biologything", "O", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "other-biologything", "O", "other-biologything", "other-biologything", "other-biologything", "O", "O", "other-biologything", "O", "O", "other-biologything", "other-biologything", "other-biologything", "O", "O", "O", "O", "O", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "product-software", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"],
                        ["product-software", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"],
                        ["location-island", "location-island", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "location-island", "location-island", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "location-island", "location-island", "O"],
                        ["O", "O", "O", "product-software", "O", "product-software", "O", "product-software", "product-software", "O", "product-software", "product-software", "product-software", "O", "product-software", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "location-island", "location-island", "location-island", "O", "O", "O", "O", "O", "O", "O"],
                        ["O", "O", "O", "O", "O", "product-ship", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "location-island", "O", "O", "O", "O", "O", "O", "O", "O"],
                        ["O", "product-ship", "O", "O", "O", "O", "O", "O"],
                        ["O", "O", "O", "O", "O", "O", "O", "building-airport", "building-airport", "building-airport", "building-airport", "O", "O", "O", "O", "O", "building-airport", "building-airport", "building-airport", "building-airport", "building-airport", "O", "O", "O", "O", "O", "O", "O", "O", "O", "building-airport", "building-airport", "building-airport", "O", "building-airport", "building-airport", "building-airport", "building-airport", "O"],
                        ["O", "O", "O", "O", "building-airport", "building-airport", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]
                      ]
                  },
    
      "query":
              {"word":
                      [
                        ["the", "final", "significant", "change", "in", "the", "life", "of", "the", "coco", "2", "(", "models", "26-3134b", ",", "26-3136b", ",", "and", "26-3127b", ";", "16", "kb", "standard", ",", "16", "kb", "extended", ",", "and", "64", "kb", "extended", "respectively", ")", "was", "to", "use", "the", "enhanced", "vdg", ",", "the", "mc6847t1", ",", "allowing", "lowercase", "characters", "and", "changing", "the", "text", "screen", "border", "color", "."],
                        ["the", "reno-tahoe", "international", "airport", "reno-tahoe", "international", "airport", "(", "formerly", "known", "as", "the", "reno", "cannon", "international", "airport", ")", "is", "the", "other", "major", "airport", "in", "the", "state", "."],
                        ["it", "was", "built", "by", "cole", "palen", "for", "flight", "in", "his", "weekend", "airshows", "as", "early", "as", "1967", "and", "actively", "flown", "(", "mostly", "by", "cole", "palen", ")", "within", "the", "weekend", "airshows", "at", "old", "rhinebeck", "until", "the", "late", "1980s", "."],
                        ["lambert", "land", "is", "bounded", "in", "the", "north", "by", "the", "nioghalvfjerd", "fjord", ",", "in", "the", "east", "by", "the", "greenland", "sea", "and", "in", "the", "south", "by", "the", "zachariae", "isstrom", "."],
                        ["started", "police", "operations", "with", "4", "cessna", "cu", "206g", "officially", "on", "7", "april", "1980", "with", "operations", "focused", "in", "peninsula", "of", "malaysi", "a", "."],
                        ["mysore", "airport", "is", "away", ",", "followed", "by", "kozhikode", "international", "airport", "at", "and", "bengaluru", "international", "airport", "at", "."],
                        ["the", "egg-shaped", "qaqaarissorsuaq", "island", "is", "located", "in", "tasiusaq", "bay", ",", "in", "the", "central", "part", "of", "upernavik", "archipelago", "."],
                        ["where", "they", "inserted", "nife", "hydrogenase", "into", "polypyrrole", "films", "and", "to", "provide", "proper", "contact", "to", "the", "electrode", ",", "there", "were", "redox", "mediators", "entrapped", "into", "the", "film", "."],
                        ["the", "nt-3", "protein", "is", "found", "within", "the", "thymus", ",", "spleen", ",", "intestinal", "epithelium", "but", "its", "role", "in", "the", "function", "of", "each", "organ", "is", "still", "unknown", "."],
                        ["ted", "insists", "that", "he", "will", "have", "a", "better", "chance", "at", "winning", "since", "the", "guest", "judge", ",", "tv", "presenter", "henry", "sellers", ",", "is", "staying", "at", "the", "craggy", "island", "parochial", "house", "."],
                        ["mdm2", "binds", "and", "ubiquitinates", "p53", ",", "facilitating", "it", "for", "degradation", "."],
                        ["neuraminidase", "inhibitors", "for", "human", "neuraminidase", "(", "hneu", ")", "have", "the", "potential", "to", "be", "useful", "drugs", "as", "the", "enzyme", "plays", "a", "role", "in", "several", "signaling", "pathways", "in", "cells", "and", "is", "implicated", "in", "diseases", "such", "as", "diabetes", "and", "cancer", "."], ["at", "it", "was", "long", "enough", "to", "accommodate", "the", "belle", "steamers", "that", "carried", "trippers", "along", "the", "coast", "at", "that", "time", "."],
                        ["these", "guerrilla", "sub", "missions", "originated", "at", "brisbane", "'s", ",", "capricorn", "wharf", "or", "mios", "woendi", "."],
                        ["because", "it", "was", "originally", "an", "island", "well", "within", "lake", "texcoco", ",", "iztacalco", "was", "settled", "by", "humans", "later", "than", "the", "rest", "of", "the", "valley", "of", "mexico", "."],
                        ["the", "nordic", "countries", "had", "developed", "the", "skerry", "cruiser", "classes", "and", "the", "international", "rule", "classes", "had", "adopted", "in", "1919", "a", "new", "edition", "of", "the", "rule", "which", "was", "not", "yet", "implemented", "in", "the", "countries", "."]
                      ],
    
              "label": [["O", "O", "O", "O", "O", "O", "O", "O", "O", "product-software", "product-software", "O", "O", "product-software", "O", "product-software", "O", "O", "product-software", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "product-software", "O", "O", "product-software", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["building-airport", "building-airport", "building-airport", "building-airport", "building-airport", "building-airport", "building-airport", "O", "O", "O", "O", "building-airport", "building-airport", "building-airport", "building-airport", "building-airport", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "building-airport", "building-airport", "O", "O", "O", "O", "O"], ["location-island", "location-island", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "location-island", "location-island", "location-island", "O", "O"], ["building-airport", "building-airport", "O", "O", "O", "O", "O", "building-airport", "building-airport", "building-airport", "O", "O", "building-airport", "building-airport", "building-airport", "O", "O"], ["O", "O", "location-island", "location-island", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "other-biologything", "other-biologything", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "other-biologything", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", 
"O", "O", "O", "O", "O", "O", "location-island", "location-island", "O", "O", "O"], ["other-biologything", "O", "O", "O", "other-biologything", "O", "O", "O", "O", "O", "O"], ["other-biologything", "other-biologything", "other-biologything", "other-biologything", "other-biologything", "O", "other-biologything", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "O", "O", "O", "product-ship", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "O", "product-ship", "product-ship", "O", "product-ship", "product-ship", "O", "product-ship", "product-ship", "O"], ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "location-island", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"], ["O", "O", "O", "O", "O", "O", "product-ship", "product-ship", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]]},
    
      "types": ["other-biologything", "building-airport", "location-island", "product-ship", "product-software"]}
    
    
    opened by pratikchhapolika 4
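For readers exploring the episode files shown above, the following sketch summarizes one line of such a .jsonl file. It relies only on the keys visible in the example ("support", "query", "types", each set holding "word" and "label" lists). It assumes that adjacent tokens with the same non-O label form one entity mention, so two neighbouring mentions of the same type would be counted once. The function names are hypothetical and not part of the repository.

```python
import json
from collections import Counter

def count_entities(label_rows):
    """Count entity mentions per type; adjacent equal non-O labels count as one mention."""
    counts = Counter()
    for row in label_rows:
        previous = "O"
        for tag in row:
            if tag != "O" and tag != previous:
                counts[tag] += 1
            previous = tag
    return counts

def summarize_episode(line):
    """Parse one JSON line and report its size and per-type entity counts."""
    episode = json.loads(line)
    return {
        "types": episode["types"],
        "support_sentences": len(episode["support"]["word"]),
        "query_sentences": len(episode["query"]["word"]),
        "support_counts": count_entities(episode["support"]["label"]),
        "query_counts": count_entities(episode["query"]["label"]),
    }

# A tiny made-up episode in the same shape as the example above.
toy_line = json.dumps({
    "support": {"word": [["terminal", "island", "is", "low"]],
                "label": [["location-island", "location-island", "O", "O"]]},
    "query": {"word": [["rainbow", "was", "scrapped"], ["minix", "was", "released"]],
              "label": [["product-ship", "O", "O"], ["product-software", "O", "O"]]},
    "types": ["location-island", "product-ship", "product-software"],
})
print(summarize_episode(toy_line))
```

In episodic few-shot training in general, the support set is what the model sees as labelled examples for the episode's classes, and the query set is what it is asked to label; whether a given file follows the N-way K~2K constraint can be checked from the support counts.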
  • Bug in few-shot sampling process

    Hi! It seems like function
    https://github.com/thunlp/Few-NERD/blob/3935544c428a5ed0e78adaf29a9332cc91e8bda9/util/fewshotsampler.py#L62 has a bug, which affects sampling process.

    The function contains a for loop with four if statements inside it. If the flag isvalid is set to True or False in one iteration, it can be overwritten in a later iteration of the loop. For example, the first class may break the rule shots_count <= 2*K, but if the second class is new, __valid_sample__ still returns isvalid as True even though the sample breaks the validation rules.

    Thus, because of this bug, we can draw samples that break the sampling rules, e.g. with shots_count > 2*K.
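The failure mode described in this issue can be reproduced with a small sketch. The two functions below are hypothetical reconstructions written from the description, not the repository's code: the first keeps a flag that later loop iterations can overwrite, the second returns as soon as any class would exceed 2K.

```python
def valid_with_flag(sentence_counts, support_counts, k):
    """Flag-based check in the style the issue describes (hypothetical reconstruction)."""
    isvalid = False
    for cls, n in sentence_counts.items():
        if support_counts.get(cls, 0) + n > 2 * k:
            isvalid = False  # the violation is recorded here ...
        elif support_counts.get(cls, 0) < k:
            isvalid = True   # ... but a later class can overwrite it
    return isvalid

def valid_with_early_return(sentence_counts, support_counts, k):
    """Reject immediately when any class would exceed 2K."""
    useful = False
    for cls, n in sentence_counts.items():
        if support_counts.get(cls, 0) + n > 2 * k:
            return False
        if support_counts.get(cls, 0) < k:
            useful = True
    return useful

# Class "a" already has 2 shots with K = 1, so one more would exceed 2K.
# Class "b" is new and comes second, which triggers the overwrite.
support_counts = {"a": 2}
sentence_counts = {"a": 1, "b": 1}
print(valid_with_flag(sentence_counts, support_counts, k=1))
print(valid_with_early_return(sentence_counts, support_counts, k=1))
```

The outcome of the flag-based version depends on the order in which classes are visited, which is why the early return is the safer pattern.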

    What do you think? Does this bug affect the results somehow? Maybe even in a good way.

    opened by prohor33 4
  • OOM problem encountered with FewShotNERDatasetWithRandomSampling

    When I try:

    python train_demo.py --mode inter --lr 1e-4 --batch_size 8 --trainN 5 --N 5 --K 1 --Q 1 --train_iter 10000 --val_iter 500 --test_iter 5000 --val_step 1000 --max_length 64 --model structshot --tau 0.32

    the program gets stuck on line 399 of framework.py, where it tries to get the next batch from self.train_data_loader. After a minute, it raises an OOM error.

    After searching for a whole night, I found that the __len__ function of FewShotNERDatasetWithRandomSampling returns 100000000000, which causes the OOM error.

    When I change it to return len(self.samples), the problem disappears. Please fix it.

    opened by RyanWangZf 3
  • Tokenize the input word

    Hi, thank you for sharing the data and code!

    I just found that input words do not seem to be tokenized correctly by the word tokenizer.

    In the word_tokenizer.py file, each word is directly converted to a token id:

    for raw_tokens in raw_tokens_list:
        indexed_tokens = self.tokenizer.convert_tokens_to_ids(tokens)
    

    However, a word could first be tokenized into word pieces:

    for raw_tokens in raw_tokens_list:
        for word in raw_tokens:
            word_tokens = self.tokenizer.tokenize(word)
    

    Directly converting words to token ids leads to lots of [UNK] tokens and makes the performance drop a lot.
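The effect described here can be illustrated without loading a real tokenizer. The sketch below uses a toy greedy longest-match WordPiece routine and a made-up vocabulary as stand-ins for self.tokenizer.tokenize. It also shows one common way to keep labels aligned after splitting, namely repeating the word label on every piece and marking the first piece of each word; that alignment scheme is an assumption, not necessarily what the repository does.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Toy greedy longest-match WordPiece; continuation pieces carry a ## prefix."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def tokenize_with_labels(words, labels, vocab):
    """Split words into pieces, repeat each label, and mark first pieces with 1."""
    pieces, piece_labels, first_piece_mask = [], [], []
    for word, label in zip(words, labels):
        sub = wordpiece(word, vocab)
        pieces.extend(sub)
        piece_labels.extend([label] * len(sub))
        first_piece_mask.extend([1] + [0] * (len(sub) - 1))
    return pieces, piece_labels, first_piece_mask

toy_vocab = {"play", "##ing", "on", "bora", "##cay"}  # made-up vocabulary
words = ["playing", "on", "boracay"]
labels = ["O", "O", "location-island"]

# Direct lookup: whole words missing from the vocabulary become [UNK].
direct = [w if w in toy_vocab else "[UNK]" for w in words]
pieces, piece_labels, mask = tokenize_with_labels(words, labels, toy_vocab)
print(direct)
print(pieces, piece_labels, mask)
```

With the mask, predictions can be read off the first piece of each word so that the output still has one label per original word.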

    opened by mtt1998 3
  • Question about the F1 score

    The code at https://github.com/thunlp/Few-NERD/blob/d95dba90a4c736a5310b7275d839ef7b786a46b7/util/framework.py#L537 averages the F1 score across batches. But this makes the evaluation results depend on the batch size chosen for evaluation, right?
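A small worked example shows why averaging per-batch F1 depends on the batch size, while F1 computed from counts pooled over all episodes does not. The episode statistics below are invented, and the sketch assumes each batch's F1 is computed from that batch's summed counts; it is not the evaluation code in util/framework.py.

```python
def f1_score(correct, predicted, gold):
    """F1 from entity counts; returns 0.0 when precision or recall is undefined."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def merge(stats):
    """Sum (correct, predicted, gold) tuples element-wise."""
    return tuple(sum(column) for column in zip(*stats))

def batch_averaged_f1(episodes, batch_size):
    """Compute F1 per batch, then take the plain mean over batches."""
    batches = [episodes[i:i + batch_size] for i in range(0, len(episodes), batch_size)]
    return sum(f1_score(*merge(batch)) for batch in batches) / len(batches)

def pooled_f1(episodes):
    """Compute one F1 from counts summed over all episodes."""
    return f1_score(*merge(episodes))

# Invented (correct, predicted, gold) counts for four evaluation episodes.
episodes = [(1, 1, 1), (0, 2, 2), (3, 4, 4), (2, 2, 5)]
print(batch_averaged_f1(episodes, 1), batch_averaged_f1(episodes, 2), pooled_f1(episodes))
```

The two batch-averaged values differ from each other, whereas the pooled value is the same however the episodes are grouped.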

    opened by yhcc 3
  • Using custom dataset

    Hi, I attempted to use a custom dataset by replacing the respective data files (under data/intra/train | test | dev.txt). However, I encounter one of two errors:

    ZeroDivisionError when using the proto model, or

    Cannot perform max on tensor with no elements

    Are there any other areas of the data or code that I should amend in order to use a custom dataset?

    Thanks!

    opened by homerjack 2
  • UnboundLocalError: local variable 'label' referenced before assignment

    If I set the code to run on the CPU instead of the GPU, I get the above error.

    This happens because, in the train function of FewShotNERFramework in framework.py, label is only defined if torch.cuda.is_available() is true. An assert statement a few lines later, outside that if block, then uses label: assert logits.shape[0] == label.shape[0], print(logits.shape, label.shape). Further lines also depend on label being present.

    If I try to run on the GPU instead, I get RuntimeError: CUDA error: out of memory.

    In my opinion, the case where CUDA is not available should be handled better than by failing at this assert statement, even if running this code on the CPU is not supported.

    opened by demongolem-biz 2
  • the few-shot data and results.

    Hi, thanks for your nice work.

    However, I just found that the few-shot results differ across versions of the arXiv paper. There are three groups of results: {v1,v2}, {v3,v4,v5}, {v6}.

    I noticed that some issues mention that: (1) there were some imperfect implementations (such as hyperparameter selection and a tokenization bug); (2) the sampled size was larger than 2K due to the data sampling bug.

    May I ask whether (1) is the cause of the change in results from {v1,v2} to {v3,v4,v5}? And is (2) (different sampled few-shot datasets) the cause of the change in results from {v2,v4,v5} to {v6}?
    Or are there some other reasons?

    This is really nice work, and thanks again for your open source and hard work : ). Look forward to your reply.

    Thanks.

    opened by Wangpeiyi9979 2
  • How to create few shot episode-data for training and test from the general custom NER data?

    How can we leverage the script to create episode data for training and testing from general custom NER data?

    The module has the code, but it is a bit complex to go through and turn into a utility for this.

    It would be useful to have a simple utility for two things:

    1. Generating episode train/test data from custom data in the required format.
    2. An inference script.

    My data is in this format:

    
    [
    {'text': 
    'TN: ***************\nYour item was delivered at the front door or porch at 09:18\nam on Jan 15, 1991 in **********.\n
    ABCD Tracking ADT® Available\nStatus\n✔ Delivered, Front Door \nJan 25, 1991 at 10:24 am\******************\nGet Updates', 
    
    'spans': [{'start': 17, 'end': 39, 'label': 'TN', 'ngram': '*******************'}, 
            {'start': 142, 'end': 161, 'label': 'Carrier', 'ngram': 'ABCD Tracking ADT®'}, 
            {'start': -1, 'end': 5, 'label': 'seller', 'ngram': 'nannan'}, 
            {'start': -1, 'end': 5, 'label': 'cust', 'ngram': 'nannan'}, 
            {'start': 210, 'end': 233, 'label': 'DOS', 'ngram': '09:18\nam on Jan 15, 1991'}
            ]}
    ]
    

    Please let me know if anyone has written an inference script and code to generate episode train/test data.

    @cyl628

    opened by pratikchhapolika 0
Owner
THUNLP
Natural Language Processing Lab at Tsinghua University