Codes for "Template-free Prompt Tuning for Few-shot NER".

Last update: Dec 27, 2022

Overview

EntLM

The source code for EntLM.

Dependencies:

CUDA 10.1, Python 3.6.5

Install the required packages with the following command:

$ pip3 install -r requirements.txt

To download the pretrained bert-base-cased model:

$ cd pretrained/bert-base-cased/
$ sh download_bert.sh

Few-shot Experiment

Run the few-shot experiments on CoNLL 5-shot with:

sh scripts/run_conll.sh

By default, this runs 4 rounds of experiments for each of the sampled datasets. To run 10/20/50-shot experiments instead, edit the line FILE_PATH=dataset/conll/5shot/ in scripts/run_conll.sh.
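
As a minimal sketch of that edit, the Python snippet below rewrites the FILE_PATH line for a given shot count. The helper name set_file_path is hypothetical, and the directory names 10shot/, 20shot/ and 50shot/ are an assumption extrapolated from the 5shot/ path above; check them against your local dataset folder before relying on this.

```python
import re


def set_file_path(script_text: str, shots: int) -> str:
    """Return script_text with its FILE_PATH line pointing at the given shot count.

    Assumes the dataset directories follow the pattern dataset/conll/<N>shot/.
    """
    new_line = f"FILE_PATH=dataset/conll/{shots}shot/"
    # Replace the whole FILE_PATH=... line, wherever it sits in the script.
    updated, count = re.subn(r"^FILE_PATH=.*$", new_line, script_text, flags=re.MULTILINE)
    if count == 0:
        raise ValueError("no FILE_PATH= line found in the script")
    return updated


# Stand-in script text; in practice, read scripts/run_conll.sh instead.
example = "#!/bin/sh\nFILE_PATH=dataset/conll/5shot/\necho $FILE_PATH\n"

for shots in (10, 20, 50):
    print(set_file_path(example, shots).splitlines()[1])
```

To apply it for real, read scripts/run_conll.sh into a string, pass it through the helper, and write the result back before running the script.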

Label word selection

Run the label word selection process with:

sh scripts/count_freq.sh

This will build a label_map file such as dataset/conll/label_map_timesup_ratio0.6_multitoken_top6.json in the dataset path.

You can try a different sorting method by setting "--sort_method" to one of ["LM", "data", "timesup"].

You can also try a different ratio or virtual_number by changing "--filter_ratio" and "--top_k_num".
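
To give a feel for what a frequency-based selection might look like, here is a minimal sketch. It is not the code in scripts/count_freq.sh. The function name select_label_words, the BIO tag format, and the reading of the two options (a word is kept only if at least filter_ratio of its occurrences fall under the label, and the top_k most frequent survivors are returned) are all assumptions made for illustration. The "LM" and "timesup" methods would also need language-model scores, which this sketch leaves out.

```python
from collections import Counter, defaultdict


def select_label_words(sentences, filter_ratio=0.6, top_k=6):
    """Pick candidate label words per entity type from tagged sentences.

    sentences: iterable of (tokens, tags) pairs with BIO-style tags (assumed format).
    Returns a dict mapping entity type to a list of words.
    """
    per_label = defaultdict(Counter)  # entity type -> word counts inside entities
    total = Counter()                 # word counts over the whole corpus
    for tokens, tags in sentences:
        for token, tag in zip(tokens, tags):
            total[token] += 1
            if tag != "O":
                per_label[tag.split("-", 1)[-1]][token] += 1

    label_map = {}
    for label, counts in per_label.items():
        # Drop words that mostly occur outside this label (assumed meaning of the ratio).
        kept = [(w, c) for w, c in counts.items() if c / total[w] >= filter_ratio]
        # Most frequent first; ties broken alphabetically for a stable result.
        kept.sort(key=lambda wc: (-wc[1], wc[0]))
        label_map[label] = [w for w, _ in kept[:top_k]]
    return label_map


toy = [
    (["John", "lives", "in", "Paris"], ["B-PER", "O", "O", "B-LOC"]),
    (["Paris", "Hilton", "met", "John"], ["B-PER", "I-PER", "O", "B-PER"]),
    (["Paris", "is", "in", "France"], ["B-LOC", "O", "O", "B-LOC"]),
]
print(select_label_words(toy))
```

On this toy data, "Paris" is dropped as a PER word because only one of its three occurrences is tagged PER, which falls below the 0.6 ratio.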

Comments

GPT2 got very low score

I tried GPT2 on CoNLL 5-shot with batch size 2 and got:

{'LOC_precision': 0.00749063670411985, 'LOC_recall': 0.002408187838651415, 'LOC_f1': 0.003644646924829157, 'LOC_number': 1661,
 'MISC_precision': 0.2647058823529412, 'MISC_recall': 0.012987012987012988, 'MISC_f1': 0.02475928473177442, 'MISC_number': 693,
 'ORG_precision': 0.22037914691943128, 'ORG_recall': 0.05615942028985507, 'ORG_f1': 0.08950914340712222, 'ORG_number': 1656,
 'PER_precision': 0.08044554455445545, 'PER_recall': 0.12066831683168316, 'PER_f1': 0.09653465346534652, 'PER_number': 1616,
 'overall_precision': 0.08816637375512595, 'overall_recall': 0.05350159971560611, 'overall_f1': 0.0665929203539823, 'overall_accuracy': 0.7978459881529348}

Any suggestions? What might be wrong?

opened by AlexWang1900 1

Running raw_datasets = load_dataset(extension, data_files=data_files) raises an error. How did you solve it?

Traceback (most recent call last):
  File "", line 1, in
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/load.py", line 711, in load_dataset
    module_path, hash, resolved_file_path = prepare_module(
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/load.py", line 354, in prepare_module
    raise FileNotFoundError(
FileNotFoundError: Couldn't find file locally at lhoestq/demo1/demo1.py, or remotely at https://huggingface.co/datasets/lhoestq/demo1/resolve/main/demo1.py. Please provide a valid dataset name

raw_datasets = load_dataset('json', data_files='/usr/local/notebook_dir/EntLM/dataset/conll/5shot/1.json')
Using custom data configuration default-eaef8de8e7268fcc
Downloading and preparing dataset json/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/json/default-eaef8de8e7268fcc/0.0.0/83d5b3a2f62630efc6b5315f00f20209b4ad91a00ac586597caee3a4da0bef02...
Traceback (most recent call last):
  File "", line 1, in
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/load.py", line 745, in load_dataset
    builder_instance.download_and_prepare(
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/builder.py", line 574, in download_and_prepare
    self._download_and_prepare(
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/builder.py", line 630, in _download_and_prepare
    split_generators = self._split_generators(dl_manager, **split_generators_kwargs)
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/packaged_modules/json/json.py", line 47, in _split_generators
    data_files = dl_manager.download_and_extract(self.config.data_files)
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/utils/download_manager.py", line 287, in download_and_extract
    return self.extract(self.download(url_or_urls))
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/utils/download_manager.py", line 261, in extract
    extracted_paths = map_nested(
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/utils/py_utils.py", line 195, in map_nested
    return function(data_struct)
  File "/root/.jupyter/Pytorch-1.8.1/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 307, in cached_path
    and not tarfile.is_tarfile(output_path)
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 2466, in is_tarfile
    t = open(name)
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 1599, in open
    return func(name, "r", fileobj, **kwargs)
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 1728, in xzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 1647, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 1510, in __init__
    self.firstmember = self.next()
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 2313, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/local/python3/lib/python3.8/tarfile.py", line 1102, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/usr/local/python3/lib/python3.8/lzma.py", line 206, in read
    return self._buffer.read(size)
  File "/usr/local/python3/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/local/python3/lib/python3.8/_compression.py", line 96, in read
    if self._decompressor.needs_input:
AttributeError: '_lzma.LZMADecompressor' object has no attribute 'needs_input'

opened by apexg 1

How to run my own dataset with this code?

I found that the code still needs two files, "label.json" and "label_map_timesup_ratio0.6_multitoken_top6.json". Could you please tell me how to generate these two files for my own dataset? Best wishes!

opened by yiphingzhang 0
