Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

Overview

For better performance, you can try NLPGNN; see the NLPGNN repository for more details.

BERT-NER Version 2

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

The original version (see old_version for more detail) contains some hard-coded values and lacks comments, which makes it inconvenient to understand. This updated version adds some new ideas and tricks (in data preprocessing and layer design) that help you implement the fine-tuning model quickly (you only need to modify crf_layer or softmax_layer).
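
To make the softmax/CRF choice concrete, here is a minimal, illustrative sketch of the two output-layer options on top of BERT's per-token output, written against the TensorFlow 1.x / tf.contrib.crf API of that era. It is not the repo's actual create_model code, and names such as bert_output, labels, mask and num_labels are placeholders.

    import tensorflow as tf

    def token_tagging_head(bert_output, labels, mask, num_labels, use_crf):
        """Illustrative softmax vs. CRF head over BERT's [batch, seq_len, hidden] output."""
        # Project each token's hidden state to per-label logits.
        logits = tf.layers.dense(bert_output, num_labels)
        if use_crf:
            # CRF head: learn a label-transition matrix and score whole tag sequences.
            seq_lengths = tf.reduce_sum(tf.cast(mask, tf.int32), axis=-1)
            log_likelihood, transitions = tf.contrib.crf.crf_log_likelihood(
                logits, labels, seq_lengths)
            loss = tf.reduce_mean(-log_likelihood)
            predictions, _ = tf.contrib.crf.crf_decode(logits, transitions, seq_lengths)
        else:
            # Softmax head: independent per-token cross-entropy, with padding masked out.
            per_token_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=labels, logits=logits)
            float_mask = tf.cast(mask, tf.float32)
            loss = tf.reduce_sum(per_token_loss * float_mask) / tf.reduce_sum(float_mask)
            predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
        return loss, logits, predictions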

Folder Description:

BERT-NER
|____ bert                          # need git from [here](https://github.com/google-research/bert)
|____ cased_L-12_H-768_A-12	    # need download from [here](https://storage.googleapis.com/bert_models/2018_10_18/cased_L-12_H-768_A-12.zip)
|____ data		            # train data
|____ middle_data	            # middle data (label id map)
|____ output			    # output (final model, predict results)
|____ BERT_NER.py		    # main code
|____ conlleval.pl		    # eval code
|____ run_ner.sh    		    # run model and eval result
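
For reference, the standard CoNLL-2003 files store one token per line with space-separated columns (word, POS tag, chunk tag, NER tag) and a blank line between sentences; many NER forks keep only the word and the NER tag, so check the files under data/ for the exact column layout this code expects. A typical fragment (shown with the original IOB1 tagging) looks like:

    EU NNP I-NP I-ORG
    rejects VBZ I-VP O
    German JJ I-NP I-MISC
    call NN I-NP O
    to TO I-VP O
    boycott VB I-VP O
    British JJ I-NP I-MISC
    lamb NN I-NP O
    . . O O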

Usage:

bash run_ner.sh

What's in run_ner.sh:

python BERT_NER.py\
    --task_name="NER"  \
    --do_lower_case=False \
    --crf=False \
    --do_train=True   \
    --do_eval=True   \
    --do_predict=True \
    --data_dir=data   \
    --vocab_file=cased_L-12_H-768_A-12/vocab.txt  \
    --bert_config_file=cased_L-12_H-768_A-12/bert_config.json \
    --init_checkpoint=cased_L-12_H-768_A-12/bert_model.ckpt   \
    --max_seq_length=128   \
    --train_batch_size=32   \
    --learning_rate=2e-5   \
    --num_train_epochs=3.0   \
    --output_dir=./output/result_dir

perl conlleval.pl -d '\t' < ./output/result_dir/label_test.txt
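
conlleval.pl expects one token per line with the gold tag and the predicted tag as the last two columns; the -d '\t' option tells it that the columns in label_test.txt are tab-separated. Assuming the predict step writes token, gold label and predicted label per line (column order is an assumption, check your own output), the file looks roughly like:

    SOCCER	O	O
    JAPAN	B-LOC	B-LOC
    GET	O	O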

Notice: the cased model is recommended, according to this paper. The CoNLL-2003 dataset and the Perl evaluation script come from here

RESULTS:(On test set)

Parameter setting:

  • do_lower_case=False
  • num_train_epochs=4.0
  • crf=False
accuracy:  98.15%; precision:  90.61%; recall:  88.85%; FB1:  89.72
              LOC: precision:  91.93%; recall:  91.79%; FB1:  91.86  1387
             MISC: precision:  83.83%; recall:  78.43%; FB1:  81.04  668
              ORG: precision:  87.83%; recall:  85.18%; FB1:  86.48  1191
              PER: precision:  95.19%; recall:  94.83%; FB1:  95.01  1311

Result description:

Here I just use the default parameters. Google's paper reports 92.4 F1 for CoNLL-2003 NER and notes that a variation of about 0.2% is reasonable, so some extra tricks may need to be added to the above model to close the gap.

References:

[1] https://arxiv.org/abs/1810.04805

[2] https://github.com/google-research/bert

Comments
  • TypeError: eval_metric_ops[confusion_matrix] must be Operation or Tensor

    TypeError: eval_metric_ops[confusion_matrix] must be Operation or Tensor

    How to solve this issue?

    TypeError: eval_metric_ops[confusion_matrix] must be Operation or Tensor, given:<tf.Variable 'total_confusion_matrix: 0' shape=(5, 5) dtype=float64_ref>
    
    opened by congchan 4
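
    A hedged note on the error above: the values in eval_metric_ops must be (value_tensor, update_op) pairs, so returning a raw tf.Variable for the confusion matrix fails the Estimator's type check. One possible fix (not the repo's code; label_ids_flat, pred_ids_flat and num_labels are placeholder names) is to keep a running total and return its read value together with its update op:

        batch_cm = tf.confusion_matrix(label_ids_flat, pred_ids_flat,
                                       num_classes=num_labels, dtype=tf.float64)
        total_cm = tf.Variable(tf.zeros([num_labels, num_labels], dtype=tf.float64),
                               trainable=False, name='total_confusion_matrix')
        update_op = total_cm.assign_add(batch_cm)
        eval_metric_ops = {'confusion_matrix': (tf.convert_to_tensor(total_cm), update_op)}
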
  • Question: training the model without init_checkpoint

    Question: training the model without init_checkpoint

    INFO:tensorflow:Error recorded from training_loop: local variable 'initialized_variable_names' referenced before assignment
    INFO:tensorflow:training_loop marked as finished
    WARNING:tensorflow:Reraising captured error
    Traceback (most recent call last):
      File "BERT_NER.py", line 612, in <module>
        tf.app.run()
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
        _sys.exit(main(argv))
      File "BERT_NER.py", line 545, in main
        estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2409, in train
        rendezvous.raise_errors()
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\error_handling.py", line 128, in raise_errors
        six.reraise(typ, value, traceback)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\six.py", line 693, in reraise
        raise value
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2403, in train
        saving_listeners=saving_listeners
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 354, in train
        loss = self._train_model(input_fn, hooks, saving_listeners)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1207, in _train_model
        return self._train_model_default(input_fn, hooks, saving_listeners)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1237, in _train_model_default
        features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2195, in _call_model_fn
        features, labels, mode, config)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1195, in _call_model_fn
        model_fn_results = self._model_fn(features=features, **kwargs)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 2479, in _model_fn
        features, labels, is_export_mode=is_export_mode)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1259, in call_without_tpu
        return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
      File "C:\Users\Sudha\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\tpu\python\tpu\tpu_estimator.py", line 1533, in _call_model_fn
        estimator_spec = self._model_fn(features=features, **kwargs)
      File "BERT_NER.py", line 419, in model_fn
        if var.name in initialized_variable_names:
    UnboundLocalError: local variable 'initialized_variable_names' referenced before assignment

    Training the model without using the init_checkpoint flag returns this error

    opened by sudhanshu817 3
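
    A hedged note on the error above: initialized_variable_names is only assigned inside the if init_checkpoint: branch of model_fn, but the variable-logging loop reads it afterwards, so running without --init_checkpoint crashes. A minimal fix, mirroring what Google's original run scripts do, is to give it an empty default before the branch, roughly:

        initialized_variable_names = {}
        if init_checkpoint:
            (assignment_map, initialized_variable_names) = \
                modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
            tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
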
  • TPUEstimatorSpec.predictions must be dict of Tensors

    TPUEstimatorSpec.predictions must be dict of Tensors

    When running predict on Google Colab (to use TPU) the code crashes with the following error:

    TPUEstimatorSpec.predictions must be dict of Tensors.

    To solve it one can place the following code in create_model

    predict = tf.argmax(probabilities, axis=-1)
    predict_dict = {'predictions': predict}  # this way it is not shot down by check in TPUEstimatorSpec
    return loss, per_example_loss, logits, predict_dict
    

    This of course also means changing the interpretation of the result

    result = estimator.predict(input_fn=predict_input_fn)
    result = list(result)
    result = [pred['predictions'] for pred in result]
    

    Currently I'm unable to open a pull request since that would mean looking into whether it really is a solution. Just posting it here for anyone who has the same problem.

    opened by rikhuijzer 2
  • How to reproduce your results.

    How to reproduce your results.

    I use the same run command as yours, but I get worse results on the dev dataset.

    eval_f = 0.89656204
    eval_precision = 0.90508
    eval_recall = 0.88843685
    global_step = 653
    loss = 17.190592
    

    I use "BERT-Base, Multilingual Cased: 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters" as checkpoint, which is public by google at November 23rd, 2018.

    opened by ljch2018 1
  • --max_seq_length=128 -> 150

    --max_seq_length=128 -> 150

    Hi kyzhouhzau~

    Thank you for this project :) There is a minor error which I'd like to report.

    def convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer):
    ...
        input_ids = tokenizer.convert_tokens_to_ids(ntokens)
        input_mask = [1] * len(input_ids)
        while len(input_ids) < max_seq_length:
            input_ids.append(0)
            input_mask.append(0)
            segment_ids.append(0)
            label_ids.append(0)
        print('length check', len(input_ids), max_seq_length)
        assert len(input_ids) == max_seq_length  <-- error
        assert len(input_mask) == max_seq_length
        assert len(segment_ids) == max_seq_length
        assert len(label_ids) == max_seq_length
    ...
    

    tokenizer.convert_tokens_to_ids(ntokens) can generate a list longer than max_seq_length when we are using --max_seq_length=128.

    So I ran with --max_seq_length=150 and it was fine.

    opened by dsindex 1
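
    A hedged note on the report above: raising max_seq_length only hides the problem for longer sentences. A more robust fix is to truncate the WordPiece token and label lists before [CLS] and [SEP] are added, so the length assertions always hold. Roughly (variable names follow the snippet above; the exact code in the repo may differ):

        # Reserve two positions for the [CLS] and [SEP] markers.
        if len(tokens) > max_seq_length - 2:
            tokens = tokens[0:(max_seq_length - 2)]
            labels = labels[0:(max_seq_length - 2)]
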
  • absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --do_train before flags were parsed.

    absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --do_train before flags were parsed.

    When I try to train, I get the error: absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --do_train before flags were parsed. Can anyone help me? Thanks a lot.

    opened by YijianLiu 0
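
    A hedged note: this error usually means a FLAGS value is read at module import time, before tf.app.run() has parsed the command line (for example when flag-dependent code is copied to module level). Keeping all FLAGS accesses inside main() avoids it; a minimal pattern:

        import tensorflow as tf

        flags = tf.flags
        FLAGS = flags.FLAGS
        flags.DEFINE_bool("do_train", False, "Whether to run training.")

        def main(_):
            # Safe: tf.app.run() parses the flags before calling main().
            print("do_train =", FLAGS.do_train)

        if __name__ == "__main__":
            tf.app.run()
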
  • How to use BERT for ENTITY extraction from a Sequence without classification in the NER task?

    How to use BERT for ENTITY extraction from a Sequence without classification in the NER task?

    My requirement here is: given a sentence (sequence), I would like to just extract the entities present in the sequence without classifying them into a type in the NER task. I see that BertForTokenClassification for NER does the classification. Can this be adapted for just the extraction?

    Can you give me an idea of how to do entity extraction/identification using BERT?

    opened by ManojPrabhakar 0
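
    A hedged suggestion for the question above: one simple approach is to keep the token-classification setup but collapse every entity type into a single ENT class when reading the data, so the model only learns entity boundaries. A tiny preprocessing sketch (label names are the CoNLL ones; adapt to your label list):

        def collapse_entity_types(tag):
            """Map 'B-PER'/'I-ORG'/... to 'B-ENT'/'I-ENT', leaving 'O' untouched."""
            if tag == "O":
                return tag
            prefix, _ = tag.split("-", 1)
            return prefix + "-ENT"

        print([collapse_entity_types(t) for t in ["B-PER", "I-PER", "O", "B-LOC"]])
        # ['B-ENT', 'I-ENT', 'O', 'B-ENT']
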
  • grpc error

    grpc error

    When I use TensorFlow Serving over gRPC, the call response = stub.Predict(request, timeout) fails with this error:

        status = StatusCode.FAILED_PRECONDITION
        details = "Batched output tensor's 0th dimension does not equal the sum of the 0th dimension sizes of the input tensors"
        debug_error_string = "{"created":"@1558057036.708000000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Batched output tensor's 0th dimension does not equal the sum of the 0th dimension sizes of the input tensors","grpc_status":9}"

    opened by Reevens 0
  • Problems: runs very slowly when converting single example to feature

    Problems: runs very slowly when converting single example to feature

    I found that it costs too much time when running the convert_single_example function.

        time0 = time.time()
        feature = convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer, mode)
        time1 = time.time()
        print("time cost:", time1 - time0)

    The cost is up to a few seconds! time cost: 4.020495414733887

    In the convert_single_example function we can fix it by adding the following guard:

        if not os.path.exists('./output/label2id.pkl'):

    in front of

        with open('./output/label2id.pkl', 'wb') as w:
            pickle.dump(label_map, w)

    opened by jiangpinglei 0
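
    For completeness, a runnable sketch of the guard described above (assuming the ./output/label2id.pkl path used in the repo):

        import os
        import pickle

        # Write the label map only once instead of on every example.
        if not os.path.exists('./output/label2id.pkl'):
            with open('./output/label2id.pkl', 'wb') as w:
                pickle.dump(label_map, w)
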
  • Uncased or Cased?

    Uncased or Cased?

    Thanks for sharing the code for the NER task! May I know which model you used, cased or uncased? I am getting F1-dev 88.8 using the cased model and F1-dev 92.6 using the uncased model.

    opened by donovanOng 0
  • A new public notebook NER git

    A new public notebook NER git

    I came across some problems when using this code, so I made a Colab notebook NER model and put it at https://github.com/Hou-jing/NER-public. I think this will be easy to use.

    opened by Hou-jing 0
  • Use GPU for inference

    Use GPU for inference

    Hi, I'd like to use BERT-NER for inference, mainly to recognise ORG. I have been able to do so with CPU, now I'd like to know two things:

    1. Would GPU speed up inference ?

    2. Does BERT-NER automatically use the CPU? I tried it on Google Colab and I don't see any change in inference time. Please advise how, thanks!

    opened by gimseng 0
  • May I ask why the loss and epoch are not shown during training

    May I ask why the loss and epoch are not shown during training

    The log shown below is printed during training. How can I check the loss for each epoch? My training parameters are: python BERT_NER.py
    --task_name="NER"
    --do_lower_case=False
    --crf=True
    --do_train=True
    --do_eval=True
    --do_predict=True
    --data_dir=data
    --vocab_file=cased_L-12_H-768_A-12/vocab.txt
    --bert_config_file=cased_L-12_H-768_A-12/bert_config.json
    --init_checkpoint=cased_L-12_H-768_A-12/bert_model.ckpt
    --max_seq_length=128
    --train_batch_size=32
    --learning_rate=2e-5
    --num_train_epochs=4.0
    --output_dir=./output/result_dir

    The training log is as follows:

    INFO:tensorflow:global_step/sec: 2.44863
    I0326 15:59:02.876380 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.44863
    INFO:tensorflow:examples/sec: 78.3562
    I0326 15:59:02.876857 139718477698816 tpu_estimator.py:2308] examples/sec: 78.3562
    INFO:tensorflow:global_step/sec: 2.24293
    I0326 15:59:03.322147 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.24293
    INFO:tensorflow:examples/sec: 71.7736
    I0326 15:59:03.322539 139718477698816 tpu_estimator.py:2308] examples/sec: 71.7736
    INFO:tensorflow:global_step/sec: 2.31614
    I0326 15:59:03.753919 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.31614
    INFO:tensorflow:examples/sec: 74.1165
    I0326 15:59:03.754283 139718477698816 tpu_estimator.py:2308] examples/sec: 74.1165
    INFO:tensorflow:global_step/sec: 2.32764
    I0326 15:59:04.183665 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.32764
    INFO:tensorflow:examples/sec: 74.4845
    I0326 15:59:04.184118 139718477698816 tpu_estimator.py:2308] examples/sec: 74.4845
    INFO:tensorflow:global_step/sec: 2.34524
    I0326 15:59:04.610075 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.34524
    INFO:tensorflow:examples/sec: 75.0478
    I0326 15:59:04.610802 139718477698816 tpu_estimator.py:2308] examples/sec: 75.0478
    INFO:tensorflow:global_step/sec: 2.24035
    I0326 15:59:05.056344 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.24035
    INFO:tensorflow:examples/sec: 71.6911
    I0326 15:59:05.056849 139718477698816 tpu_estimator.py:2308] examples/sec: 71.6911
    INFO:tensorflow:global_step/sec: 2.53696
    I0326 15:59:05.450423 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.53696
    INFO:tensorflow:examples/sec: 81.1828
    I0326 15:59:05.450799 139718477698816 tpu_estimator.py:2308] examples/sec: 81.1828
    INFO:tensorflow:global_step/sec: 2.54311
    I0326 15:59:05.843605 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.54311
    INFO:tensorflow:examples/sec: 81.3794
    I0326 15:59:05.843933 139718477698816 tpu_estimator.py:2308] examples/sec: 81.3794
    INFO:tensorflow:global_step/sec: 2.25988
    I0326 15:59:06.286114 139718477698816 tpu_estimator.py:2307] global_step/sec: 2.25988
    INFO:tensorflow:examples/sec: 72.316
    I0326 15:59:06.286518 139718477698816 tpu_estimator.py:2308] examples/sec: 72.316
    ^CINFO:tensorflow:training_loop marked as finished

    opened by YijianLiu 1
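
    A hedged note on the question above: Estimator/TPUEstimator does not report a per-epoch loss by default; it only prints throughput at the logging interval. Two common workarounds are lowering log_step_count_steps in the run config, or attaching a LoggingTensorHook so the loss tensor is printed every N steps. Roughly (the tensor name "total_loss" is illustrative):

        # Inside model_fn, give the loss tensor a stable name:
        total_loss = tf.identity(total_loss, name="total_loss")

        # When training, ask the Estimator to print it every 100 steps:
        logging_hook = tf.train.LoggingTensorHook({"loss": "total_loss"}, every_n_iter=100)
        estimator.train(input_fn=train_input_fn, max_steps=num_train_steps, hooks=[logging_hook])
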
  • BERT_NER.py#L450 same code in if-else in both branches

    BERT_NER.py#L450 same code in if-else in both branches

    BERT_NER.py#L450

    if FLAGS.crf:
        (total_loss, logits, predicts) = create_model(bert_config, is_training, input_ids, mask,
                                                      segment_ids, label_ids, num_labels, use_one_hot_embeddings)
    else:
        (total_loss, logits, predicts) = create_model(bert_config, is_training, input_ids, mask,
                                                      segment_ids, label_ids, num_labels, use_one_hot_embeddings)

    The same code is for both conditions.

    opened by natasasdj 0
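
    As the report says, both branches call create_model with exactly the same arguments (the CRF/softmax switch presumably lives inside create_model via FLAGS.crf), so the conditional can simply be collapsed:

        (total_loss, logits, predicts) = create_model(
            bert_config, is_training, input_ids, mask, segment_ids,
            label_ids, num_labels, use_one_hot_embeddings)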