TensorFlow solution of the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus server/client services

Overview

BERT-BiLSTM-CRF-NER

TensorFlow solution of the NER task using a BiLSTM-CRF model with Google BERT fine-tuning

TensorFlow code for Chinese named entity recognition, using Google's BERT model on top of a BiLSTM-CRF model.

For Chinese documentation, see https://blog.csdn.net/macanv/article/details/85684284. If this project helps you, please give it a star, thanks!

Welcome to star this repository!

The Chinese training data ($PATH/NERdata/) comes from: https://github.com/zjy-ucas/ChineseNER

The CoNLL-2003 data ($PATH/NERdata/ori/) comes from: https://github.com/kyzhouhzau/BERT-NER

The evaluation code comes from: https://github.com/guillaumegenthial/tf_metrics/blob/master/tf_metrics/__init__.py

This project implements NER on top of Google's BERT code with a BiLSTM-CRF network. It is geared primarily toward Chinese data, but other languages require only minor code changes.

THIS PROJECT ONLY SUPPORTS Python 3.

Download project and install

You can install this project by:

pip install bert-base==0.0.9 -i https://pypi.python.org/simple

OR

git clone https://github.com/macanv/BERT-BiLSTM-CRF-NER
cd BERT-BiLSTM-CRF-NER/
python3 setup.py install

If you do not want to install the package, just clone this project and reference the files directly to train the model or start the service.

UPDATE:

  • 2020.2.6: added simple Flask NER service code
  • 2019.2.25: fixed some bugs in the NER service
  • 2019.2.19: added a text classification service
  • fixed a missing-loss error
  • added a label_list parameter to the training process, so you can use -label_list to specify labels during training

Train model:

You can use -help to view the relevant parameters for training the named entity recognition model; data_dir, bert_config_file, output_dir, init_checkpoint, and vocab_file must be specified.

bert-base-ner-train -help

The train/dev/test datasets look like this:

海 O
钓 O
比 O
赛 O
地 O
点 O
在 O
厦 B-LOC
门 I-LOC
与 O
金 B-LOC
门 I-LOC
之 O
间 O
的 O
海 O
域 O
。 O

Each line holds one token followed by its label, and sentences are separated by a blank line. The maximum sentence length is controlled by the max_seq_length parameter.
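For reference, here is a minimal parsing sketch (not part of this repo) that turns this format into (tokens, labels) pairs per sentence:

def read_conll(path):
    """Parse 'token label' lines; a blank line ends a sentence."""
    sentences, tokens, labels = [], [], []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:  # blank line marks a sentence boundary
                if tokens:
                    sentences.append((tokens, labels))
                    tokens, labels = [], []
                continue
            token, label = line.split()  # e.g. '厦 B-LOC'
            tokens.append(token)
            labels.append(label)
    if tokens:  # flush the last sentence if the file lacks a trailing blank line
        sentences.append((tokens, labels))
    return sentences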
You can get training data from the two repositories above.
You can train the NER model by running the command below:

bert-base-ner-train \
    -data_dir {your dataset dir}\
    -output_dir {training output dir}\
    -init_checkpoint {Google BERT model dir}\
    -bert_config_file {bert_config.json under the Google BERT model dir} \
    -vocab_file {vocab.txt under the Google BERT model dir}

For example, my init_checkpoint is:

init_checkpoint = F:\chinese_L-12_H-768_A-12\bert_model.ckpt

You can specify labels with the -label_list parameter; otherwise the project collects the labels from the training data.

# separated by commas
-label_list 'B-LOC, I-LOC ...'

Or save the labels in a file such as labels.txt, one label per line:

-label_list labels.txt
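An illustrative labels.txt for the default entity types might look like this (one label per line; use whatever label set matches your data):

B-LOC
I-LOC
B-PER
I-PER
B-ORG
I-ORG
O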

After training, the NER model will be saved in the {output_dir} you specified in the command line above.

My training environment: Tesla P40 with 24 GB of memory.

As Service

Much of the server and client code comes from an excellent open source project: hanxiao's bert-as-service. If my code violates any license agreement, please let me know and I will correct it right away. The NER server/client code can be adapted to other tasks, such as text categorization, with simple modifications; this project provides both Named Entity Recognition and Text Classification server services. You are welcome to submit requests or share your models.

You can use -help to view the relevant parameters of NER as Service; model_dir and bert_model_dir are required:

bert-base-serving-start -help

Then you can start the NER service with the command below:

bert-base-serving-start \
    -model_dir C:\workspace\python\BERT_Base\output\ner2 \
    -bert_model_dir F:\chinese_L-12_H-768_A-12 \
    -model_pb_dir C:\workspace\python\BERT_Base\model_pb_dir \
    -mode NER

or text classification service:

bert-base-serving-start \
    -model_dir C:\workspace\python\BERT_Base\output\ner2 \
    -bert_model_dir F:\chinese_L-12_H-768_A-12 \
    -model_pb_dir C:\workspace\python\BERT_Base\model_pb_dir \
    -mode CLASS \
    -max_seq_len 202

Parameter notes:

mode: if mode is NER or CLASS, the Named Entity Recognition or Text Classification service is started; if it is BERT, the service behaves the same as the [bert as service] project.

bert_model_dir: the directory of a pretrained BERT model, which you can download from https://github.com/google-research/bert

ner_model_dir: your NER model checkpoint directory.

model_pb_dir: the directory where the frozen model is saved; after the optimize function runs it will contain a binary file such as ner_model.pb.

You can download my NER model from https://pan.baidu.com/s/1m9VcueQ5gF-TJc00sFD88w (extraction code: guqq), or the text classification model from https://pan.baidu.com/s/1oFPsOUh1n5AM2HjDIo2XCw (extraction code: bbu8).
Put ner_model.pb/classification_model.pb into model_pb_dir and the other files into model_dir. Different models need to be stored separately: for example, put the NER model's label_list.pkl and label2id.pkl into model_dir/ner/ and the text classification files into model_dir/text_classification/. The text classification model can classify 12 categories of Chinese data: '游戏', '娱乐', '财经', '时政', '股票', '教育', '社会', '体育', '家居', '时尚', '房产', '彩票' (games, entertainment, finance, politics, stocks, education, society, sports, home, fashion, real estate, lottery).
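Based on the file names above, a possible on-disk layout would be (illustrative; adapt it to your own paths):

model_pb_dir/
    ner_model.pb
    classification_model.pb
model_dir/
    ner/                    # NER checkpoint plus label_list.pkl and label2id.pkl
    text_classification/    # text classification checkpoint and label files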

After the service starts, you will see its startup information in the console. You can test it with the client code below:

1. NER Client

import time
from bert_base.client import BertClient

with BertClient(show_server_config=False, check_version=False, check_length=False, mode='NER') as bc:
    start_t = time.perf_counter()
    text = '1月24日,新华社对外发布了中央对雄安新区的指导意见,洋洋洒洒1.2万多字,17次提到北京,4次提到天津,信息量很大,其实也回答了人们关心的很多问题。'
    rst = bc.encode([text, text])  # a batch of two identical sentences
    print('rst:', rst)
    print(time.perf_counter() - start_t)

After running the code above, you will see the recognized entities. If you want to customize the tokenization, you only need a simple change on the client side:

rst = bc.encode([list(text), list(text)], is_tokenized=True)

2. Text Classification Client

import time
from bert_base.client import BertClient

with BertClient(show_server_config=False, check_version=False, check_length=False, mode='CLASS') as bc:
    start_t = time.perf_counter()
    str1 = '北京时间2月17日凌晨,第69届柏林国际电影节公布主竞赛单元获奖名单,王景春、咏梅凭借王小帅执导的中国影片《地久天长》连夺最佳男女演员双银熊大奖,这是中国演员首次包揽柏林电影节最佳男女演员奖,为华语影片刷新纪录。与此同时,由青年导演王丽娜执导的影片《第一次的别离》也荣获了本届柏林电影节新生代单元国际评审团最佳影片,可以说,在经历数个获奖小年之后,中国电影在柏林影展再次迎来了高光时刻。'
    str2 = '受粤港澳大湾区规划纲要提振,港股周二高开,恒指开盘上涨近百点,涨幅0.33%,报28440.49点,相关概念股亦集体上涨,电子元件、新能源车、保险、基建概念多数上涨。粤泰股份、珠江实业、深天地A等10余股涨停;中兴通讯、丘钛科技、舜宇光学分别高开1.4%、4.3%、1.6%。比亚迪电子、比亚迪股份、光宇国际分别高开1.7%、1.2%、1%。越秀交通基建涨近2%,粤海投资、碧桂园等多股涨超1%。其他方面,日本软银集团股价上涨超0.4%,推动日经225和东证指数齐齐高开,但随后均回吐涨幅转跌东证指数跌0.2%,日经225指数跌0.11%,报21258.4点。受芯片制造商SK海力士股价下跌1.34%拖累,韩国综指下跌0.34%至2203.9点。澳大利亚ASX 200指数早盘上涨0.39%至6089.8点,大多数行业板块均现涨势。在保健品品牌澳佳宝下调下半财年的销售预期后,其股价暴跌超过23%。澳佳宝CEO亨弗里(Richard Henfrey)认为,公司下半年的利润可能会低于上半年,主要是受到销售额疲弱的影响。同时,亚市早盘澳洲联储公布了2月会议纪要,政策委员将继续谨慎评估经济增长前景,因前景充满不确定性的影响,稳定当前的利率水平比贸然调整利率更为合适,而且当前利率水平将有利于趋向通胀目标及改善就业,当前劳动力市场数据表现强势于其他经济数据。另一方面,经济增长前景亦令消费者消费意愿下滑,如果房价出现下滑,消费可能会进一步疲弱。在澳洲联储公布会议纪要后,澳元兑美元下跌近30点,报0.7120 。美元指数在昨日触及96.65附近的低点之后反弹至96.904。日元兑美元报110.56,接近上一交易日的低点。'
    str3 = '新京报快讯 据国家市场监管总局消息,针对媒体报道水饺等猪肉制品检出非洲猪瘟病毒核酸阳性问题,市场监管总局、农业农村部已要求企业立即追溯猪肉原料来源并对猪肉制品进行了处置。两部门已派出联合督查组调查核实相关情况,要求猪肉制品生产企业进一步加强对猪肉原料的管控,落实检验检疫票证查验规定,完善非洲猪瘟检测和复核制度,防止染疫猪肉原料进入食品加工环节。市场监管总局、农业农村部等部门要求各地全面落实防控责任,强化防控措施,规范信息报告和发布,对不按要求履行防控责任的企业,一旦发现将严厉查处。专家认为,非洲猪瘟不是人畜共患病,虽然对猪有致命危险,但对人没有任何危害,属于只传猪不传人型病毒,不会影响食品安全。开展猪肉制品病毒核酸检测,可为防控溯源工作提供线索。'
    rst = bc.encode([str1, str2, str3])
    print('rst:', rst)
    print('time used:{}'.format(time.perf_counter() - start_t))

After running the code above, you will see the classification results.

Note that the NER service and the Text Classification service cannot be started in a single process, but you can run the start command twice with different ports to run both services, as sketched below.
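For example, assuming the -port and -port_out flags inherited from bert-as-service (check bert-base-serving-start -help for the exact flag names), the two services could be started on illustrative port pairs like this:

# NER service on one port pair
bert-base-serving-start -model_dir {your ner model dir} -bert_model_dir {BERT model dir} -mode NER -port 5555 -port_out 5556

# text classification service on a different port pair
bert-base-serving-start -model_dir {your classification model dir} -bert_model_dir {BERT model dir} -mode CLASS -max_seq_len 202 -port 5560 -port_out 5561

Each client then connects with the matching port/port_out values.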

Flask server service

Sometimes a multi-threaded deep learning model service may not suit the C/S architecture; you can use a simple HTTP service instead, for example with Flask. See bert_base/server/simple_flask_http_service.py for a reference implementation of a simple HTTP server.
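As a rough illustration of the idea (a sketch only: the /ner endpoint and port below are made up and are not necessarily what simple_flask_http_service.py uses), a Flask wrapper around the client could look like:

from flask import Flask, jsonify, request
from bert_base.client import BertClient

app = Flask(__name__)
# one shared client talking to an already-running bert-base service
bc = BertClient(show_server_config=False, check_version=False,
                check_length=False, mode='NER')

@app.route('/ner', methods=['POST'])
def ner():
    # expects JSON like {"texts": ["sentence one", "sentence two"]}
    texts = request.get_json().get('texts', [])
    rst = bc.encode(texts)
    # the NER-mode client returns per-token labels; adjust the JSON
    # conversion here if the return type differs in your setup
    return jsonify({'result': list(rst)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8091)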

License

MIT.

The following tutorial is an old version and will be removed in the future.

How to train

1. Download the BERT Chinese model:

wget https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip  

2. Create the output directory

Create an output directory in the project path:

mkdir output

3. Train model

First method:
  python3 bert_lstm_ner.py \
                  --task_name="NER" \
                  --do_train=True \
                  --do_eval=True \
                  --do_predict=True \
                  --data_dir=NERdata \
                  --vocab_file=checkpoint/vocab.txt \
                  --bert_config_file=checkpoint/bert_config.json \
                  --init_checkpoint=checkpoint/bert_model.ckpt \
                  --max_seq_length=128 \
                  --train_batch_size=32 \
                  --learning_rate=2e-5 \
                  --num_train_epochs=3.0 \
                  --output_dir=./output/result_dir/
Second method: replace the BERT path and project path in bert_lstm_ner.py:
if os.name == 'nt':  # Windows path config
    bert_path = '{your BERT model path}'
    root_path = '{project path}'
else:  # Linux path config
    bert_path = '{your BERT model path}'
    root_path = '{project path}'

Then run:

python3 bert_lstm_ner.py

USING BiLSTM-CRF OR ONLY CRF FOR DECODING!

Just alter line 450 of bert_lstm_ner.py: set the crf_only parameter of the add_blstm_crf_layer function to True or False.

ONLY CRF output layer:

    blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size, cell_type=FLAGS.cell, num_layers=FLAGS.num_layers,
                          dropout_rate=FLAGS.droupout_rate, initializers=initializers, num_labels=num_labels,
                          seq_length=max_seq_length, labels=labels, lengths=lengths, is_training=is_training)
    rst = blstm_crf.add_blstm_crf_layer(crf_only=True)

BiLSTM with CRF output layer

    blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size, cell_type=FLAGS.cell, num_layers=FLAGS.num_layers,
                          dropout_rate=FLAGS.droupout_rate, initializers=initializers, num_labels=num_labels,
                          seq_length=max_seq_length, labels=labels, lengths=lengths, is_training=is_training)
    rst = blstm_crf.add_blstm_crf_layer(crf_only=False)

Result:

All parameters use their default values.

In the dev data set: (results screenshot omitted)

In the test data set: (results screenshot omitted)

Entity-level result: (results screenshot omitted)

The first two results are label-level; the entity-level result is computed at lines 796-798 of the code and is printed during the predict process.

My model can be downloaded from Baidu cloud:
Link: https://pan.baidu.com/s/1GfDFleCcTv5393ufBYdgqQ extraction code: 4cus
NOTE: my model was trained with the crf_only option.

ONLINE PREDICT

Once the model has finished training, just run:

python3 terminal_predict.py

Using NER as Service

Service

Using NER as Service is simple, you just need to run the python script below in the project root path:

python3 runs.py \
    -mode NER \
    -bert_model_dir /home/macan/ml/data/chinese_L-12_H-768_A-12 \
    -ner_model_dir /home/macan/ml/data/bert_ner \
    -model_pd_dir /home/macan/ml/workspace/BERT_Base/output/predict_optimizer \
    -num_worker 8

You can download my NER model from https://pan.baidu.com/s/1m9VcueQ5gF-TJc00sFD88w (extraction code: guqq).
Put ner_model.pb into model_pd_dir, put the other files into ner_model_dir, and then run the command above.

Client

For client usage, you can reference the client_test.py script:

import time
from client.client import BertClient

ner_model_dir = r'C:\workspace\python\BERT_Base\output\predict_ner'  # raw string avoids backslash escape issues on Windows
with BertClient( ner_model_dir=ner_model_dir, show_server_config=False, check_version=False, check_length=False, mode='NER') as bc:
    start_t = time.perf_counter()
    text = '1月24日,新华社对外发布了中央对雄安新区的指导意见,洋洋洒洒1.2万多字,17次提到北京,4次提到天津,信息量很大,其实也回答了人们关心的很多问题。'
    rst = bc.encode([text])
    print('rst:', rst)
    print(time.perf_counter() - start_t)

NOTE: for the input format, you can also reference the bert-as-service project.
Contributions of client code in other languages, such as Java, are welcome.

Using your own data to train

If you want to train the NER model on your own data, just modify the get_labels function:

def get_labels(self):
    return ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]

NOTE: "X", “[CLS]”, “[SEP]” These three are necessary, you just replace your data label to this return list.
Or you can use last code lets the program automatically get the label from training data

def get_labels(self):
    # Collecting the labels by reading the train file carries some risk.
    if os.path.exists(os.path.join(FLAGS.output_dir, 'label_list.pkl')):
        with codecs.open(os.path.join(FLAGS.output_dir, 'label_list.pkl'), 'rb') as rf:
            self.labels = pickle.load(rf)
    else:
        if len(self.labels) > 0:
            self.labels = self.labels.union(set(["X", "[CLS]", "[SEP]"]))
            with codecs.open(os.path.join(FLAGS.output_dir, 'label_list.pkl'), 'wb') as rf:
                pickle.dump(self.labels, rf)
        else:
            self.labels = ["O", 'B-TIM', 'I-TIM', "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]
    return self.labels

NEW UPDATE

2019.1.30: Supported pip install and command line control.

2019.1.30: Added Service/Client for the NER process.

2019.1.9: Added code to remove the Adam-related parameters from the model, reducing the model file size from 1.3 GB to 400 MB.

2019.1.3: Added online predict code.


For any problem, please open an issue or email me ([email protected]).

Comments
  • Running bert-base-serving-start: exception when generating the PB file

    Hello: today I retrained a model (with the same corpus), and when starting the service the PB file generation failed. The error message is as follows: Use standard file APIs to check for files with this prefix. E:NER_MODEL, Lodding...:[gra:opt:306]:fail to optimize the graph! Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

    Assign requires shapes of both tensors to match. lhs shape= [44,44] rhs shape= [45,45] [[node save/Assign_199 (defined at /usr/local/lib/python3.6/dist-packages/bert_base/server/graph.py:291) ]]

    Caused by op 'save/Assign_199', defined at: File "/usr/local/bin/bert-base-serving-start", line 10, in sys.exit(start_server()) File "/usr/local/lib/python3.6/dist-packages/bert_base/runs/init.py", line 17, in start_server server = BertServer(args) File "/usr/local/lib/python3.6/dist-packages/bert_base/server/init.py", line 92, in init with Pool(processes=1) as pool: File "/usr/lib/python3.6/multiprocessing/pool.py", line 175, in init self._repopulate_pool() File "/usr/lib/python3.6/multiprocessing/pool.py", line 236, in _repopulate_pool self._wrap_exception) File "/usr/lib/python3.6/multiprocessing/pool.py", line 255, in _repopulate_pool_static w.start() File "/usr/lib/python3.6/multiprocessing/process.py", line 105, in start self._popen = self._Popen(self) File "/usr/lib/python3.6/multiprocessing/context.py", line 277, in _Popen return Popen(process_obj) File "/usr/lib/python3.6/multiprocessing/popen_fork.py", line 19, in init self._launch(process_obj) File "/usr/lib/python3.6/multiprocessing/popen_fork.py", line 73, in _launch code = process_obj._bootstrap() File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap self.run() File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run self._target(*self._args, **self._kwargs) File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker result = (True, func(*args, **kwds)) File "/usr/local/lib/python3.6/dist-packages/bert_base/server/graph.py", line 291, in optimize_ner_model saver = tf.train.Saver() File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 832, in init self.build() File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 844, in build self._build(self._filename, build_save=True, build_restore=True) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 881, in _build build_save=build_save, build_restore=build_restore) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 513, in _build_internal restore_sequentially, reshape) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 354, in _AddRestoreOps assign_ops.append(saveable.restore(saveable_tensors, shapes)) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saving/saveable_object_util.py", line 73, in restore self.op.get_shape().is_fully_defined()) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/state_ops.py", line 223, in assign validate_shape=validate_shape) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 64, in assign use_locking=use_locking, name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func return func(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1801, in init self._traceback = tf_stack.extract_stack()


    Traceback (most recent call last): File "/usr/local/bin/bert-base-serving-start", line 10, in sys.exit(start_server()) File "/usr/local/lib/python3.6/dist-packages/bert_base/runs/init.py", line 17, in start_server server = BertServer(args) File "/usr/local/lib/python3.6/dist-packages/bert_base/server/init.py", line 102, in init raise FileNotFoundError('graph optimization fails and returns empty result') FileNotFoundError: graph optimization fails and returns empty result

    opened by huangxz 18
  • f1 and related metrics are all 0 during training

    Hello, author! I have read many issues where f1, precision, etc. are 0 at prediction time, but for me these values are always 0 during training; may I ask why? (screenshot omitted) My run parameters are: bert-base-ner-train -data_dir=/data/xiang/ccks-NER -output_dir /data/xiang/ccks-NER/output/ -do_train=True -do_eval=True -do_predict=True -vocab_file=chinese_L-12_H-768_A-12/vocab.txt -bert_config_file=chinese_L-12_H-768_A-12/bert_config.json -init_checkpoint=chinese_L-12_H-768_A-12/bert_model.ckpt -max_seq_length=32 -batch_size=64 -learning_rate=1e-3 -num_train_epochs=3.0 -device_map=0,1 -lstm_size=32. The dataset is Chinese question-answering data with more than 27,000 questions. I tried 0/1 labeling of the entities as well as BIEO labeling (and manually changed the labels list in bert_lstm_ner.py), but the results never changed: the predicted labels are all 0 (or O), and the final output never changed either, even after modifying lstm_size, batch_size, and epochs. Is it because my data is too small? How exactly should I fix it? I hope you can tell me the reason.

    opened by xiangwf 16
  • With my own new entity labels, prediction accuracy is very low, and predictions contain mismatched B- and I- tags

    1. The dataset is not large, only 4000+ sentences. 2. The content of labels.txt is shown in a screenshot (omitted); I am not sure whether "X", "[CLS]", and "[SEP]" must be added. 3. The run command is shown in a screenshot (omitted). 4. Results on the test set are very poor. 5. Inspecting label_test.txt reveals the problem (screenshots omitted).

    I do not know what causes this situation. Thank you.

    opened by xmy7216 15
  • BUG in function project_crf_layer when using BERT + CRF without Bi-LSTM

    In the function project_crf_layer, the tanh activation function is not needed. On the contrary, it leads to bad performance.

    Many users report unexpected results with this repo's code; I think this may be the cause.

    Somebody check it!

    opened by passerbythesun 10
  • Questions about training my own model

    Suppose I want to run NER for plant and animal nouns; roughly how much corpus would the training set need?

    I annotated plant names (B-PLANT, I-PLANT) and animal names (B-AN, I-AN) in a dozen or so articles, appended them to NERdata/train.txt, and trained a model with bert_lstm_ner.py. When I then used the model to extract plant and animal names, it could recognize ORG, LOC, and PER, but not PLANT or AN. I am not sure where the problem lies; is the training set too small?

    opened by 2efPer 10
  • Works quite well: person-name recognition precision reached 97%

    (results screenshot omitted)

    Next I plan to train on my own corpus for vertical-domain entity recognition: singers, song titles, authors, and book titles in the music and reading domains. I previously used a pure CRF, but it handled some web authors and very long song/book titles poorly. This model recognizes normal, traditional Chinese person names well; recognition of web-novel authors and sentence-like song titles remains to be verified.

    By the way, if I want to recognize singer and song names in mixed Chinese-English sentences, would it be feasible to train with data like the following, and would English tokens fail to be recognized? "我要听韩红的青藏高原" (O O O B-PER I-PER O B-SNG I-SNG I-SNG I-SNG), "我要听Adele的Rolling in the deep" (O O O B-PER O B-SNG I-SNG I-SNG I-SNG)

    Also, when running your model I noticed that training and evaluation run together, but that API seems to have a problem: it reports that stop_if_no_decrease_hook does not exist. I later separated the training and evaluation steps. My environment is tensorflow 1.14.0 + cuda 10.0.

    opened by zongyanke 8
  • Error when training the model

    InvalidArgumentError (see above for traceback): indices[0,1] = 196 is not in [0, 169) [[Node: crf_loss/GatherV2_1 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](crf_loss/Reshape_3, crf_loss/add_2, crf_loss/GatherV2/axis)]]

    opened by zhangjiantong 7
  • [BUG] blstm_layer implementation code may have a bug

        def blstm_layer(self, embedding_chars):
            """
    
            :return:
            """
            with tf.variable_scope('rnn_layer'):
                cell_fw, cell_bw = self._bi_dir_rnn()
                if self.num_layers > 1:
                    cell_fw = rnn.MultiRNNCell([cell_fw] * self.num_layers, state_is_tuple=True)
                    cell_bw = rnn.MultiRNNCell([cell_bw] * self.num_layers, state_is_tuple=True)
    
                outputs, _ = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, embedding_chars,
                                                             dtype=tf.float32)
                outputs = tf.concat(outputs, axis=2)
            return outputs
    

    The above code uses [cell_fw] * self.num_layers to create multiple cells. This causes a problem: each element of the resulting list refers to the same object and shares the same parameters.

    The problem is illustrated below:

    a = []
    b = [a] * 3
    print(b)  # will output [[], [], []]
    a.append(1)
    print(b) # will output [[1], [1], [1]]
    assert id(a) == id(b[1])
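
    A straightforward fix (a sketch, assuming _bi_dir_rnn returns a (forward, backward) cell pair as in the snippet above) is to build a fresh cell pair for every layer so that no parameters are shared:

    # sketch of a fix: construct new cell objects per layer instead of
    # repeating the same object with [cell_fw] * self.num_layers
    if self.num_layers > 1:
        cells = [self._bi_dir_rnn() for _ in range(self.num_layers)]
        cell_fw = rnn.MultiRNNCell([fw for fw, _ in cells], state_is_tuple=True)
        cell_bw = rnn.MultiRNNCell([bw for _, bw in cells], state_is_tuple=True)
    else:
        cell_fw, cell_bw = self._bi_dir_rnn()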
    
    opened by starplanet 7
  • English CoNLL-2003 reports that the training set is too small? How can the metrics on this dataset be verified?

    Running with:

    bert-base-ner-train
    -data_dir {your dataset dir}
    -output_dir {training output dir}
    -init_checkpoint {Google BERT model dir}
    -bert_config_file {bert_config.json under the Google BERT model dir}
    -vocab_file {vocab.txt under the Google BERT model dir}

    raises the error:

    File "D:\projects\BERT-BiLSTM-CRF-NER\bert_base\train\bert_lstm_ner.py", line 570, in train raise AttributeError('training data is so small...') Running with python run.py instead produces an exhausted-resource error.

    I would like to ask: if I want to verify this code's accuracy and F1 on the dataset, how should I run it?

    opened by Nicozwy 7
  • Bug report: using tanh for logits is wrong

    https://github.com/macanv/BERT-BiLSTM-CRF-NER/blob/c1aa61d9c095e621dd10c6756a1ae651eb1d54f6/bert_base/train/lstm_crf_layer.py#L147

    Generally, the above line computes the logit for each tag, but applying the tanh function is wrong: it maps the logits to the interval [-1, 1], which hurts the Viterbi decoding and the model performance. At first I ran my model with the above line, and the results indicated that softmax was better than CRF, which is counterintuitive. After removing the tanh, the results looked more reasonable, with CRF surpassing softmax.
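
    In other words, the fix is to drop the squashing nonlinearity (a sketch, assuming the variable names used around that line of lstm_crf_layer.py):

    # before: pred = tf.tanh(tf.nn.xw_plus_b(output, W, b))
    # after: leave the logits unbounded, as CRF/Viterbi scoring expects
    pred = tf.nn.xw_plus_b(output, W, b)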

    opened by canjiali 7
  • precision, recall, and FB1 are all 0 in the results

    Hello. After downloading the code, I annotated my own training data and modified the self.labels line of the get_labels code shown above (the version that reads label_list.pkl or falls back to a default list), replacing the default list with my own labels: self.labels = ["O", 'B-TECH', 'I-TECH', "E-TECH", "B-APP", "I-APP", "E-APP", "B-PRO", "I-PRO", "E-PRO", "X", "[CLS]", "[SEP]"]. My data is annotated with exactly these labels. I moved the startup code from run.py into bert_lstm_ner.py and launched it as follows: nohup python3.6 /dir/BERT-BiLSTM-CRF-NER-master/bert_base/train/bert_lstm_ner.py -ner='ner' -data_dir=/dir/data/ -vocab_file=/dir/chinese_L-12_H-768_A-12/vocab.txt -bert_config_file=/dir/chinese_L-12_H-768_A-12/bert_config.json -init_checkpoint=/dir/chinese_L-12_H-768_A-12/bert_model.ckpt -max_seq_length=128 -learning_rate=2e-5 -num_train_epochs=10.0 -output_dir=/dir/output/result_dir/ -batch_size=32 &

    The result after running: shape of input_ids (?, 128) (printed three times); processed 2785945 tokens with 12273 phrases; found: 0 phrases; correct: 0. accuracy: 98.34%; precision: 0.00%; recall: 0.00%; FB1: 0.00; APP: precision 0.00%, recall 0.00%, FB1 0.00; PRO: precision 0.00%, recall 0.00%, FB1 0.00; TECH: precision 0.00%, recall 0.00%, FB1 0.00.

    What could be the possible cause of this? Any advice would be greatly appreciated, thanks!

    opened by cwj 7
  • What exactly is the training procedure?

    I am training your model on my own data. Is the training procedure:

    1. random initialization -> fine-tuning with a fixed learning rate, or
    2. freezing the BERT part and training the BiLSTM-CRF part -> fine-tuning the whole network with a small learning rate?

    I ask because at test time it seems the original BERT representation is used.

    opened by zacharykzhao 0
  • GPU memory is almost full, but utilization is very low

    No matter how large the dataset is, more than 23000 MB of the card's 24000 MB of memory is occupied (and this number never changes). What is going on? I tried all kinds of approaches and none helped: 1. tf.data.TFRecordDataset.cache(), 2. tf.data.TFRecordDataset.shard, 3. splitting the tf_record into multiple files for reading, 4. changing epoch and batch_size to 1 and 16. In the end I could only use session_config.gpu_options.per_process_gpu_memory_fraction. @macanv please advise!

    opened by zhishui3 0