The script at CPT/blob/master/finetune/generation/run_gen.py is the CPT version.
I adapted it into a BART version myself, but when loading the checkpoint it reports that many layers were not used or not initialized:

Some weights of the model checkpoint at model/bart-base-chinese were not used when initializing BartForConditionalGeneration
Some weights of BartForConditionalGeneration were not initialized

I am not sure whether these warnings affect the results. Alternatively, could you provide a BART version of run_gen.py?
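For reference, the model is loaded with the standard from_pretrained call (a minimal sketch of my adapted script; the path is the local checkpoint directory that appears in the log below):

```python
from transformers import BartForConditionalGeneration

# Minimal sketch of the loading call in my adapted run_gen.py;
# "model/bart-base-chinese" is a local directory holding pytorch_model.bin.
model = BartForConditionalGeneration.from_pretrained("model/bart-base-chinese")
```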
The full log is as follows:
```
loading weights file model/bart-base-chinese/pytorch_model.bin
Some weights of the model checkpoint at model/bart-base-chinese were not used when initializing BartForConditionalGeneration: ['encoder.layers.4.fc1.bias',
'encoder.layers.0.self_attn.k_proj.bias',
'encoder.layers.3.fc1.bias',
'encoder.layers.4.fc1.weight',
'encoder.layers.1.final_layer_norm.bias',
'encoder.layers.0.fc2.weight',
'encoder.layers.0.self_attn.out_proj.bias',
'encoder.layers.1.self_attn.out_proj.weight',
'encoder.layers.3.self_attn.k_proj.bias',
'encoder.layernorm_embedding.weight',
'encoder.layers.1.fc2.weight',
'encoder.layers.5.self_attn.q_proj.weight',
'encoder.layers.5.self_attn.q_proj.bias',
'encoder.layers.0.final_layer_norm.weight',
'encoder.layers.1.self_attn.v_proj.weight',
'encoder.layers.4.self_attn.out_proj.weight',
'encoder.layers.5.self_attn_layer_norm.bias',
'encoder.layers.0.self_attn_layer_norm.bias',
'encoder.layers.3.self_attn.k_proj.weight',
'encoder.embed_tokens.weight',
'encoder.layers.1.self_attn.v_proj.bias',
'encoder.layers.5.final_layer_norm.bias',
'encoder.layers.1.fc1.weight',
'encoder.layers.5.self_attn_layer_norm.weight',
'encoder.layers.2.fc1.weight',
'encoder.layers.0.final_layer_norm.bias',
'encoder.layers.1.fc2.bias',
'encoder.layers.3.self_attn.v_proj.weight',
'encoder.layers.3.final_layer_norm.bias',
'encoder.layers.2.fc1.bias',
'encoder.layers.3.self_attn.q_proj.weight',
'encoder.layers.1.final_layer_norm.weight',
'encoder.layers.4.fc2.bias',
'encoder.layers.4.self_attn.out_proj.bias',
'encoder.layers.2.self_attn.q_proj.weight',
'encoder.layers.2.final_layer_norm.weight',
'encoder.embed_positions.weight',
'encoder.layers.3.self_attn.out_proj.bias',
'encoder.layers.3.fc1.weight',
'encoder.layers.1.fc1.bias',
'encoder.layers.0.self_attn.k_proj.weight',
'encoder.layers.1.self_attn.k_proj.bias',
'encoder.layers.0.fc2.bias',
'encoder.layers.1.self_attn.k_proj.weight',
'encoder.layers.5.self_attn.v_proj.bias',
'encoder.layers.1.self_attn.q_proj.weight',
'encoder.layers.2.final_layer_norm.bias',
'encoder.layers.4.self_attn_layer_norm.weight',
'encoder.layers.4.self_attn.v_proj.bias',
'encoder.layers.2.self_attn_layer_norm.weight',
'encoder.layers.0.fc1.weight',
'encoder.layers.4.self_attn.k_proj.bias',
'encoder.layers.0.self_attn.q_proj.bias',
'encoder.layers.4.final_layer_norm.bias',
'encoder.layers.0.self_attn.v_proj.weight',
'encoder.layers.3.final_layer_norm.weight',
'encoder.layers.5.self_attn.out_proj.weight',
'encoder.layers.4.self_attn.q_proj.weight',
'encoder.layers.0.self_attn_layer_norm.weight',
'encoder.layers.5.self_attn.v_proj.weight',
'encoder.layers.2.self_attn.v_proj.weight',
'encoder.layers.1.self_attn.out_proj.bias',
'encoder.layers.2.self_attn.k_proj.bias',
'encoder.layers.2.self_attn.out_proj.weight',
'encoder.layers.3.self_attn.v_proj.bias',
'encoder.layers.2.self_attn.q_proj.bias',
'encoder.layers.2.self_attn.out_proj.bias',
'encoder.layers.3.fc2.bias',
'encoder.layers.5.fc1.weight',
'encoder.layernorm_embedding.bias',
'encoder.layers.0.fc1.bias',
'encoder.layers.3.self_attn_layer_norm.bias',
'encoder.layers.5.self_attn.k_proj.weight',
'encoder.layers.5.fc1.bias',
'encoder.layers.3.fc2.weight',
'encoder.layers.4.fc2.weight',
'encoder.layers.0.self_attn.v_proj.bias',
'encoder.layers.0.self_attn.q_proj.weight',
'encoder.layers.1.self_attn.q_proj.bias',
'encoder.layers.3.self_attn_layer_norm.weight',
'encoder.layers.2.self_attn.k_proj.weight',
'encoder.layers.2.self_attn.v_proj.bias',
'encoder.layers.5.final_layer_norm.weight',
'encoder.layers.5.self_attn.out_proj.bias',
'encoder.layers.0.self_attn.out_proj.weight',
'encoder.layers.5.fc2.weight',
'encoder.layers.5.fc2.bias',
'encoder.layers.1.self_attn_layer_norm.bias',
'encoder.layers.4.self_attn.k_proj.weight',
'encoder.layers.5.self_attn.k_proj.bias',
'encoder.layers.3.self_attn.q_proj.bias',
'encoder.layers.4.self_attn.q_proj.bias',
'encoder.layers.1.self_attn_layer_norm.weight',
'encoder.layers.2.self_attn_layer_norm.bias',
'encoder.layers.4.final_layer_norm.weight',
'encoder.layers.4.self_attn.v_proj.weight',
'encoder.layers.2.fc2.weight',
'encoder.layers.2.fc2.bias',
'encoder.layers.4.self_attn_layer_norm.bias',
'encoder.layers.3.self_attn.out_proj.weight']
- This IS expected if you are initializing BartForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BartForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at model/bart-base-chinese and are newly initialized:
['encoder.encoder.layer.1.output.dense.bias',
'encoder.encoder.layer.3.attention.self.key.bias',
'encoder.encoder.layer.3.attention.output.LayerNorm.weight',
'encoder.encoder.layer.4.attention.self.value.bias',
'encoder.encoder.layer.2.attention.output.dense.bias',
'encoder.encoder.layer.4.output.LayerNorm.bias',
'encoder.encoder.layer.4.output.LayerNorm.weight',
'encoder.encoder.layer.4.attention.output.LayerNorm.weight',
'encoder.encoder.layer.0.intermediate.dense.bias',
'encoder.encoder.layer.5.attention.output.LayerNorm.weight',
'encoder.encoder.layer.0.output.LayerNorm.bias',
'encoder.encoder.layer.5.attention.output.LayerNorm.bias',
'encoder.encoder.layer.2.attention.output.LayerNorm.weight',
'encoder.encoder.layer.2.attention.self.key.weight',
'encoder.embeddings.LayerNorm.weight',
'encoder.encoder.layer.0.attention.output.LayerNorm.weight',
'encoder.encoder.layer.1.attention.self.key.bias',
'encoder.encoder.layer.3.intermediate.dense.weight',
'encoder.encoder.layer.5.intermediate.dense.weight',
'encoder.encoder.layer.0.output.dense.weight',
'encoder.encoder.layer.5.output.LayerNorm.bias',
'encoder.encoder.layer.1.output.dense.weight',
'encoder.encoder.layer.5.attention.self.query.weight',
'encoder.encoder.layer.1.output.LayerNorm.weight',
'encoder.encoder.layer.4.attention.self.key.bias',
'encoder.encoder.layer.3.output.LayerNorm.bias',
'encoder.encoder.layer.5.output.dense.bias',
'encoder.encoder.layer.4.attention.self.key.weight',
'encoder.encoder.layer.0.attention.self.key.bias',
'encoder.encoder.layer.0.attention.self.query.weight',
'encoder.encoder.layer.0.intermediate.dense.weight',
'encoder.encoder.layer.3.output.LayerNorm.weight',
'encoder.encoder.layer.3.attention.output.dense.bias',
'encoder.encoder.layer.5.output.dense.weight',
'encoder.embeddings.LayerNorm.bias',
'encoder.encoder.layer.1.attention.self.value.weight',
'encoder.encoder.layer.2.output.dense.weight',
'encoder.encoder.layer.4.intermediate.dense.weight',
'encoder.encoder.layer.2.attention.self.value.weight',
'encoder.encoder.layer.0.attention.self.value.weight',
'encoder.encoder.layer.0.attention.output.dense.bias',
'encoder.encoder.layer.2.attention.output.LayerNorm.bias',
'encoder.encoder.layer.3.output.dense.bias',
'encoder.encoder.layer.5.output.LayerNorm.weight',
'encoder.encoder.layer.5.attention.output.dense.bias',
'encoder.encoder.layer.4.attention.self.value.weight',
'encoder.encoder.layer.3.attention.self.query.bias',
'encoder.encoder.layer.3.attention.self.value.weight',
'encoder.encoder.layer.3.attention.self.key.weight',
'encoder.encoder.layer.0.output.dense.bias',
'encoder.encoder.layer.1.intermediate.dense.bias',
'encoder.encoder.layer.0.attention.self.query.bias',
'encoder.encoder.layer.1.intermediate.dense.weight',
'encoder.encoder.layer.0.attention.output.dense.weight',
'encoder.encoder.layer.5.attention.self.value.bias',
'encoder.embeddings.token_type_embeddings.weight',
'encoder.encoder.layer.1.attention.output.dense.weight',
'encoder.encoder.layer.2.attention.self.query.bias',
'encoder.encoder.layer.2.attention.self.query.weight',
'encoder.encoder.layer.2.attention.output.dense.weight',
'encoder.encoder.layer.5.attention.self.query.bias',
'encoder.embeddings.position_ids',
'encoder.embeddings.position_embeddings.weight',
'encoder.encoder.layer.3.attention.self.query.weight',
'encoder.embeddings.word_embeddings.weight',
'encoder.encoder.layer.4.output.dense.bias',
'encoder.encoder.layer.1.attention.output.LayerNorm.weight',
'encoder.encoder.layer.4.attention.self.query.bias',
'encoder.encoder.layer.3.attention.self.value.bias',
'encoder.encoder.layer.5.intermediate.dense.bias',
'encoder.encoder.layer.1.output.LayerNorm.bias',
'encoder.encoder.layer.3.attention.output.dense.weight',
'encoder.encoder.layer.3.attention.output.LayerNorm.bias',
'encoder.encoder.layer.2.output.LayerNorm.weight',
'encoder.encoder.layer.4.attention.output.dense.weight',
'encoder.encoder.layer.4.intermediate.dense.bias',
'encoder.encoder.layer.2.attention.self.value.bias',
'encoder.encoder.layer.0.attention.self.key.weight',
'encoder.encoder.layer.1.attention.self.query.weight',
'encoder.encoder.layer.2.intermediate.dense.bias',
'encoder.encoder.layer.2.intermediate.dense.weight',
'encoder.encoder.layer.5.attention.self.key.bias',
'encoder.encoder.layer.2.attention.self.key.bias',
'encoder.encoder.layer.2.output.LayerNorm.bias',
'encoder.encoder.layer.5.attention.self.key.weight',
'encoder.encoder.layer.0.attention.output.LayerNorm.bias',
'encoder.encoder.layer.5.attention.self.value.weight',
'encoder.encoder.layer.4.attention.output.dense.bias',
'encoder.encoder.layer.1.attention.output.LayerNorm.bias',
'encoder.encoder.layer.1.attention.output.dense.bias',
'encoder.encoder.layer.5.attention.output.dense.weight',
'encoder.encoder.layer.4.output.dense.weight',
'encoder.encoder.layer.0.attention.self.value.bias',
'encoder.encoder.layer.1.attention.self.value.bias',
'encoder.encoder.layer.0.output.LayerNorm.weight',
'encoder.encoder.layer.1.attention.self.key.weight',
'encoder.encoder.layer.3.intermediate.dense.bias',
'encoder.encoder.layer.1.attention.self.query.bias',
'encoder.encoder.layer.4.attention.self.query.weight',
'encoder.encoder.layer.3.output.dense.weight',
'encoder.encoder.layer.2.output.dense.bias',
'encoder.encoder.layer.4.attention.output.LayerNorm.bias']
```
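If it helps with debugging, here is a minimal sketch (my own diagnostic, not part of run_gen.py) that compares the key names stored on disk against the ones the model class expects; this is how I confirmed the mismatch behind the two lists above:

```python
import torch
from transformers import BartForConditionalGeneration

# Keys stored in the checkpoint file on disk
# (BART-style names such as encoder.layers.0.self_attn.k_proj.bias).
ckpt_keys = set(torch.load("model/bart-base-chinese/pytorch_model.bin",
                           map_location="cpu").keys())

# Keys the instantiated model actually expects
# (in my run, BERT-style names such as
# encoder.encoder.layer.0.attention.self.key.bias).
model = BartForConditionalGeneration.from_pretrained("model/bart-base-chinese")
model_keys = set(model.state_dict().keys())

print("checkpoint-only (reported as not used):", len(ckpt_keys - model_keys))
print("model-only (reported as newly initialized):", len(model_keys - ckpt_keys))
print(sorted(ckpt_keys - model_keys)[:3])
print(sorted(model_keys - ckpt_keys)[:3])
```

In my case the two difference sets match the warning lists exactly: the checkpoint stores BART-style encoder names (encoder.layers.N.self_attn.*), while the model class expects BERT-style names (encoder.encoder.layer.N.attention.*), so it looks like none of the encoder weights are actually loaded.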