Hello, I'm trying to reproduce the paper results with RoBERTa. For both roberta-base and roberta-large, after cloning the repositories into pretrained_lm, I receive a warning which suggests that none of the weights are being loaded.
The md5sum for the roberta-base pytorch_model.bin is 73db58b6c51b028e0ee031f12261b51d
The md5sum for the roberta-large pytorch_model.bin is 234a3b27e09d3486d7719c66ba1aaa31
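For anyone who wants to compare their own download against the two checksums above, here is a minimal sketch that computes the same MD5 digest in Python. The directory layout (pretrained_lm/roberta-base and pretrained_lm/roberta-large) is an assumption based on where I cloned the repositories, and md5_of_file and REPORTED are names made up for this example. The values in REPORTED are only the ones I observed, not known-good reference checksums.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksums reported in this post. The paths are an assumed layout, and the
# digests are what I observed locally, not verified reference values.
REPORTED = {
    "pretrained_lm/roberta-base/pytorch_model.bin": "73db58b6c51b028e0ee031f12261b51d",
    "pretrained_lm/roberta-large/pytorch_model.bin": "234a3b27e09d3486d7719c66ba1aaa31",
}


def compare_with_reported(reported=REPORTED):
    """Print whether each local file matches the digest quoted in this post."""
    for path, quoted in reported.items():
        if not Path(path).is_file():
            print(f"{path}: file not found")
            continue
        actual = md5_of_file(path)
        status = "same as reported" if actual == quoted else "different"
        print(f"{path}: {actual} ({status})")


if __name__ == "__main__":
    compare_with_reported()
```

This should print the same digests as running md5sum on the two files, so it is only useful where md5sum is not available.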
I am using package versions:
torch==1.9.0
torchcontrib==0.0.2
transformers==2.7.0
Can you advise on what to do or provide a link to where you are downloading the pretrained models?
The warning for roberta-large is as follows (all 24 layers are present in this warning):
07/28/2021 19:48:35 - INFO - transformers.modeling_utils - Weights of RobertaForDocRED not initialized from pretrained model: ['roberta.embeddings.ner_emb.weight', 'roberta.embeddings.ent_emb.weight', 'roberta.encoder.layer.0.attention.self.bili.0', 'roberta.encoder.layer.0.attention.self.bili.1', 'roberta.encoder.layer.0.attention.self.bili.2', 'roberta.encoder.layer.0.attention.self.bili.3', 'roberta.encoder.layer.0.attention.self.bili.4', 'roberta.encoder.layer.0.attention.self.abs_bias.0', 'roberta.encoder.layer.0.attention.self.abs_bias.1', 'roberta.encoder.layer.0.attention.self.abs_bias.2', 'roberta.encoder.layer.0.attention.self.abs_bias.3', 'roberta.encoder.layer.0.attention.self.abs_bias.4', 'roberta.encoder.layer.1.attention.self.bili.0', 'roberta.encoder.layer.1.attention.self.bili.1', 'roberta.encoder.layer.1.attention.self.bili.2', 'roberta.encoder.layer.1.attention.self.bili.3', 'roberta.encoder.layer.1.attention.self.bili.4', 'roberta.encoder.layer.1.attention.self.abs_bias.0', 'roberta.encoder.layer.1.attention.self.abs_bias.1', 'roberta.encoder.layer.1.attention.self.abs_bias.2', 'roberta.encoder.layer.1.attention.self.abs_bias.3', 'roberta.encoder.layer.1.attention.self.abs_bias.4', 'roberta.encoder.layer.2.attention.self.bili.0', 'roberta.encoder.layer.2.attention.self.bili.1', 'roberta.encoder.layer.2.attention.self.bili.2', 'roberta.encoder.layer.2.attention.self.bili.3', 'roberta.encoder.layer.2.attention.self.bili.4', 'roberta.encoder.layer.2.attention.self.abs_bias.0', 'roberta.encoder.layer.2.attention.self.abs_bias.1', 'roberta.encoder.layer.2.attention.self.abs_bias.2', 'roberta.encoder.layer.2.attention.self.abs_bias.3', 'roberta.encoder.layer.2.attention.self.abs_bias.4', 'roberta.encoder.layer.3.attention.self.bili.0', 'roberta.encoder.layer.3.attention.self.bili.1', 'roberta.encoder.layer.3.attention.self.bili.2', 'roberta.encoder.layer.3.attention.self.bili.3', 'roberta.encoder.layer.3.attention.self.bili.4', 
'roberta.encoder.layer.3.attention.self.abs_bias.0', 'roberta.encoder.layer.3.attention.self.abs_bias.1', 'roberta.encoder.layer.3.attention.self.abs_bias.2', 'roberta.encoder.layer.3.attention.self.abs_bias.3', 'roberta.encoder.layer.3.attention.self.abs_bias.4', 'roberta.encoder.layer.4.attention.self.bili.0', 'roberta.encoder.layer.4.attention.self.bili.1', 'roberta.encoder.layer.4.attention.self.bili.2', 'roberta.encoder.layer.4.attention.self.bili.3', 'roberta.encoder.layer.4.attention.self.bili.4', 'roberta.encoder.layer.4.attention.self.abs_bias.0', 'roberta.encoder.layer.4.attention.self.abs_bias.1', 'roberta.encoder.layer.4.attention.self.abs_bias.2', 'roberta.encoder.layer.4.attention.self.abs_bias.3', 'roberta.encoder.layer.4.attention.self.abs_bias.4', 'roberta.encoder.layer.5.attention.self.bili.0', 'roberta.encoder.layer.5.attention.self.bili.1', 'roberta.encoder.layer.5.attention.self.bili.2', 'roberta.encoder.layer.5.attention.self.bili.3', 'roberta.encoder.layer.5.attention.self.bili.4', 'roberta.encoder.layer.5.attention.self.abs_bias.0', 'roberta.encoder.layer.5.attention.self.abs_bias.1', 'roberta.encoder.layer.5.attention.self.abs_bias.2', 'roberta.encoder.layer.5.attention.self.abs_bias.3', 'roberta.encoder.layer.5.attention.self.abs_bias.4', 'roberta.encoder.layer.6.attention.self.bili.0', 'roberta.encoder.layer.6.attention.self.bili.1', 'roberta.encoder.layer.6.attention.self.bili.2', 'roberta.encoder.layer.6.attention.self.bili.3', 'roberta.encoder.layer.6.attention.self.bili.4', 'roberta.encoder.layer.6.attention.self.abs_bias.0', 'roberta.encoder.layer.6.attention.self.abs_bias.1', 'roberta.encoder.layer.6.attention.self.abs_bias.2', 'roberta.encoder.layer.6.attention.self.abs_bias.3', 'roberta.encoder.layer.6.attention.self.abs_bias.4', 'roberta.encoder.layer.7.attention.self.bili.0', 'roberta.encoder.layer.7.attention.self.bili.1', 'roberta.encoder.layer.7.attention.self.bili.2', 'roberta.encoder.layer.7.attention.self.bili.3', 
'roberta.encoder.layer.7.attention.self.bili.4', 'roberta.encoder.layer.7.attention.self.abs_bias.0', 'roberta.encoder.layer.7.attention.self.abs_bias.1', 'roberta.encoder.layer.7.attention.self.abs_bias.2', 'roberta.encoder.layer.7.attention.self.abs_bias.3', 'roberta.encoder.layer.7.attention.self.abs_bias.4', 'roberta.encoder.layer.8.attention.self.bili.0', 'roberta.encoder.layer.8.attention.self.bili.1', 'roberta.encoder.layer.8.attention.self.bili.2', 'roberta.encoder.layer.8.attention.self.bili.3', 'roberta.encoder.layer.8.attention.self.bili.4', 'roberta.encoder.layer.8.attention.self.abs_bias.0', 'roberta.encoder.layer.8.attention.self.abs_bias.1', 'roberta.encoder.layer.8.attention.self.abs_bias.2', 'roberta.encoder.layer.8.attention.self.abs_bias.3', 'roberta.encoder.layer.8.attention.self.abs_bias.4', 'roberta.encoder.layer.9.attention.self.bili.0', 'roberta.encoder.layer.9.attention.self.bili.1', 'roberta.encoder.layer.9.attention.self.bili.2', 'roberta.encoder.layer.9.attention.self.bili.3', 'roberta.encoder.layer.9.attention.self.bili.4', 'roberta.encoder.layer.9.attention.self.abs_bias.0', 'roberta.encoder.layer.9.attention.self.abs_bias.1', 'roberta.encoder.layer.9.attention.self.abs_bias.2', 'roberta.encoder.layer.9.attention.self.abs_bias.3', 'roberta.encoder.layer.9.attention.self.abs_bias.4', 'roberta.encoder.layer.10.attention.self.bili.0', 'roberta.encoder.layer.10.attention.self.bili.1', 'roberta.encoder.layer.10.attention.self.bili.2', 'roberta.encoder.layer.10.attention.self.bili.3', 'roberta.encoder.layer.10.attention.self.bili.4', 'roberta.encoder.layer.10.attention.self.abs_bias.0', 'roberta.encoder.layer.10.attention.self.abs_bias.1', 'roberta.encoder.layer.10.attention.self.abs_bias.2', 'roberta.encoder.layer.10.attention.self.abs_bias.3', 'roberta.encoder.layer.10.attention.self.abs_bias.4', 'roberta.encoder.layer.11.attention.self.bili.0', 'roberta.encoder.layer.11.attention.self.bili.1', 
'roberta.encoder.layer.11.attention.self.bili.2', 'roberta.encoder.layer.11.attention.self.bili.3', 'roberta.encoder.layer.11.attention.self.bili.4', 'roberta.encoder.layer.11.attention.self.abs_bias.0', 'roberta.encoder.layer.11.attention.self.abs_bias.1', 'roberta.encoder.layer.11.attention.self.abs_bias.2', 'roberta.encoder.layer.11.attention.self.abs_bias.3', 'roberta.encoder.layer.11.attention.self.abs_bias.4', 'roberta.encoder.layer.12.attention.self.bili.0', 'roberta.encoder.layer.12.attention.self.bili.1', 'roberta.encoder.layer.12.attention.self.bili.2', 'roberta.encoder.layer.12.attention.self.bili.3', 'roberta.encoder.layer.12.attention.self.bili.4', 'roberta.encoder.layer.12.attention.self.abs_bias.0', 'roberta.encoder.layer.12.attention.self.abs_bias.1', 'roberta.encoder.layer.12.attention.self.abs_bias.2', 'roberta.encoder.layer.12.attention.self.abs_bias.3', 'roberta.encoder.layer.12.attention.self.abs_bias.4', 'roberta.encoder.layer.13.attention.self.bili.0', 'roberta.encoder.layer.13.attention.self.bili.1', 'roberta.encoder.layer.13.attention.self.bili.2', 'roberta.encoder.layer.13.attention.self.bili.3', 'roberta.encoder.layer.13.attention.self.bili.4', 'roberta.encoder.layer.13.attention.self.abs_bias.0', 'roberta.encoder.layer.13.attention.self.abs_bias.1', 'roberta.encoder.layer.13.attention.self.abs_bias.2', 'roberta.encoder.layer.13.attention.self.abs_bias.3', 'roberta.encoder.layer.13.attention.self.abs_bias.4', 'roberta.encoder.layer.14.attention.self.bili.0', 'roberta.encoder.layer.14.attention.self.bili.1', 'roberta.encoder.layer.14.attention.self.bili.2', 'roberta.encoder.layer.14.attention.self.bili.3', 'roberta.encoder.layer.14.attention.self.bili.4', 'roberta.encoder.layer.14.attention.self.abs_bias.0', 'roberta.encoder.layer.14.attention.self.abs_bias.1', 'roberta.encoder.layer.14.attention.self.abs_bias.2', 'roberta.encoder.layer.14.attention.self.abs_bias.3', 'roberta.encoder.layer.14.attention.self.abs_bias.4', 
'roberta.encoder.layer.15.attention.self.bili.0', 'roberta.encoder.layer.15.attention.self.bili.1', 'roberta.encoder.layer.15.attention.self.bili.2', 'roberta.encoder.layer.15.attention.self.bili.3', 'roberta.encoder.layer.15.attention.self.bili.4', 'roberta.encoder.layer.15.attention.self.abs_bias.0', 'roberta.encoder.layer.15.attention.self.abs_bias.1', 'roberta.encoder.layer.15.attention.self.abs_bias.2', 'roberta.encoder.layer.15.attention.self.abs_bias.3', 'roberta.encoder.layer.15.attention.self.abs_bias.4', 'roberta.encoder.layer.16.attention.self.bili.0', 'roberta.encoder.layer.16.attention.self.bili.1', 'roberta.encoder.layer.16.attention.self.bili.2', 'roberta.encoder.layer.16.attention.self.bili.3', 'roberta.encoder.layer.16.attention.self.bili.4', 'roberta.encoder.layer.16.attention.self.abs_bias.0', 'roberta.encoder.layer.16.attention.self.abs_bias.1', 'roberta.encoder.layer.16.attention.self.abs_bias.2', 'roberta.encoder.layer.16.attention.self.abs_bias.3', 'roberta.encoder.layer.16.attention.self.abs_bias.4', 'roberta.encoder.layer.17.attention.self.bili.0', 'roberta.encoder.layer.17.attention.self.bili.1', 'roberta.encoder.layer.17.attention.self.bili.2', 'roberta.encoder.layer.17.attention.self.bili.3', 'roberta.encoder.layer.17.attention.self.bili.4', 'roberta.encoder.layer.17.attention.self.abs_bias.0', 'roberta.encoder.layer.17.attention.self.abs_bias.1', 'roberta.encoder.layer.17.attention.self.abs_bias.2', 'roberta.encoder.layer.17.attention.self.abs_bias.3', 'roberta.encoder.layer.17.attention.self.abs_bias.4', 'roberta.encoder.layer.18.attention.self.bili.0', 'roberta.encoder.layer.18.attention.self.bili.1', 'roberta.encoder.layer.18.attention.self.bili.2', 'roberta.encoder.layer.18.attention.self.bili.3', 'roberta.encoder.layer.18.attention.self.bili.4', 'roberta.encoder.layer.18.attention.self.abs_bias.0', 'roberta.encoder.layer.18.attention.self.abs_bias.1', 'roberta.encoder.layer.18.attention.self.abs_bias.2', 
'roberta.encoder.layer.18.attention.self.abs_bias.3', 'roberta.encoder.layer.18.attention.self.abs_bias.4', 'roberta.encoder.layer.19.attention.self.bili.0', 'roberta.encoder.layer.19.attention.self.bili.1', 'roberta.encoder.layer.19.attention.self.bili.2', 'roberta.encoder.layer.19.attention.self.bili.3', 'roberta.encoder.layer.19.attention.self.bili.4', 'roberta.encoder.layer.19.attention.self.abs_bias.0', 'roberta.encoder.layer.19.attention.self.abs_bias.1', 'roberta.encoder.layer.19.attention.self.abs_bias.2', 'roberta.encoder.layer.19.attention.self.abs_bias.3', 'roberta.encoder.layer.19.attention.self.abs_bias.4', 'roberta.encoder.layer.20.attention.self.bili.0', 'roberta.encoder.layer.20.attention.self.bili.1', 'roberta.encoder.layer.20.attention.self.bili.2', 'roberta.encoder.layer.20.attention.self.bili.3', 'roberta.encoder.layer.20.attention.self.bili.4', 'roberta.encoder.layer.20.attention.self.abs_bias.0', 'roberta.encoder.layer.20.attention.self.abs_bias.1', 'roberta.encoder.layer.20.attention.self.abs_bias.2', 'roberta.encoder.layer.20.attention.self.abs_bias.3', 'roberta.encoder.layer.20.attention.self.abs_bias.4', 'roberta.encoder.layer.21.attention.self.bili.0', 'roberta.encoder.layer.21.attention.self.bili.1', 'roberta.encoder.layer.21.attention.self.bili.2', 'roberta.encoder.layer.21.attention.self.bili.3', 'roberta.encoder.layer.21.attention.self.bili.4', 'roberta.encoder.layer.21.attention.self.abs_bias.0', 'roberta.encoder.layer.21.attention.self.abs_bias.1', 'roberta.encoder.layer.21.attention.self.abs_bias.2', 'roberta.encoder.layer.21.attention.self.abs_bias.3', 'roberta.encoder.layer.21.attention.self.abs_bias.4', 'roberta.encoder.layer.22.attention.self.bili.0', 'roberta.encoder.layer.22.attention.self.bili.1', 'roberta.encoder.layer.22.attention.self.bili.2', 'roberta.encoder.layer.22.attention.self.bili.3', 'roberta.encoder.layer.22.attention.self.bili.4', 'roberta.encoder.layer.22.attention.self.abs_bias.0', 
'roberta.encoder.layer.22.attention.self.abs_bias.1', 'roberta.encoder.layer.22.attention.self.abs_bias.2', 'roberta.encoder.layer.22.attention.self.abs_bias.3', 'roberta.encoder.layer.22.attention.self.abs_bias.4', 'roberta.encoder.layer.23.attention.self.bili.0', 'roberta.encoder.layer.23.attention.self.bili.1', 'roberta.encoder.layer.23.attention.self.bili.2', 'roberta.encoder.layer.23.attention.self.bili.3', 'roberta.encoder.layer.23.attention.self.bili.4', 'roberta.encoder.layer.23.attention.self.abs_bias.0', 'roberta.encoder.layer.23.attention.self.abs_bias.1', 'roberta.encoder.layer.23.attention.self.abs_bias.2', 'roberta.encoder.layer.23.attention.self.abs_bias.3', 'roberta.encoder.layer.23.attention.self.abs_bias.4', 'dim_reduction.weight', 'dim_reduction.bias', 'distance_emb.weight', 'bili.weight', 'bili.bias']
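
Because the list above is long, here is a small sketch that condenses it into one entry per parameter kind plus the set of layer indices mentioned. It is plain Python and does not use transformers; summarize_uninitialized is a hypothetical helper name, and the sample below is a handful of names copied from the warning rather than the full list.

```python
import re
from collections import Counter

# Matches the per-layer prefix used in the warning, e.g. "roberta.encoder.layer.7."
LAYER_PREFIX = re.compile(r"^roberta\.encoder\.layer\.(\d+)\.")


def summarize_uninitialized(names):
    """Group parameter names from the warning by kind.

    Returns (kinds, layers): a Counter of parameter kinds with the layer index
    and any trailing numeric index stripped, and a sorted list of layer indices.
    """
    kinds = Counter()
    layers = set()
    for name in names:
        match = LAYER_PREFIX.match(name)
        if match is None:
            # Not a per-layer parameter: keep the full name as its own kind.
            kinds[name] += 1
            continue
        layers.add(int(match.group(1)))
        rest = name[match.end():]  # e.g. "attention.self.bili.0"
        head, _, tail = rest.rpartition(".")
        kind = head if tail.isdigit() else rest
        kinds["layer.*." + kind] += 1
    return kinds, sorted(layers)


if __name__ == "__main__":
    # A few names copied from the warning above (not the complete list).
    sample = [
        "roberta.embeddings.ner_emb.weight",
        "roberta.encoder.layer.0.attention.self.bili.0",
        "roberta.encoder.layer.0.attention.self.abs_bias.4",
        "roberta.encoder.layer.23.attention.self.bili.2",
        "bili.weight",
    ]
    kinds, layers = summarize_uninitialized(sample)
    for kind, count in sorted(kinds.items()):
        print(f"{count:4d}  {kind}")
    print("layers mentioned:", layers)
```

Pasting the full list from the log into sample gives a short summary of which kinds of parameters the warning names, which is easier to include in a reply than the raw message.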