Hello, and thanks for open-sourcing this!
I am using mmseg and loaded weights from an image classification checkpoint. It warns:
WARNING - The model and loaded state dict do not match exactly
missing keys in source state_dict: backbone.head.weight, backbone.head.bias
unexpected key in source state_dict: cls_token, ln1.bias, ln1.weight, layers.0.ln1.bias, layers.0.ln1.weight, layers.0.ln2.bias, layers.0.ln2.weight, layers.0.ffn.layers.0.0.bias, layers.0.ffn.layers.0.0.weight, layers.0.ffn.layers.1.bias, layers.0.ffn.layers.1.weight, layers.0.attn.attn.out_proj.bias, layers.0.attn.attn.out_proj.weight, layers.0.attn.attn.in_proj_bias, layers.0.attn.attn.in_proj_weight, layers.1.ln1.bias, layers.1.ln1.weight, layers.1.ln2.bias, layers.1.ln2.weight, layers.1.ffn.layers.0.0.bias, layers.1.ffn.layers.0.0.weight, layers.1.ffn.layers.1.bias, layers.1.ffn.layers.1.weight, layers.1.attn.attn.out_proj.bias, layers.1.attn.attn.out_proj.weight ......
The experimental results are terrible, comparable to experiments with randomly initialized weights.
So I loaded the weights from the ADE20K checkpoint instead. It works, but still warns:
WARNING - The model and loaded state dict do not match exactly
missing keys in source state_dict: backbone.head.weight, backbone.head.bias
And the result is similar to the one you report.
Which weights should I load: ImageNet1K or ADE20K?
Or should I rename the keys in the ImageNet1K weights to match the keys the segmentation model expects?
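
For the last option, this is roughly what I have in mind. It is only a minimal sketch, assuming the mismatch is nothing more than a missing `backbone.` prefix, which is my guess from the warning and not something I have confirmed. The helpers `add_prefix` and `diff_keys` are my own hypothetical names, not mmseg functions, and the key lists are toy stand-ins. In a real run the model keys would come from `model.state_dict().keys()` and the checkpoint dict from `torch.load`.

```python
from collections import OrderedDict


def add_prefix(state_dict, prefix):
    """Return a copy of state_dict with `prefix` prepended to every key."""
    return OrderedDict((prefix + key, value) for key, value in state_dict.items())


def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) key lists, in the same sense as the warning."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # in the model, not in the checkpoint
    unexpected = sorted(ckpt_keys - model_keys)   # in the checkpoint, not in the model
    return missing, unexpected


# Toy stand-ins (hypothetical): a few keys taken from the warning above.
model_keys = [
    "backbone.cls_token",
    "backbone.ln1.weight",
    "backbone.ln1.bias",
    "backbone.head.weight",
    "backbone.head.bias",
]
ckpt = OrderedDict([("cls_token", 0), ("ln1.weight", 1), ("ln1.bias", 2)])

# Assumption: the classification keys only lack the "backbone." prefix.
remapped = add_prefix(ckpt, "backbone.")
missing, unexpected = diff_keys(model_keys, remapped)
print("missing:", missing)
print("unexpected:", unexpected)
```

If the real key names differ by more than a prefix, `diff_keys` should still show which ones fail to line up, and the renaming would need a proper mapping instead.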