Hi Yurui,
When I try to load the checkpoint for the 256x176 model with the provided configuration, I get size mismatch errors.
I then noticed that the 256x176 checkpoint has the same file name as the 512x352 checkpoint, so as a sanity check I loaded the 256x176 checkpoint with the 512x352 configuration, and it worked. I suspect the download link for the 256x176 checkpoint actually points to the 512x352 checkpoint.
Could you please take a look at this?
Thank you again!
Aiyu
The error log is attached below.
Error(s) in loading state_dict for Generator:
Unexpected key(s) in state_dict: "reference_encoder.convs.4.blur.kernel", "reference_encoder.convs.4.conv.weight", "reference_encoder.convs.4.activate.bias", "reference_encoder.convs.4.extraction_operations.0.value_conv.weight", "reference_encoder.convs.4.extraction_operations.0.value_conv.bias", "reference_encoder.convs.4.extraction_operations.0.semantic_extraction_filter.weight", "reference_encoder.convs.4.extraction_operations.1.value_conv.weight", "reference_encoder.convs.4.extraction_operations.1.value_conv.bias", "reference_encoder.convs.4.extraction_operations.1.semantic_extraction_filter.weight", "skeleton_encoder.convs.4.blur.kernel", "skeleton_encoder.convs.4.conv.weight", "skeleton_encoder.convs.4.activate.bias", "target_image_renderer.convs.5.conv0.conv.weight", "target_image_renderer.convs.5.conv0.blur.kernel", "target_image_renderer.convs.5.conv0.activate.bias", "target_image_renderer.convs.5.conv1.conv.weight", "target_image_renderer.convs.5.conv1.activate.bias", "target_image_renderer.convs.5.to_rgb.upsample.kernel", "target_image_renderer.convs.5.to_rgb.conv.weight", "target_image_renderer.convs.5.to_rgb.conv.bias", "target_image_renderer.convs.4.conv0.distribution_operation.semantic_distribution_filter.weight", "target_image_renderer.convs.4.conv0.distribution_operation.semantic_distribution_filter.bias", "target_image_renderer.convs.4.conv1.distribution_operation.semantic_distribution_filter.weight", "target_image_renderer.convs.4.conv1.distribution_operation.semantic_distribution_filter.bias".
size mismatch for reference_encoder.first.conv.weight: copying a param with shape torch.Size([64, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 3, 1, 1]).
size mismatch for reference_encoder.first.activate.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for reference_encoder.convs.0.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
size mismatch for reference_encoder.convs.0.activate.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for reference_encoder.convs.0.extraction_operations.0.value_conv.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
size mismatch for reference_encoder.convs.0.extraction_operations.0.value_conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for reference_encoder.convs.0.extraction_operations.0.semantic_extraction_filter.weight: copying a param with shape torch.Size([64, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 3, 3]).
size mismatch for reference_encoder.convs.0.extraction_operations.1.value_conv.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
size mismatch for reference_encoder.convs.0.extraction_operations.1.value_conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for reference_encoder.convs.0.extraction_operations.1.semantic_extraction_filter.weight: copying a param with shape torch.Size([64, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 3, 3]).
size mismatch for reference_encoder.convs.1.conv.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 256, 3, 3]).
size mismatch for reference_encoder.convs.1.activate.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for reference_encoder.convs.1.extraction_operations.0.value_conv.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).
size mismatch for reference_encoder.convs.1.extraction_operations.0.value_conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for reference_encoder.convs.1.extraction_operations.0.semantic_extraction_filter.weight: copying a param with shape torch.Size([64, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 512, 3, 3]).
size mismatch for reference_encoder.convs.1.extraction_operations.1.value_conv.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).
size mismatch for reference_encoder.convs.1.extraction_operations.1.value_conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for reference_encoder.convs.1.extraction_operations.1.semantic_extraction_filter.weight: copying a param with shape torch.Size([64, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 512, 3, 3]).
size mismatch for reference_encoder.convs.2.conv.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).
size mismatch for reference_encoder.convs.3.extraction_operations.0.value_conv.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).
size mismatch for reference_encoder.convs.3.extraction_operations.0.semantic_extraction_filter.weight: copying a param with shape torch.Size([32, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 512, 1, 1]).
size mismatch for reference_encoder.convs.3.extraction_operations.1.value_conv.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).
size mismatch for reference_encoder.convs.3.extraction_operations.1.semantic_extraction_filter.weight: copying a param with shape torch.Size([32, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 512, 1, 1]).
size mismatch for skeleton_encoder.first.conv.weight: copying a param with shape torch.Size([64, 20, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 20, 1, 1]).
size mismatch for skeleton_encoder.first.activate.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for skeleton_encoder.convs.0.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]).
size mismatch for skeleton_encoder.convs.0.activate.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for skeleton_encoder.convs.1.conv.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 256, 3, 3]).
size mismatch for skeleton_encoder.convs.1.activate.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for skeleton_encoder.convs.2.conv.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]).
size mismatch for target_image_renderer.convs.2.conv0.distribution_operation.semantic_distribution_filter.weight: copying a param with shape torch.Size([32, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 512, 3, 3]).
size mismatch for target_image_renderer.convs.2.conv0.distribution_operation.semantic_distribution_filter.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for target_image_renderer.convs.2.conv1.distribution_operation.semantic_distribution_filter.weight: copying a param with shape torch.Size([32, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 512, 3, 3]).
size mismatch for target_image_renderer.convs.2.conv1.distribution_operation.semantic_distribution_filter.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]).
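
In case it helps with debugging, here is a minimal sketch of how the mismatch could be checked without building the full model. It compares two plain `{parameter_name: shape}` mappings and reports the same three categories that `load_state_dict` complains about. The helper name `diff_state_dicts` is my own, not part of the repository. The two example shapes for `reference_encoder.first.*` are copied from the log above; the shape given for the unexpected key is a placeholder, because the log does not print shapes for unexpected keys.

```python
def diff_state_dicts(ckpt_shapes, model_shapes):
    """Compare two {param_name: shape} mappings.

    Returns (unexpected, missing, mismatched):
      unexpected - keys only in the checkpoint
      missing    - keys only in the model
      mismatched - {key: (checkpoint_shape, model_shape)} for differing shapes
    """
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    mismatched = {
        k: (tuple(ckpt_shapes[k]), tuple(model_shapes[k]))
        for k in ckpt_shapes
        if k in model_shapes and tuple(ckpt_shapes[k]) != tuple(model_shapes[k])
    }
    return unexpected, missing, mismatched


# Shapes of the downloaded checkpoint (first two taken from the error log).
ckpt_shapes = {
    "reference_encoder.first.conv.weight": (64, 3, 1, 1),
    "reference_encoder.first.activate.bias": (64,),
    # Placeholder shape: the log lists this key as unexpected but prints no shape.
    "reference_encoder.convs.4.activate.bias": (512,),
}

# Shapes of the model built from the 256x176 configuration (from the error log).
model_shapes = {
    "reference_encoder.first.conv.weight": (128, 3, 1, 1),
    "reference_encoder.first.activate.bias": (128,),
}

unexpected, missing, mismatched = diff_state_dicts(ckpt_shapes, model_shapes)
print("unexpected:", unexpected)
print("missing:", missing)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With real tensors, the mappings could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, taking one `state_dict` from `torch.load(path, map_location="cpu")` and the other from `model.state_dict()`. I am assuming here that the generator weights can be reached as a flat dictionary; if the checkpoint file nests them under another key, that key would have to be unpacked first.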