Overview

guided-diffusion

This is the codebase for Diffusion Models Beat GANs on Image Synthesis.

This repository is based on openai/improved-diffusion, with modifications for classifier conditioning and architecture improvements.

Usage

Training diffusion models is described in the parent repository. Training a classifier is similar. We assume you have put training hyperparameters into a TRAIN_FLAGS variable, and classifier hyperparameters into a CLASSIFIER_FLAGS variable. Then you can run:

mpiexec -n N python scripts/classifier_train.py --data_dir path/to/imagenet $TRAIN_FLAGS $CLASSIFIER_FLAGS

Make sure to divide the batch size in TRAIN_FLAGS by the number of MPI processes you are using, since --batch_size is per process. For example, with --batch_size 256 and mpiexec -n 8, pass --batch_size 32 to keep the effective global batch size at 256.

Here are flags for training the 128x128 classifier. You can modify these for training classifiers at other resolutions:

TRAIN_FLAGS="--iterations 300000 --anneal_lr True --batch_size 256 --lr 3e-4 --save_interval 10000 --weight_decay 0.05"
CLASSIFIER_FLAGS="--image_size 128 --classifier_attention_resolutions 32,16,8 --classifier_depth 2 --classifier_width 128 --classifier_pool attention --classifier_resblock_updown True --classifier_use_scale_shift_norm True"

Here are flags for sampling from a 128x128 classifier-guided model with 25-step DDIM:

MODEL_FLAGS="--attention_resolutions 32,16,8 --class_cond True --image_size 128 --learn_sigma True --num_channels 256 --num_heads 4 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm True"
CLASSIFIER_FLAGS="--image_size 128 --classifier_attention_resolutions 32,16,8 --classifier_depth 2 --classifier_width 128 --classifier_pool attention --classifier_resblock_updown True --classifier_use_scale_shift_norm True --classifier_scale 1.0 --classifier_use_fp16 True"
SAMPLE_FLAGS="--batch_size 4 --num_samples 50000 --timestep_respacing ddim25 --use_ddim True"
mpiexec -n N python scripts/classifier_sample.py \
    --model_path /path/to/model.pt \
    --classifier_path path/to/classifier.pt \
    $MODEL_FLAGS $CLASSIFIER_FLAGS $SAMPLE_FLAGS
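
The --classifier_scale flag sets how strongly the classifier steers sampling. As a rough illustration of the mechanism (a minimal sketch loosely paraphrasing the conditioning function in scripts/classifier_sample.py, with names simplified, not a drop-in replacement): each denoising step is shifted by the gradient of the classifier's log-probability of the target class, scaled by classifier_scale.

import torch
import torch.nn.functional as F

def cond_fn(x, t, y, classifier, classifier_scale=1.0):
    # Gradient of log p(y | x_t) with respect to the noisy input x_t.
    with torch.enable_grad():
        x_in = x.detach().requires_grad_(True)
        logits = classifier(x_in, t)
        log_probs = F.log_softmax(logits, dim=-1)
        selected = log_probs[range(len(logits)), y.view(-1)]
        return torch.autograd.grad(selected.sum(), x_in)[0] * classifier_scale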

To sample for 250 timesteps without DDIM, replace --timestep_respacing ddim25 with --timestep_respacing 250, and replace --use_ddim True with --use_ddim False.
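
Under the hood, the respacing string selects a subset of the original diffusion steps. A minimal sketch, assuming the guided_diffusion package layout (respace.py provides space_timesteps):

from guided_diffusion.respace import space_timesteps

# "ddim25" picks 25 evenly strided steps out of 1000 for DDIM sampling,
# while "250" rescales the full 1000-step schedule down to 250 steps.
print(sorted(space_timesteps(1000, "ddim25")))
print(len(space_timesteps(1000, "250")))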

Comments
  • Resume training does not work for multi-GPU training

    I added --resume_checkpoint $path_to_checkpoint$ to continue the training. It works on a single GPU, but hangs with multiple GPUs (see the sketch after this issue).

    The code gets stuck here:

    Logging to /proj/ihorse_2021/users/x_manni/guided-diffusion/log9
    creating model and diffusion...
    creating data loader...
    start training...
    loading model from checkpoint: /proj/ihorse_2021/users/x_manni/guided-diffusion/log9/model200000.pt...

    opened by forever208 · 6 comments
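
    A plausible thing to check, assuming the hang is a stalled collective: when resuming, every rank must reach the same broadcast while the checkpoint loads. A minimal sketch of that pattern (a hypothetical helper, loosely modeled on guided_diffusion/dist_util.py, not the repository's actual fix):

    import io

    import torch
    from mpi4py import MPI

    def load_checkpoint_all_ranks(path):
        # Rank 0 reads the file; every rank receives the bytes via broadcast.
        comm = MPI.COMM_WORLD
        data = None
        if comm.Get_rank() == 0:
            with open(path, "rb") as f:
                data = f.read()
        # If any rank skips this collective call, the others hang right here.
        data = comm.bcast(data)
        return torch.load(io.BytesIO(data), map_location="cpu")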
  • Training across multiple nodes does not work

    If the number of GPUs is greater than 8 (each node has 8 GPUs), then I have to train across several nodes.

    In this case, running mpiexec -n 16 python scripts/image_train.py doesn't work.

    It fails with an NCCL error (see the sketch after this issue).

    opened by forever208 · 4 comments
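
    When a multi-node NCCL run fails, a common first step is to confirm NCCL is available in the PyTorch build and turn on its logging. A minimal sketch, using standard NCCL environment variables (the interface name is a placeholder for your cluster's):

    import os

    import torch.distributed as dist

    # NCCL must be compiled into this PyTorch build for multi-node GPU training.
    print("NCCL available:", dist.is_nccl_available())

    # Standard NCCL debugging knobs: verbose setup logs, and pinning the
    # network interface NCCL uses for cross-node traffic.
    os.environ["NCCL_DEBUG"] = "INFO"
    os.environ["NCCL_SOCKET_IFNAME"] = "eth0"  # hypothetical interface name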
  • Figures Appendix C of the paper

    Hello,

    Congrats on the great work and thanks for sharing the code!

    I wonder how you generated Figure 8 of Appendix C in your paper?

    How are the images so similar, even when the scale is so reduced? Is there any conditioning beyond classifier guidance?

    Many thanks!

    opened by SANCHES-Pedro · 4 comments
  • What is the 'zero_module' used for? [Question]

    First of all, thanks for an extraordinary paper - so many interesting details!! Also, thanks for open sourcing the code.

    I have a few ideas I want to test and I'm trying to understand all the parts of the code. Most of it is clear and well commented, but I can't seem to figure out the reasoning behind the 'zero_module' you have in a few places in the guided-diffusion/guided_diffusion/unet.py file?

    def zero_module(module):
        """
        Zero out the parameters of a module and return it.
        """
        for p in module.parameters():
        p.detach().zero_()  # zero the weights in place, outside autograd tracking
        return module
    

    I couldn't find anything in the paper or online to explain why this is used (see the illustration after this issue).

    I'm also curious why you used a custom mixed-precision training implementation instead of PyTorch's native one (torch.cuda.amp.autocast)?

    opened by santodante · 3 comments
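
    One hedged reading, not an authoritative answer: zeroing the last layer of each residual branch makes every residual block an identity map at initialization, a common trick for stabilizing very deep networks. A minimal, self-contained illustration:

    import torch
    import torch.nn as nn

    def zero_module(module):
        for p in module.parameters():
            p.detach().zero_()
        return module

    # A residual branch whose final conv is zero-initialized outputs exactly
    # zero, so x + branch(x) starts out as the identity map.
    branch = nn.Sequential(
        nn.Conv2d(8, 8, 3, padding=1),
        nn.SiLU(),
        zero_module(nn.Conv2d(8, 8, 3, padding=1)),
    )
    x = torch.randn(1, 8, 16, 16)
    assert torch.equal(x + branch(x), x)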
  • Error when loading pretrained weights

    Thank you for releasing pretrained weights. I tried to use some of your pretrained weights as described in the README, but there is a mismatch between the checkpoint weights and the model (see the diagnostic sketch after this issue).

    Logging to /tmp/openai-2021-07-22-14-28-52-986510 creating model and diffusion... Traceback (most recent call last): File "scripts/image_sample.py", line 108, in main() File "scripts/image_sample.py", line 33, in main model.load_state_dict( File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for UNetModel: Missing key(s) in state_dict: "input_blocks.3.0.op.weight", "input_blocks.3.0.op.bias", "input_blocks.4.0.skip_connection.weight", "input_blocks.4.0.skip_connection.bias", "input_blocks.6.0.op.weight", "input_blocks.6.0.op.bias", "input_blocks.7.1.norm.weight", "input_blocks.7.1.norm.bias", "input_blocks.7.1.qkv.weight", "input_blocks.7.1.qkv.bias", "input_blocks.7.1.proj_out.weight", "input_blocks.7.1.proj_out.bias", "input_blocks.8.1.norm.weight", "input_blocks.8.1.norm.bias", "input_blocks.8.1.qkv.weight", "input_blocks.8.1.qkv.bias", "input_blocks.8.1.proj_out.weight", "input_blocks.8.1.proj_out.bias", "input_blocks.9.0.op.weight", "input_blocks.9.0.op.bias", "input_blocks.10.0.skip_connection.weight", "input_blocks.10.0.skip_connection.bias", "output_blocks.2.2.conv.weight", "output_blocks.2.2.conv.bias", "output_blocks.5.2.conv.weight", "output_blocks.5.2.conv.bias", "output_blocks.8.1.conv.weight", "output_blocks.8.1.conv.bias". Unexpected key(s) in state_dict: "input_blocks.12.0.in_layers.0.weight", "input_blocks.12.0.in_layers.0.bias", "input_blocks.12.0.in_layers.2.weight", "input_blocks.12.0.in_layers.2.bias", "input_blocks.12.0.emb_layers.1.weight", "input_blocks.12.0.emb_layers.1.bias", "input_blocks.12.0.out_layers.0.weight", "input_blocks.12.0.out_layers.0.bias", "input_blocks.12.0.out_layers.3.weight", "input_blocks.12.0.out_layers.3.bias", "input_blocks.13.0.in_layers.0.weight", "input_blocks.13.0.in_layers.0.bias", "input_blocks.13.0.in_layers.2.weight", "input_blocks.13.0.in_layers.2.bias", "input_blocks.13.0.emb_layers.1.weight", "input_blocks.13.0.emb_layers.1.bias", "input_blocks.13.0.out_layers.0.weight", "input_blocks.13.0.out_layers.0.bias", "input_blocks.13.0.out_layers.3.weight", "input_blocks.13.0.out_layers.3.bias", "input_blocks.13.0.skip_connection.weight", "input_blocks.13.0.skip_connection.bias", "input_blocks.13.1.norm.weight", "input_blocks.13.1.norm.bias", "input_blocks.13.1.qkv.weight", "input_blocks.13.1.qkv.bias", "input_blocks.13.1.proj_out.weight", "input_blocks.13.1.proj_out.bias", "input_blocks.14.0.in_layers.0.weight", "input_blocks.14.0.in_layers.0.bias", "input_blocks.14.0.in_layers.2.weight", "input_blocks.14.0.in_layers.2.bias", "input_blocks.14.0.emb_layers.1.weight", "input_blocks.14.0.emb_layers.1.bias", "input_blocks.14.0.out_layers.0.weight", "input_blocks.14.0.out_layers.0.bias", "input_blocks.14.0.out_layers.3.weight", "input_blocks.14.0.out_layers.3.bias", "input_blocks.14.1.norm.weight", "input_blocks.14.1.norm.bias", "input_blocks.14.1.qkv.weight", "input_blocks.14.1.qkv.bias", "input_blocks.14.1.proj_out.weight", "input_blocks.14.1.proj_out.bias", "input_blocks.15.0.in_layers.0.weight", "input_blocks.15.0.in_layers.0.bias", "input_blocks.15.0.in_layers.2.weight", "input_blocks.15.0.in_layers.2.bias", "input_blocks.15.0.emb_layers.1.weight", "input_blocks.15.0.emb_layers.1.bias", "input_blocks.15.0.out_layers.0.weight", "input_blocks.15.0.out_layers.0.bias", "input_blocks.15.0.out_layers.3.weight", "input_blocks.15.0.out_layers.3.bias", 
"input_blocks.16.0.in_layers.0.weight", "input_blocks.16.0.in_layers.0.bias", "input_blocks.16.0.in_layers.2.weight", "input_blocks.16.0.in_layers.2.bias", "input_blocks.16.0.emb_layers.1.weight", "input_blocks.16.0.emb_layers.1.bias", "input_blocks.16.0.out_layers.0.weight", "input_blocks.16.0.out_layers.0.bias", "input_blocks.16.0.out_layers.3.weight", "input_blocks.16.0.out_layers.3.bias", "input_blocks.16.1.norm.weight", "input_blocks.16.1.norm.bias", "input_blocks.16.1.qkv.weight", "input_blocks.16.1.qkv.bias", "input_blocks.16.1.proj_out.weight", "input_blocks.16.1.proj_out.bias", "input_blocks.17.0.in_layers.0.weight", "input_blocks.17.0.in_layers.0.bias", "input_blocks.17.0.in_layers.2.weight", "input_blocks.17.0.in_layers.2.bias", "input_blocks.17.0.emb_layers.1.weight", "input_blocks.17.0.emb_layers.1.bias", "input_blocks.17.0.out_layers.0.weight", "input_blocks.17.0.out_layers.0.bias", "input_blocks.17.0.out_layers.3.weight", "input_blocks.17.0.out_layers.3.bias", "input_blocks.17.1.norm.weight", "input_blocks.17.1.norm.bias", "input_blocks.17.1.qkv.weight", "input_blocks.17.1.qkv.bias", "input_blocks.17.1.proj_out.weight", "input_blocks.17.1.proj_out.bias", "input_blocks.3.0.in_layers.0.weight", "input_blocks.3.0.in_layers.0.bias", "input_blocks.3.0.in_layers.2.weight", "input_blocks.3.0.in_layers.2.bias", "input_blocks.3.0.emb_layers.1.weight", "input_blocks.3.0.emb_layers.1.bias", "input_blocks.3.0.out_layers.0.weight", "input_blocks.3.0.out_layers.0.bias", "input_blocks.3.0.out_layers.3.weight", "input_blocks.3.0.out_layers.3.bias", "input_blocks.6.0.in_layers.0.weight", "input_blocks.6.0.in_layers.0.bias", "input_blocks.6.0.in_layers.2.weight", "input_blocks.6.0.in_layers.2.bias", "input_blocks.6.0.emb_layers.1.weight", "input_blocks.6.0.emb_layers.1.bias", "input_blocks.6.0.out_layers.0.weight", "input_blocks.6.0.out_layers.0.bias", "input_blocks.6.0.out_layers.3.weight", "input_blocks.6.0.out_layers.3.bias", "input_blocks.9.0.in_layers.0.weight", "input_blocks.9.0.in_layers.0.bias", "input_blocks.9.0.in_layers.2.weight", "input_blocks.9.0.in_layers.2.bias", "input_blocks.9.0.emb_layers.1.weight", "input_blocks.9.0.emb_layers.1.bias", "input_blocks.9.0.out_layers.0.weight", "input_blocks.9.0.out_layers.0.bias", "input_blocks.9.0.out_layers.3.weight", "input_blocks.9.0.out_layers.3.bias", "output_blocks.12.0.in_layers.0.weight", "output_blocks.12.0.in_layers.0.bias", "output_blocks.12.0.in_layers.2.weight", "output_blocks.12.0.in_layers.2.bias", "output_blocks.12.0.emb_layers.1.weight", "output_blocks.12.0.emb_layers.1.bias", "output_blocks.12.0.out_layers.0.weight", "output_blocks.12.0.out_layers.0.bias", "output_blocks.12.0.out_layers.3.weight", "output_blocks.12.0.out_layers.3.bias", "output_blocks.12.0.skip_connection.weight", "output_blocks.12.0.skip_connection.bias", "output_blocks.13.0.in_layers.0.weight", "output_blocks.13.0.in_layers.0.bias", "output_blocks.13.0.in_layers.2.weight", "output_blocks.13.0.in_layers.2.bias", "output_blocks.13.0.emb_layers.1.weight", "output_blocks.13.0.emb_layers.1.bias", "output_blocks.13.0.out_layers.0.weight", "output_blocks.13.0.out_layers.0.bias", "output_blocks.13.0.out_layers.3.weight", "output_blocks.13.0.out_layers.3.bias", "output_blocks.13.0.skip_connection.weight", "output_blocks.13.0.skip_connection.bias", "output_blocks.14.0.in_layers.0.weight", "output_blocks.14.0.in_layers.0.bias", "output_blocks.14.0.in_layers.2.weight", "output_blocks.14.0.in_layers.2.bias", "output_blocks.14.0.emb_layers.1.weight", 
"output_blocks.14.0.emb_layers.1.bias", "output_blocks.14.0.out_layers.0.weight", "output_blocks.14.0.out_layers.0.bias", "output_blocks.14.0.out_layers.3.weight", "output_blocks.14.0.out_layers.3.bias", "output_blocks.14.0.skip_connection.weight", "output_blocks.14.0.skip_connection.bias", "output_blocks.14.1.in_layers.0.weight", "output_blocks.14.1.in_layers.0.bias", "output_blocks.14.1.in_layers.2.weight", "output_blocks.14.1.in_layers.2.bias", "output_blocks.14.1.emb_layers.1.weight", "output_blocks.14.1.emb_layers.1.bias", "output_blocks.14.1.out_layers.0.weight", "output_blocks.14.1.out_layers.0.bias", "output_blocks.14.1.out_layers.3.weight", "output_blocks.14.1.out_layers.3.bias", "output_blocks.15.0.in_layers.0.weight", "output_blocks.15.0.in_layers.0.bias", "output_blocks.15.0.in_layers.2.weight", "output_blocks.15.0.in_layers.2.bias", "output_blocks.15.0.emb_layers.1.weight", "output_blocks.15.0.emb_layers.1.bias", "output_blocks.15.0.out_layers.0.weight", "output_blocks.15.0.out_layers.0.bias", "output_blocks.15.0.out_layers.3.weight", "output_blocks.15.0.out_layers.3.bias", "output_blocks.15.0.skip_connection.weight", "output_blocks.15.0.skip_connection.bias", "output_blocks.16.0.in_layers.0.weight", "output_blocks.16.0.in_layers.0.bias", "output_blocks.16.0.in_layers.2.weight", "output_blocks.16.0.in_layers.2.bias", "output_blocks.16.0.emb_layers.1.weight", "output_blocks.16.0.emb_layers.1.bias", "output_blocks.16.0.out_layers.0.weight", "output_blocks.16.0.out_layers.0.bias", "output_blocks.16.0.out_layers.3.weight", "output_blocks.16.0.out_layers.3.bias", "output_blocks.16.0.skip_connection.weight", "output_blocks.16.0.skip_connection.bias", "output_blocks.17.0.in_layers.0.weight", "output_blocks.17.0.in_layers.0.bias", "output_blocks.17.0.in_layers.2.weight", "output_blocks.17.0.in_layers.2.bias", "output_blocks.17.0.emb_layers.1.weight", "output_blocks.17.0.emb_layers.1.bias", "output_blocks.17.0.out_layers.0.weight", "output_blocks.17.0.out_layers.0.bias", "output_blocks.17.0.out_layers.3.weight", "output_blocks.17.0.out_layers.3.bias", "output_blocks.17.0.skip_connection.weight", "output_blocks.17.0.skip_connection.bias", "output_blocks.2.2.in_layers.0.weight", "output_blocks.2.2.in_layers.0.bias", "output_blocks.2.2.in_layers.2.weight", "output_blocks.2.2.in_layers.2.bias", "output_blocks.2.2.emb_layers.1.weight", "output_blocks.2.2.emb_layers.1.bias", "output_blocks.2.2.out_layers.0.weight", "output_blocks.2.2.out_layers.0.bias", "output_blocks.2.2.out_layers.3.weight", "output_blocks.2.2.out_layers.3.bias", "output_blocks.5.2.in_layers.0.weight", "output_blocks.5.2.in_layers.0.bias", "output_blocks.5.2.in_layers.2.weight", "output_blocks.5.2.in_layers.2.bias", "output_blocks.5.2.emb_layers.1.weight", "output_blocks.5.2.emb_layers.1.bias", "output_blocks.5.2.out_layers.0.weight", "output_blocks.5.2.out_layers.0.bias", "output_blocks.5.2.out_layers.3.weight", "output_blocks.5.2.out_layers.3.bias", "output_blocks.6.1.norm.weight", "output_blocks.6.1.norm.bias", "output_blocks.6.1.qkv.weight", "output_blocks.6.1.qkv.bias", "output_blocks.6.1.proj_out.weight", "output_blocks.6.1.proj_out.bias", "output_blocks.7.1.norm.weight", "output_blocks.7.1.norm.bias", "output_blocks.7.1.qkv.weight", "output_blocks.7.1.qkv.bias", "output_blocks.7.1.proj_out.weight", "output_blocks.7.1.proj_out.bias", "output_blocks.8.2.in_layers.0.weight", "output_blocks.8.2.in_layers.0.bias", "output_blocks.8.2.in_layers.2.weight", "output_blocks.8.2.in_layers.2.bias", 
"output_blocks.8.2.emb_layers.1.weight", "output_blocks.8.2.emb_layers.1.bias", "output_blocks.8.2.out_layers.0.weight", "output_blocks.8.2.out_layers.0.bias", "output_blocks.8.2.out_layers.3.weight", "output_blocks.8.2.out_layers.3.bias", "output_blocks.8.1.norm.weight", "output_blocks.8.1.norm.bias", "output_blocks.8.1.qkv.weight", "output_blocks.8.1.qkv.bias", "output_blocks.8.1.proj_out.weight", "output_blocks.8.1.proj_out.bias", "output_blocks.11.1.in_layers.0.weight", "output_blocks.11.1.in_layers.0.bias", "output_blocks.11.1.in_layers.2.weight", "output_blocks.11.1.in_layers.2.bias", "output_blocks.11.1.emb_layers.1.weight", "output_blocks.11.1.emb_layers.1.bias", "output_blocks.11.1.out_layers.0.weight", "output_blocks.11.1.out_layers.0.bias", "output_blocks.11.1.out_layers.3.weight", "output_blocks.11.1.out_layers.3.bias". size mismatch for time_embed.0.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([512, 128]). size mismatch for time_embed.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for time_embed.2.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]). size mismatch for time_embed.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for input_blocks.0.0.weight: copying a param with shape torch.Size([256, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 3, 3, 3]). size mismatch for input_blocks.0.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.1.0.in_layers.0.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.1.0.in_layers.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.1.0.in_layers.2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for input_blocks.1.0.in_layers.2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.1.0.emb_layers.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]). size mismatch for input_blocks.1.0.emb_layers.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for input_blocks.1.0.out_layers.0.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.1.0.out_layers.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.1.0.out_layers.3.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for input_blocks.1.0.out_layers.3.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for input_blocks.2.0.in_layers.0.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.2.0.in_layers.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.2.0.in_layers.2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for input_blocks.2.0.in_layers.2.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.2.0.emb_layers.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]). size mismatch for input_blocks.2.0.emb_layers.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for input_blocks.2.0.out_layers.0.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.2.0.out_layers.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.2.0.out_layers.3.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for input_blocks.2.0.out_layers.3.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.4.0.in_layers.0.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.4.0.in_layers.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for input_blocks.4.0.in_layers.2.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for input_blocks.4.0.emb_layers.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]). size mismatch for input_blocks.5.0.emb_layers.1.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]). size mismatch for input_blocks.7.0.in_layers.2.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 256, 3, 3]). size mismatch for input_blocks.7.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.7.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 512]). size mismatch for input_blocks.7.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for input_blocks.7.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). 
size mismatch for input_blocks.7.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.7.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3]). size mismatch for input_blocks.7.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.7.0.skip_connection.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 256, 1, 1]). size mismatch for input_blocks.7.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.8.0.in_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.8.0.in_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.8.0.in_layers.2.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3]). size mismatch for input_blocks.8.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.8.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([768, 512]). size mismatch for input_blocks.8.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for input_blocks.8.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.8.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.8.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3]). size mismatch for input_blocks.8.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.10.0.in_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.10.0.in_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for input_blocks.10.0.in_layers.2.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 384, 3, 3]). size mismatch for input_blocks.10.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). size mismatch for input_blocks.11.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). 
size mismatch for middle_block.0.in_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.0.in_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for middle_block.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). size mismatch for middle_block.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for middle_block.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for middle_block.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1536, 512, 1]). size mismatch for middle_block.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1536]). size mismatch for middle_block.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1]). size mismatch for middle_block.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.2.in_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.2.in_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.2.in_layers.2.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for middle_block.2.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.2.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). 
size mismatch for middle_block.2.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for middle_block.2.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.2.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for middle_block.2.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for middle_block.2.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.0.in_layers.0.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.0.0.in_layers.0.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.0.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 2048, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 3, 3]). size mismatch for output_blocks.0.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). size mismatch for output_blocks.0.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.0.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for output_blocks.0.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.0.skip_connection.weight: copying a param with shape torch.Size([1024, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]). size mismatch for output_blocks.0.0.skip_connection.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.0.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1536, 512, 1]). 
size mismatch for output_blocks.0.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1536]). size mismatch for output_blocks.0.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1]). size mismatch for output_blocks.0.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.0.in_layers.0.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.1.0.in_layers.0.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.1.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 2048, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 3, 3]). size mismatch for output_blocks.1.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). size mismatch for output_blocks.1.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.1.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for output_blocks.1.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.0.skip_connection.weight: copying a param with shape torch.Size([1024, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]). size mismatch for output_blocks.1.0.skip_connection.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.1.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1536, 512, 1]). size mismatch for output_blocks.1.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1536]). size mismatch for output_blocks.1.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1]). 
size mismatch for output_blocks.1.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.0.in_layers.0.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([896]). size mismatch for output_blocks.2.0.in_layers.0.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([896]). size mismatch for output_blocks.2.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 2048, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 896, 3, 3]). size mismatch for output_blocks.2.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 512]). size mismatch for output_blocks.2.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for output_blocks.2.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for output_blocks.2.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.0.skip_connection.weight: copying a param with shape torch.Size([1024, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 896, 1, 1]). size mismatch for output_blocks.2.0.skip_connection.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.2.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1536, 512, 1]). size mismatch for output_blocks.2.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1536]). size mismatch for output_blocks.2.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1]). size mismatch for output_blocks.2.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.3.0.in_layers.0.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([896]). 
size mismatch for output_blocks.3.0.in_layers.0.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([896]). size mismatch for output_blocks.3.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 2048, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 896, 3, 3]). size mismatch for output_blocks.3.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([768, 512]). size mismatch for output_blocks.3.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for output_blocks.3.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3]). size mismatch for output_blocks.3.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.0.skip_connection.weight: copying a param with shape torch.Size([1024, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 896, 1, 1]). size mismatch for output_blocks.3.0.skip_connection.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.3.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1]). size mismatch for output_blocks.3.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for output_blocks.3.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1]). size mismatch for output_blocks.3.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.0.in_layers.0.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for output_blocks.4.0.in_layers.0.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for output_blocks.4.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 2048, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 768, 3, 3]). 
size mismatch for output_blocks.4.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([768, 512]). size mismatch for output_blocks.4.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for output_blocks.4.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3]). size mismatch for output_blocks.4.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.0.skip_connection.weight: copying a param with shape torch.Size([1024, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 768, 1, 1]). size mismatch for output_blocks.4.0.skip_connection.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.4.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1]). size mismatch for output_blocks.4.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for output_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1]). size mismatch for output_blocks.4.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.0.in_layers.0.weight: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for output_blocks.5.0.in_layers.0.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for output_blocks.5.0.in_layers.2.weight: copying a param with shape torch.Size([1024, 1536, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 640, 3, 3]). size mismatch for output_blocks.5.0.in_layers.2.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.0.emb_layers.1.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([768, 512]). 
size mismatch for output_blocks.5.0.emb_layers.1.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for output_blocks.5.0.out_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.0.out_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.0.out_layers.3.weight: copying a param with shape torch.Size([1024, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 384, 3, 3]). size mismatch for output_blocks.5.0.out_layers.3.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.0.skip_connection.weight: copying a param with shape torch.Size([1024, 1536, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 640, 1, 1]). size mismatch for output_blocks.5.0.skip_connection.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.1.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.1.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.5.1.qkv.weight: copying a param with shape torch.Size([3072, 1024, 1]) from checkpoint, the shape in current model is torch.Size([1152, 384, 1]). size mismatch for output_blocks.5.1.qkv.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for output_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([1024, 1024, 1]) from checkpoint, the shape in current model is torch.Size([384, 384, 1]). size mismatch for output_blocks.5.1.proj_out.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]). size mismatch for output_blocks.6.0.in_layers.0.weight: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for output_blocks.6.0.in_layers.0.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([640]). size mismatch for output_blocks.6.0.in_layers.2.weight: copying a param with shape torch.Size([512, 1536, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 640, 3, 3]). size mismatch for output_blocks.6.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for output_blocks.6.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]). size mismatch for output_blocks.6.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for output_blocks.6.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]). 
    size mismatch for output_blocks.6.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.6.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for output_blocks.6.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.6.0.skip_connection.weight: copying a param with shape torch.Size([512, 1536, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 640, 1, 1]).
    size mismatch for output_blocks.6.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.7.0.in_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for output_blocks.7.0.in_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for output_blocks.7.0.in_layers.2.weight: copying a param with shape torch.Size([512, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]).
    size mismatch for output_blocks.7.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.7.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]).
    size mismatch for output_blocks.7.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for output_blocks.7.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.7.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.7.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for output_blocks.7.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.7.0.skip_connection.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
    size mismatch for output_blocks.7.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.8.0.in_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for output_blocks.8.0.in_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for output_blocks.8.0.in_layers.2.weight: copying a param with shape torch.Size([512, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 384, 3, 3]).
    size mismatch for output_blocks.8.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.8.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]).
    size mismatch for output_blocks.8.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for output_blocks.8.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.8.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.8.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
    size mismatch for output_blocks.8.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.8.0.skip_connection.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 384, 1, 1]).
    size mismatch for output_blocks.8.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.9.0.in_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for output_blocks.9.0.in_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for output_blocks.9.0.in_layers.2.weight: copying a param with shape torch.Size([512, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 384, 3, 3]).
    size mismatch for output_blocks.9.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.9.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
    size mismatch for output_blocks.9.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.9.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.9.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.9.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for output_blocks.9.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.9.0.skip_connection.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 384, 1, 1]).
    size mismatch for output_blocks.9.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.10.0.in_layers.0.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.10.0.in_layers.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.10.0.in_layers.2.weight: copying a param with shape torch.Size([512, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 3, 3]).
    size mismatch for output_blocks.10.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.10.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
    size mismatch for output_blocks.10.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.10.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.10.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.10.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for output_blocks.10.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.10.0.skip_connection.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]).
    size mismatch for output_blocks.10.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.11.0.in_layers.0.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.11.0.in_layers.0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.11.0.in_layers.2.weight: copying a param with shape torch.Size([512, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 3, 3]).
    size mismatch for output_blocks.11.0.in_layers.2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.11.0.emb_layers.1.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
    size mismatch for output_blocks.11.0.emb_layers.1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for output_blocks.11.0.out_layers.0.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.11.0.out_layers.0.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.11.0.out_layers.3.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).
    size mismatch for output_blocks.11.0.out_layers.3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for output_blocks.11.0.skip_connection.weight: copying a param with shape torch.Size([512, 768, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]).
    size mismatch for output_blocks.11.0.skip_connection.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for out.0.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for out.0.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for out.2.weight: copying a param with shape torch.Size([6, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 128, 3, 3]).
    size mismatch for out.2.bias: copying a param with shape torch.Size([6]) from checkpoint, the shape in current model is torch.Size([3]).
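
    An editorial note on the log above: the doubled widths (256 vs. 128 at out.0) and the 6-vs-3 output channels at out.2 suggest the checkpoint was built with num_channels 256 and learn_sigma True, while the freshly constructed model uses 128 and learn_sigma False. The usual fix is to build the model with flags matching the checkpoint. A minimal sketch; the flag values below are inferred from the shapes, not confirmed by this issue:

        from guided_diffusion.script_util import create_model

        # Rebuild with flags matching the checkpoint's shapes (inferred, not
        # confirmed): 6 output channels => learn_sigma=True; 256-wide feature
        # maps at out.0 => num_channels=256.
        model = create_model(
            image_size=256,      # assumption: the resolution the ckpt was trained at
            num_channels=256,
            num_res_blocks=2,    # assumption
            learn_sigma=True,
        )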

    opened by mehdizemni 3
  • out of memory in FID evaluation

    out of memory in FID evaluation

    failed to allocate 64.00M (67108864 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory

    Has anyone else had GPU out-of-memory problems when running the evaluation?
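
    One possible workaround, sketched as an editorial note (the session and option names are standard TensorFlow 1.x API; whether the repo's evaluator already sets them is an assumption to verify against evaluations/evaluator.py): stop TensorFlow from pre-allocating the whole GPU before the evaluation graph is built.

        import tensorflow.compat.v1 as tf

        # Allocate GPU memory lazily instead of grabbing the whole device, and
        # optionally cap the fraction, so the eval graph can coexist with other
        # processes on the same GPU.
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        config.gpu_options.per_process_gpu_memory_fraction = 0.5  # optional hard cap
        sess = tf.Session(config=config)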

    opened by forever208 2
  • [Bugs] Classifier is conditioning on x_{t+1} instead of x_{t} and grad calculation

    [Bugs] Classifier is conditioning on x_{t+1} instead of x_{t} and grad calculation

    Hi @unixpickle @prafullasd @erinbeesley, I think I found 2 bugs:

    1. Shouldn't we pass out["mean"] (x_{t}) instead of x (x_{t+1}) here (similarly t-1 instead of t): https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L435

    2. Shouldn't we separate the grad calculation here? https://github.com/openai/guided-diffusion/blob/main/scripts/classifier_sample.py#L61 We need the gradient of the i-th image in the batch w.r.t. its own log-prob, not w.r.t. the sum of log-probs. It makes no sense to optimize the sum, since we want each image to be guided by its own class.

    I might have misunderstood something; please let me know if so! :)
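
    An editorial note on point 2: because log p(y_i | x_i) depends only on x_i, the gradient of the summed log-probs w.r.t. x_i equals the gradient of x_i's own log-prob, so the summed formulation does not mix images. A minimal, self-contained sketch (toy classifier, not repo code) that checks this:

        import torch

        # Toy batch and "classifier": log p(y_i | x_i) depends only on image i.
        x = torch.randn(4, 3, 8, 8, requires_grad=True)
        y = torch.randint(0, 10, (4,))
        w = torch.randn(3 * 8 * 8, 10)
        log_probs = torch.log_softmax(x.view(4, -1) @ w, dim=-1)
        selected = log_probs[torch.arange(4), y]  # per-image log-probs

        # Gradient of the *sum* vs. gradient of a single image's log-prob.
        grad_sum = torch.autograd.grad(selected.sum(), x, retain_graph=True)[0]
        grad_0 = torch.autograd.grad(selected[0], x)[0]
        assert torch.allclose(grad_sum[0], grad_0[0])  # identical for image 0
        assert torch.count_nonzero(grad_0[1:]) == 0    # no cross-image mixing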

    opened by gordicaleksa 2
  • Training doesn't really work

    Training doesn't really work

    When running

    python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS

    the code runs for about 2 seconds and exits without creating any log folder under /tmp. There is no error either.
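
    An editorial hint, hedged (verify against guided_diffusion/logger.py in your checkout): the logger in this family of repos falls back to a timestamped folder under /tmp when OPENAI_LOGDIR is unset, so the output may simply be landing somewhere you are not looking. Setting the variable makes the location explicit; also double-check that the shell variables in the command are actually populated, since empty flags silently fall back to defaults.

        import os

        # Point the repo's logger at an explicit directory (an assumption about
        # logger.configure(); when OPENAI_LOGDIR is unset it picks a /tmp dir).
        os.environ["OPENAI_LOGDIR"] = "/path/to/logs"
        print("logs will go to:", os.environ["OPENAI_LOGDIR"])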

    opened by baniasbaabe 2
  • Add dimension as input to create_model

    Add dimension as input to create_model

    Hi, thank you so much for open sourcing the code. I was planning to use this repo for 3D images, and although I can see that the UNet class allows 3D inputs, the "dims" keyword is not passed through the create_model function in script_util.py. If it is alright with the owners of the repo, I would like to push a change that allows this (a sketch is below). The change would keep the default dimension at 2, so no breaking changes should result.
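
    A sketch of the proposed change; the create_model signature here is abbreviated (the real one takes many more flags), so only the dims plumbing should be read literally:

        from guided_diffusion.unet import UNetModel

        # Forward a `dims` argument (1, 2, or 3) through create_model so
        # UNetModel, which already accepts it, can build e.g. 3D convolutions.
        def create_model(image_size, num_channels, num_res_blocks, dims=2, **kwargs):
            return UNetModel(
                image_size=image_size,
                in_channels=3,
                model_channels=num_channels,
                num_res_blocks=num_res_blocks,
                dims=dims,  # new: defaults to 2, so existing callers are unaffected
                **kwargs,
            )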

    opened by jamaliki 2
  • guided diffusion super resolution network training is diverging

    guided diffusion super resolution network training is diverging

    Hello everyone,

    I am working with guided diffusion. I would like to reproduce the results of the repository for the 64->256 super resolution network.

    My issue is that the upsampled images look good for the first 5000 iterations, but then the loss rapidly increases and the output is only noise from 6000 iterations on.

    The difference between my trained model and the author-provided model is that I don't do any conditioning, so I removed the classifier. The training batch then provides only the "low_res" image, not the conditioning label "y".

    What do you think is wrong here? Do you have any hints for debugging?

    Please have a look at the produced samples and loss at: https://i.imgur.com/upuzHPg.png (I couldn't embed the image here)

    EDIT: my batch size is 3, whereas the original repo seems to use a batch size of 256. This could be the culprit. As a second question: which GPU can handle such a massive batch size of 256? Is it a multi-GPU configuration? (A gradient-accumulation sketch follows below.)
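
    An editorial note on the batch-size question: the released models were trained across many GPUs (the README commands use mpiexec over N ranks), and the training loop appears to expose a --microbatch flag that splits a large logical batch into small per-step chunks; verify in guided_diffusion/train_util.py. The same idea as a generic, self-contained PyTorch sketch (toy model, not repo code):

        import torch
        from torch import nn

        # Toy stand-ins: in the repo the model is the UNet and the loss comes
        # from the diffusion training objective.
        model = nn.Linear(16, 16)
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

        effective_batch, micro = 256, 4
        accum_steps = effective_batch // micro  # 64 micro-batches per optimizer step

        optimizer.zero_grad()
        for _ in range(accum_steps):
            x = torch.randn(micro, 16)
            loss = ((model(x) - x) ** 2).mean() / accum_steps  # rescale to keep the mean
            loss.backward()  # gradients accumulate across micro-batches
        optimizer.step()
        optimizer.zero_grad()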

    My goal is to produce a solid baseline to work on.

    Thanks in advance to anyone who helps,

    Stefano

    opened by stsavian 1
  • Question about training on multi GPUs

    Question about training on multi GPUs

    Hi, I train on multiple GPUs with mpiexec -n 8 python scripts/image_train.py. However, I ran into an issue where the checkpoint could not be loaded onto all GPUs, and I hit OOM when fine-tuning the 512*512 model. I didn't encounter this problem when training on a single GPU. Could you advise what I should do?
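
    One common cause, offered as an editorial guess rather than a confirmed diagnosis: every MPI rank deserializes the checkpoint straight onto the same GPU. Loading to CPU first and then moving to each rank's own device avoids the spike; check how guided_diffusion/dist_util.py distributes the state dict in your version. A minimal sketch with a stand-in model:

        import torch
        from torch import nn

        model = nn.Linear(8, 8)  # stand-in for the UNet
        local_rank = 0           # in the repo, derived from the MPI rank

        torch.save(model.state_dict(), "model.pt")  # pretend this is the 512x512 ckpt

        # Deserialize on CPU so 8 ranks don't all land on GPU 0 at once, then
        # move the weights to this rank's own device.
        state_dict = torch.load("model.pt", map_location="cpu")
        model.load_state_dict(state_dict)
        if torch.cuda.is_available():
            model.to(torch.device("cuda", local_rank))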

    opened by YUHANG-Ma 1
  • About the gpu memory usage of the training process

    About the gpu memory usage of the training process

    Hello, I have previously reproduced the basic DDPM experiment. With 128*128 images and a batch size of 2, GPU memory usage was about 18 GB; with your code and the same settings, usage is about 5 GB. Is this normal, or might something be wrong with my settings?

    opened by zy-charon 0
  • Parameters for Diffusion Model and Classifier Training

    Parameters for Diffusion Model and Classifier Training

    Hello, I would like to ask about the parameters in the diffusion model and classifier: predict_xstart, rescale_timesteps, rescale_learned_sigmas, classifier_use_scale_shift_norm, classifier_resblock_updown, classifier_pool, use_checkpoint, use_scale_shift_norm, resblock_updown, use_fp16, use_new_attention_order, etc. What do they mean? How should they be set during training? (A quick way to print the defaults is sketched below.)
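
    An editorial sketch for inspecting every flag and its default in one place (the function names come from guided_diffusion/script_util.py; treat the exact names as an assumption for your checkout):

        from guided_diffusion.script_util import (
            classifier_defaults,
            model_and_diffusion_defaults,
        )

        # Print every flag with its default; each key corresponds to a --flag name.
        defaults = {**model_and_diffusion_defaults(), **classifier_defaults()}
        for name, value in defaults.items():
            print(f"{name} = {value}")

    Roughly, and hedged from the paper and the improved-diffusion parent repo: use_fp16 and use_checkpoint trade precision/compute for memory; resblock_updown, use_scale_shift_norm, use_new_attention_order, and the classifier_* mirrors are the architecture choices ablated in the paper; predict_xstart, rescale_timesteps, and rescale_learned_sigmas control the diffusion parameterization and loss scaling inherited from improved-diffusion.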

    opened by zy-charon 0
  • After training the sample pictures get some weird color tints

    After training the sample pictures get some weird color tints

    Sometimes in training I get some weird color casts in my pictures, while the original data has no tints at all. Is there a reason for this, and how can I avoid it?

    Original data is like: (image attachment 40SMA0013_1Y5_01)

    And the output is: (image attachment)

    opened by benjamin-bertram 2
  • How is the diffusion model used in the detection task

    How is the diffusion model used in the detection task

    Hello, I would like to ask which weight file I should download if I use the diffusion model in a detection task, or whether I need several GPUs to train the diffusion model myself.

    opened by QY1994-0919 0
Owner
OpenAI