Activating More Pixels in Image Super-Resolution Transformer

Overview

HAT [Paper Link]

Activating More Pixels in Image Super-Resolution Transformer

Xiangyu Chen, Xintao Wang, Jiantao Zhou and Chao Dong

BibTeX

@article{chen2022activating,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

Environment

Installation

pip install -r requirements.txt
python setup.py develop

How To Test

  • Refer to ./options/test for the configuration file of the model to be tested, and prepare the testing data and pretrained model.
  • The pretrained models are available at Google Drive or Baidu Netdisk (access code: qyrl).
  • Then run the following command (taking HAT_SRx4_ImageNet-pretrain.pth as an example):
python hat/test.py -opt options/test/HAT_SRx4_ImageNet-pretrain.yml

The testing results will be saved in the ./results folder.

Results

The inference results on benchmark datasets are available at Google Drive or Baidu Netdisk (access code: 63p5).

This repo is still being updated. The training codes will be released soon.

Comments
  • Add Replicate demo and API


    Hey @Xiangtaokong ! 👋

    This pull request makes it possible to run your model inside a Docker environment, which makes it easier for other people to run it. We're using an open source tool called Cog to make this process easier.

    This also means we can make a web page where other people can run your model! We have added HAT_SRx4_ImageNet for SingleImageDataset so people can easily test their own input images. View it here: https://replicate.com/cjwbw/hat

    Replicate also has an API, so people can easily run your model from their code:

    import replicate
    model = replicate.models.get("cjwbw/hat")
    output = model.predict(image="...")
    

    You are more than welcome to modify the Replicate page (e.g. Example Gallery), let me know and I can transfer ownership to your account.

    In case you're wondering who I am, I'm from Replicate, where we're trying to make machine learning reproducible. We got frustrated that we couldn't run all the really interesting ML work being done. So, we're going round implementing models we like. 😊

    opened by chenxwh 7
  • How to get the same result across multiple runs?

    Hi, thanks for your sharing and contribution!

    I tried to reproduce the same training loss on my custom dataset across runs, but it didn't work.

    So, I wonder whether HAT can return exactly the same training loss from run to run.

    Any help would be much appreciated, thanks.

    My environment

    • windows 10
    • python : 3.7.13
    • pytorch : 1.12.1+cu113
    • torchvision : 0.13.1+cu113
    • cuda : 11.3
    • cudnn : 8.4.1
    • basicsr : both 1.3.4.9 and 1.4.2 (latest version)

    Methods I tried

    • use_hflip = False
    • use_rot = False
    • use_shuffle = False
    • num_worker_per_gpu = 0
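For reference, the config flags above control data loading, but run-to-run reproducibility also needs every RNG seeded and deterministic kernels selected. A minimal sketch in plain PyTorch (this is generic PyTorch seeding, not something HAT exposes; bitwise-identical losses on GPU additionally depend on the cuDNN/CUDA versions in use):

```python
import os
import random

import numpy as np
import torch

def seed_everything(seed: int = 0) -> None:
    """Seed every RNG that affects training and prefer deterministic kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)           # seeds the CPU (and, transitively, CUDA) RNG
    torch.cuda.manual_seed_all(seed)  # explicit for multi-GPU setups
    torch.backends.cudnn.deterministic = True  # pick deterministic conv kernels
    torch.backends.cudnn.benchmark = False     # disable kernel auto-tuning
    # Needed for deterministic cuBLAS matmuls on CUDA >= 10.2.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

# Same seed, same random tensors on subsequent calls:
seed_everything(42)
first = torch.rand(4)
seed_everything(42)
second = torch.rand(4)
```

With DataLoader workers, a seeded `generator` and `worker_init_fn` are also needed so shuffling and augmentation stay reproducible across runs.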
    opened by Dongwoo-Im 5
  • out of memory on testing


    I set the batch size to 2 during training and it works fine. I am using a single V100 16 GB GPU. However, when I tried to test, I got a CUDA "out of memory" error:

    2022-09-21 08:54:33,454 INFO: Model [HATModel] is created.
    2022-09-21 08:54:33,455 INFO: Testing open...
    Traceback (most recent call last):
      File "/mnt/disk2/HAT/hat/test.py", line 11, in <module>
        test_pipeline(root_path)
      File "/home/ubuntu/venv/lib/python3.10/site-packages/basicsr/test.py", line 40, in test_pipeline
        model.validation(test_loader, current_iter=opt['name'], tb_logger=None, save_img=opt['val']['save_img'])
      File "/home/ubuntu/venv/lib/python3.10/site-packages/basicsr/models/base_model.py", line 48, in validation
        self.nondist_validation(dataloader, current_iter, tb_logger, save_img)
      File "/home/ubuntu/venv/lib/python3.10/site-packages/basicsr/models/sr_model.py", line 157, in nondist_validation
        self.test()
      File "/mnt/disk2/HAT/hat/models/hat_model.py", line 29, in test
        self.output = self.net_g(img)
      File "/home/ubuntu/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/mnt/disk2/HAT/hat/archs/hat_arch.py", line 978, in forward
        x = self.conv_after_body(self.forward_features(x)) + x
      File "/mnt/disk2/HAT/hat/archs/hat_arch.py", line 964, in forward_features
        x = layer(x, x_size, params)
      File "/home/ubuntu/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/mnt/disk2/HAT/hat/archs/hat_arch.py", line 619, in forward
        return self.patch_embed(self.conv(self.patch_unembed(self.residual_group(x, x_size, params), x_size))) + x
      File "/home/ubuntu/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/mnt/disk2/HAT/hat/archs/hat_arch.py", line 530, in forward
        x = self.overlap_attn(x, x_size, params['rpi_oca'])
      File "/home/ubuntu/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/mnt/disk2/HAT/hat/archs/hat_arch.py", line 425, in forward
        attn = attn + relative_position_bias.unsqueeze(0)
    RuntimeError: CUDA out of memory. Tried to allocate 3.38 GiB (GPU 0; 15.78 GiB total capacity; 6.44 GiB already allocated; 1.18 GiB free; 13.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

    How can I solve it?

    opened by ziippy 4
  • I don't understand

    Every image I've tried to upscale with HAT just seems to stretch the image to a larger, even blurrier size. I've tried all sorts of sizes ranging from 64x64 to 1024x1024. It seems to just click-and-drag the image larger for me without actually enhancing anything.

    Am I doing something wrong? I'd love to be able to use this project, but right now it's very confusing to me. :/

    opened by dillfrescott 3
  • Why does the loss not converge?

    I'm training with DF2K and using 4 GPUs.

    I'm also referring to the "train_HAT_SRx4_finetune_from_ImageNet_pretrain.yml" file.

    I just changed dataroot_gt and dataroot_lq for train and val, and also changed num_worker_per_gpu and batch_size_per_gpu like this:

    # num_worker_per_gpu: 6
    # batch_size_per_gpu: 4
    num_worker_per_gpu: 3
    batch_size_per_gpu: 8
    

    But after 80,000 iterations, the l_pix loss has not converged:

    2022-09-19 20:15:50,914 INFO: [train..][epoch:738, iter: 79,000, lr:(1.000e-05,)] [eta: 6 days, 12:51:28, time (data): 3.029 (0.004)] l_pix: 2.1359e-02
    2022-09-19 20:15:50,915 INFO: Saving models and training states.
    2022-09-19 20:21:18,215 INFO: [train..][epoch:739, iter: 79,100, lr:(1.000e-05,)] [eta: 6 days, 12:45:51, time (data): 3.218 (0.341)] l_pix: 3.3533e-02
    2022-09-19 20:26:43,030 INFO: [train..][epoch:740, iter: 79,200, lr:(1.000e-05,)] [eta: 6 days, 12:40:10, time (data): 2.068 (0.031)] l_pix: 1.9019e-02
    2022-09-19 20:32:16,706 INFO: [train..][epoch:741, iter: 79,300, lr:(1.000e-05,)] [eta: 6 days, 12:34:47, time (data): 3.264 (0.337)] l_pix: 1.9070e-02
    2022-09-19 20:37:43,115 INFO: [train..][epoch:742, iter: 79,400, lr:(1.000e-05,)] [eta: 6 days, 12:29:08, time (data): 3.401 (0.004)] l_pix: 1.7958e-02
    2022-09-19 20:42:36,323 INFO: [train..][epoch:742, iter: 79,500, lr:(1.000e-05,)] [eta: 6 days, 12:22:19, time (data): 2.954 (0.020)] l_pix: 1.5392e-02
    2022-09-19 20:48:30,628 INFO: [train..][epoch:743, iter: 79,600, lr:(1.000e-05,)] [eta: 6 days, 12:17:40, time (data): 3.378 (0.003)] l_pix: 2.8961e-02
    2022-09-19 20:53:45,430 INFO: [train..][epoch:744, iter: 79,700, lr:(1.000e-05,)] [eta: 6 days, 12:11:37, time (data): 3.156 (0.225)] l_pix: 3.7259e-02
    2022-09-19 20:59:13,519 INFO: [train..][epoch:745, iter: 79,800, lr:(1.000e-05,)] [eta: 6 days, 12:06:02, time (data): 3.902 (0.031)] l_pix: 2.7916e-02
    2022-09-19 21:04:49,328 INFO: [train..][epoch:746, iter: 79,900, lr:(1.000e-05,)] [eta: 6 days, 12:00:44, time (data): 3.374 (0.410)] l_pix: 2.1746e-02
    2022-09-19 21:10:27,211 INFO: [train..][epoch:747, iter: 80,000, lr:(1.000e-05,)] [eta: 6 days, 11:55:30, time (data): 3.748 (0.094)] l_pix: 2.1582e-02
    2022-09-19 21:10:27,213 INFO: Saving models and training states.
    2022-09-19 21:22:35,811 INFO: Validation open
    # psnr: 20.3545  Best: 20.3660 @ 65000 iter
    # ssim: 0.4768   Best: 0.4769 @ 65000 iter

    2022-09-19 21:27:52,322 INFO: [train..][epoch:748, iter: 80,100, lr:(1.000e-05,)] [eta: 6 days, 12:15:17, time (data): 3.176 (0.366)] l_pix: 2.4691e-02
    2022-09-19 21:33:13,818 INFO: [train..][epoch:749, iter: 80,200, lr:(1.000e-05,)] [eta: 6 days, 12:09:25, time (data): 3.303 (0.093)] l_pix: 2.2727e-02
    2022-09-19 21:38:51,310 INFO: [train..][epoch:750, iter: 80,300, lr:(1.000e-05,)] [eta: 6 days, 12:04:08, time (data): 3.374 (0.419)] l_pix: 1.5810e-02
    2022-09-19 21:44:40,636 INFO: [train..][epoch:751, iter: 80,400, lr:(1.000e-05,)] [eta: 6 days, 11:59:15, time (data): 3.433 (0.393)] l_pix: 1.9958e-02
    2022-09-19 21:50:00,407 INFO: [train..][epoch:752, iter: 80,500, lr:(1.000e-05,)] [eta: 6 days, 11:53:20, time (data): 3.198 (0.192)] l_pix: 2.1157e-02
    2022-09-19 21:55:30,407 INFO: [train..][epoch:753, iter: 80,600, lr:(1.000e-05,)] [eta: 6 days, 11:47:47, time (data): 3.248 (0.231)] l_pix: 2.8304e-02
    2022-09-19 22:00:58,110 INFO: [train..][epoch:754, iter: 80,700, lr:(1.000e-05,)] [eta: 6 days, 11:42:09, time (data): 3.279 (0.391)] l_pix: 2.4832e-02
    2022-09-19 22:06:35,306 INFO: [train..][epoch:755, iter: 80,800, lr:(1.000e-05,)] [eta: 6 days, 11:36:50, time (data): 3.326 (0.384)] l_pix: 2.9092e-02
    2022-09-19 22:12:15,613 INFO: [train..][epoch:756, iter: 80,900, lr:(1.000e-05,)] [eta: 6 days, 11:31:38, time (data): 3.429 (0.409)] l_pix: 2.6695e-02
    2022-09-19 22:17:41,607 INFO: [train..][epoch:757, iter: 81,000, lr:(1.000e-05,)] [eta: 6 days, 11:25:57, time (data): 3.343 (0.408)] l_pix: 3.1762e-02
    2022-09-19 22:17:41,609 INFO: Saving models and training states.

    Do you know why the loss does not converge?

    The attached file is my .yaml file.

    Please advise. train_HAT_SRx4_my_others_to_open.yml--.log

    opened by ziippy 3
  • Can you tell me how to preprocess images in HAT?


    Thank you for sharing your code.

    It appears to use BGR images during preprocessing.

    Can I know the other preprocessing steps besides this (e.g., dividing by 255)?
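For context, here is a sketch of the usual BasicSR-style preprocessing (an assumption based on BasicSR conventions, not a copy of HAT's code): images are read as uint8 BGR via OpenCV, divided by 255, converted BGR to RGB, and transposed to CHW before becoming tensors:

```python
import numpy as np

def preprocess_bgr(img_bgr: np.ndarray) -> np.ndarray:
    """uint8 BGR (H, W, 3) -> float32 RGB (3, H, W) in [0, 1]."""
    img = img_bgr.astype(np.float32) / 255.0              # scale to [0, 1]
    img = img[:, :, ::-1]                                 # BGR -> RGB
    return np.ascontiguousarray(img.transpose(2, 0, 1))   # HWC -> CHW

# A 2x2 image that is pure blue in OpenCV's BGR channel order:
blue_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
blue_bgr[:, :, 0] = 255  # channel 0 is Blue in BGR
chw = preprocess_bgr(blue_bgr)
```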

    opened by saeu5407 2
  • Questions related to pre-trained dataset ImageNet


    Thank you for your outstanding work! I have some questions about the ImageNet dataset that I would like to ask you.

    1. I find it a bit confusing that in the pre-training yml file there are only GT files about ImageNet and no LR files.
    2. There are some small images (e.g. smaller than 256x256) in the ImageNet dataset; how do you handle these images?
    3. By the way, can you give me an overview of the ImageNet dataset preparation? Looking forward to your reply!
    opened by GoPikachue 2
  • Commented out "Tile n/n"

    When I run test.py, some additional information is displayed, like this:

    2022-09-25 20:49:05,012 INFO: Testing open... Tile 1/4 Tile 2/4 Tile 3/4 Tile 4/4 Tile 1/4 Tile 2/4 Tile 3/4 Tile 4/4 Tile 1/4 Tile 2/4 Tile 3/4 Tile 4/4 Tile 1/4 Tile 2/

    I think commenting out this "Tile n/n" log would be better.

    opened by ziippy 2
  • Training Error. Could you provide some guidance on how to fix this error?


    FileNotFoundError: [Errno 2] No such file or directory: '/qfs/projects/mage/watk681/DIV2K_train_HR/DIV2K_train_HR/002116_s044.png'

    However, DIV2K/DIV2K_train_HR/ uses 0001.png, 0002.png, ..., 0800.png. Any guidance on how to generate a compatible meta_info_DF2Ksub_GT.txt that works with those filenames?

    opened by yazidoudou18 1
  • About position encoding and attention mask


    Hello,

    Thanks for your great work!

    What is the difference in implementing the position encoding and attention mask in overlapped cross-window attention? I mean that overlapped cross-window attention is different from the vanilla one, since the window sizes of Q and K are different, and I think using the original RPE and attention mask does not make sense.

    Could you please give me some hints? Thanks in advance.

    opened by mrluin 1
  • network output image code


    Hello, I have questions about the network's output-image code. Where is the code that produces the final output image of the network? I want to know how the network saves the image. Why is there only ['save_img'] in the code, but no code that actually saves the image?

    opened by 7linlin 1
  • Rather than always going from 64 to up, can we process 640x480 images as is?


    Rather than always going from 64 to up, can we process 640x480 images as is?

    I want to test how well it does without having to shrink the image to 64 in height.

    opened by H19012 1
  • Can you modify for train_1, train_2 ... like val_1, val_2 ...?


    I want to train with separate folders like train_1, train_2.

    I checked that the .yaml supports multiple validation directories like val_1, val_2, but train_1, train_2 is not supported. Is there any plan to support that?

    My GT is split into separate folders by domain (road, bird, car, and so on), so if you can, please add support for this.

    opened by ziippy 1
  • Does HAT have tile mode?


    Hello, congratulations on getting the highest score in the Papers with Code benchmark summary. I was curious about HAT, so I installed it.

    I ran inference on an image:

    !python hat/test.py -opt options/test/HAT_SRx4_ImageNet-LR.yml

    But it failed with a CUDA out-of-memory error. Sorry, my VRAM is only 2 GB. I will try Google Colab, but it has limited time and GPU quota.

    Could HAT have a feature for tiling the image? For example: !python hat/test.py -opt options/test/HAT_SRx4_ImageNet-LR.yml --tile 256
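A --tile flag like that is hypothetical, but the idea behind tiled inference is simple: split the LR image into tiles, super-resolve each tile independently (bounding peak GPU memory), and paste the results into the full-size output. A minimal sketch with a pluggable upscaler:

```python
import numpy as np

def upscale_tiled(img, scale, tile, upscale_fn):
    """Upscale `img` (H, W, C) tile by tile to bound peak memory.
    `upscale_fn` maps an (h, w, C) patch to an (h*scale, w*scale, C) patch;
    in practice it would wrap the network's forward pass."""
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = img[y:y + tile, x:x + tile]
            ph, pw = patch.shape[:2]
            out[y * scale:(y + ph) * scale,
                x * scale:(x + pw) * scale] = upscale_fn(patch)
    return out
```

The "Tile n/n" logging mentioned in an earlier issue suggests the repo's test script already has such a mode via the config file. Real implementations also pad each tile with overlap and crop the seams, since attention windows near tile borders otherwise lack context.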

    opened by ichsan2895 6