traiNNer is an open source image and video restoration (super-resolution, denoising, deblurring and others) and image to image translation toolbox based on PyTorch.

Last update: Jan 4, 2023

Related tags

Deep Learning convolutional-neural-networks pix2pix super-resolution upscale image-restoration deblurring denoising cyclegan esrgan cartoonization srflow real-esrgan bsrgan real-sr

Overview

traiNNer

Here you will find: boilerplate code for training and testing computer vision (CV) models, different methods and strategies integrated in a single pipeline, and the modularity to add and remove components as needed, including new network architectures and templates for different training strategies. The code changes constantly, so if you find a bug or other problem, please open an issue, start a discussion or ask for help in one of the Discord channels.

Unlike other repositories, the focus here is not only on reproducing the results of previous papers, but also on enabling more people to train their own models more easily with their own custom datasets, and on integrating new ideas that increase the performance of the models. For these reasons, much of the code is written to automatically fix potential issues whenever possible.

Details of the currently supported architectures can be found here.

For a changelog and general list of features of this repository, check here.

Dependencies
Codes
Usage
Pretrained models
Datasets
How to help

Dependencies

Python 3 (Anaconda is recommended)
PyTorch >= 0.4.0. PyTorch >= 1.7.0 required to enable certain features (SWA, AMP, others), as well as torchvision.
NVIDIA GPU + CUDA
Python packages: pip install numpy opencv-python
JSON files can be used for the configuration option files, but in order to use YAML, the PyYAML python package is also a dependency: pip install PyYAML
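As a rough illustration of the JSON/YAML distinction above, the sketch below picks a parser from the file extension. The function name `load_options` is hypothetical and this is not the repository's actual option loader; only `json.load` and PyYAML's `yaml.safe_load` are real library calls.

```python
import json
from pathlib import Path


def load_options(path):
    """Parse an options file, choosing the parser from the file extension.

    Hypothetical helper for illustration; not the repository's own loader.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    with path.open("r", encoding="utf-8") as f:
        if suffix == ".json":
            return json.load(f)
        if suffix in (".yml", ".yaml"):
            try:
                import yaml  # provided by the optional PyYAML package
            except ImportError as exc:
                raise RuntimeError(
                    "YAML options need PyYAML: pip install PyYAML"
                ) from exc
            return yaml.safe_load(f)
    raise ValueError(f"Unsupported options format: {suffix}")
```

Importing `yaml` lazily keeps JSON-only setups working without PyYAML installed, which matches the way the dependency is described above.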

Optional Dependencies

Python package: pip install tensorboardX, for visualizing curves.
Python package: pip install lmdb, for lmdb database support.
Python package: pip install scipy to use CEM.
Python package: pip install Pillow to use as an alternative image backend (default is OpenCV).
Python package: pip install joblib to train White-box Cartoonization (WBC) models.
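To see which of these optional packages are present in an environment, something like the following sketch could be used. The mapping and the function name `check_optional_packages` are illustrative assumptions; the import names are the usual ones for these packages (for example, Pillow is imported as `PIL`).

```python
from importlib.util import find_spec

# Import name -> feature it enables, following the list above.
OPTIONAL_PACKAGES = {
    "tensorboardX": "visualizing curves",
    "lmdb": "lmdb database support",
    "scipy": "CEM",
    "PIL": "Pillow image backend (alternative to OpenCV)",
    "joblib": "White-box Cartoonization (WBC) training",
}


def check_optional_packages():
    """Return {import_name: True/False} for each optional dependency."""
    return {name: find_spec(name) is not None for name in OPTIONAL_PACKAGES}


if __name__ == "__main__":
    for name, available in check_optional_packages().items():
        status = "found" if available else "missing"
        print(f"{name:<14}{status:<9}{OPTIONAL_PACKAGES[name]}")
```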

Codes

This repository is a full framework for training different kinds of networks, with multiple enhancements and options. In ./codes you will find a more detailed explanation of the code framework.

You will also find:

Some useful scripts. More details in ./codes/scripts.
Evaluation codes, e.g., PSNR/SSIM metric.
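For readers unfamiliar with the PSNR metric mentioned above, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE). It works on grayscale images given as nested lists and is only an illustration; the repository's own evaluation code may differ in details such as color handling or border cropping.

```python
import math


def psnr(img_a, img_b, max_value=255.0):
    """PSNR in dB between two equally sized images given as nested lists."""
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("Images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the two images are closer; identical images give an infinite PSNR.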

Additionally, it is complemented by other repositories like DLIP, which can be used to extract estimated kernels and noise patches from real images, using a modified KernelGAN and patch extraction code. Detailed instructions about how to use the estimated kernels are available here.

Usage

Training

Data and model preparation

In order to train your own models, you will need to create a dataset of images and prepare those images, taking into account both I/O constraints and the task the model should target. Detailed data preparation can be seen in codes/data.

Pretrained models that can be used for fine-tuning are available.

Detailed instructions on how to train are also available.

Augmentation strategies for training real-world models (blind SR) like Real-SR, BSRGAN and Real-ESRGAN are provided via presets that define the blur, resizing and noise configurations, but many more augmentations are available to define custom training strategies.

How to Test

For simple testing

The recommended way to get started with models produced by the training code in this repository is to download the pretrained models you want to test and run them in the companion repository iNNfer, which is dedicated to model inference.

You can also use a GUI (for ESRGAN models, for video) or a smaller repo for inference (for ESRGAN, for video).

If you want results that automatically include evaluation metrics, you can also run inference on batches of images, with some additional options, by following the instructions in how to test.

Pretrained models

The most recent community pretrained models can be found in the Wiki, Discord channels (game upscale and animation upscale) and nmkd's models.

For more details about the original and experimental pretrained models, please see pretrained models.

You can put the downloaded models in the default experiments/pretrained_models directory and use them in the options files with the corresponding network architectures.

Model interpolation

Models that were trained from the same pretrained model, or that are derivatives of the same pretrained model, can be interpolated to combine the properties of both. The original author demonstrated this by interpolating the PSNR pretrained model (which is not perceptually good, but results in smooth images) with the resulting ESRGAN models, which have more detail that is sometimes excessive, in order to control the balance in the resulting images. Interpolating the models, instead of interpolating the output images of both models, gives much better results.

The capabilities of linearly interpolating models are also explored in "DNI: Deep Network Interpolation for Continuous Imagery Effect Transition" (CVPR19), with very interesting results and examples. The script for interpolation can be found in the net_interp.py file. This is an alternative way to create new models without additional training, and also to create pretrained models for easier fine-tuning. Below is an example of interpolating between a PSNR-oriented and a perceptual ESRGAN model (first row), and examples of interpolating CycleGAN style transfer models.

More details and explanations of interpolation can be found here in the Wiki.
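The linear interpolation described in this section can be sketched as a per-parameter weighted average of two checkpoints. The function below is a hypothetical illustration, not the contents of net_interp.py: it assumes both models expose a mapping from parameter names to values, and the toy values here are plain floats standing in for weight tensors.

```python
def interpolate_state_dicts(state_a, state_b, alpha):
    """Return (1 - alpha) * state_a + alpha * state_b, key by key.

    Values only need to support multiplication by a float and addition.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if state_a.keys() != state_b.keys():
        raise KeyError("Both models must share the same parameter names")
    return {
        key: (1.0 - alpha) * state_a[key] + alpha * state_b[key]
        for key in state_a
    }


# Toy stand-ins for two checkpoints; real values would be weight tensors.
psnr_model = {"conv.weight": 0.0, "conv.bias": 2.0}
gan_model = {"conv.weight": 1.0, "conv.bias": 4.0}
blended = interpolate_state_dicts(psnr_model, gan_model, alpha=0.25)
print(blended)  # {'conv.weight': 0.25, 'conv.bias': 2.5}
```

With alpha = 0 the result equals the first model and with alpha = 1 the second, so alpha acts as the knob that trades smoothness against detail.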

Datasets

Many datasets are publicly available and used to train models in a way that can be benchmarked and compared with other models. You are also able to create your own datasets with your own images.

Any dataset can be augmented to expose the model to information that might not be available in the images, such as noise and blur. For this reason, a data augmentation pipeline has been added to the options in this repository. It is also possible to add other types of augmentations, such as Batch Augmentations, which are applied to minibatches instead of single images. Lastly, if your dataset is small, you can make use of Differentiable Augmentations to allow the discriminator to extract more information from the available images and train better models. More information can be found in the augmentations document.
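To make the idea of noise and blur augmentation concrete, here is a small dependency-free sketch on a grayscale image stored as nested lists. The function names, the 3x3 mean filter and the blur-then-noise order are assumptions chosen for illustration; the repository's augmentation pipeline is configured through its options files and is not reproduced here.

```python
import random


def add_gaussian_noise(img, sigma, rng=None):
    """Add zero-mean Gaussian noise to a grayscale image, clipped to [0, 255]."""
    rng = rng or random.Random()
    return [
        [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in img
    ]


def box_blur(img):
    """Blur with a 3x3 mean filter; borders average only the valid neighbours."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def degrade(img, sigma=10.0, rng=None):
    """Blur first, then add noise: one simple degradation order among many."""
    return add_gaussian_noise(box_blur(img), sigma, rng)
```

Passing a seeded `random.Random` instance makes the degradation reproducible, which helps when comparing training runs.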

How to help

There are multiple ways to help this project. The first one is by using it and trying to train your own models. You can open an issue if you find any bugs or start a discussion if you have ideas, questions or would like to showcase your results.

If you would like to contribute by adding or fixing code, you can do so by cloning this repo and creating a PR. Ideally, a PR should be precise and not change many parts of the code at the same time, so it can be reviewed and tested. If possible, open an issue or discussion prior to creating the PR and we can talk about any ideas.

You can also join the discord servers and share results and questions with other users.

Lastly, after it has been suggested many times, there are now options to donate to show your support for the project and help steer it in directions that will make it even more useful. Below you will find the options that were suggested.

Patreon

Bitcoin Address: 1JyWsAu7aVz5ZeQHsWCBmRuScjNhCEJuVL

Ethereum Address: 0xa26AAb3367D34457401Af3A5A0304d6CbE6529A2

Additional Help

If you have any questions, we have a couple of discord servers (game upscale and animation upscale) where you can ask them and a Wiki with more information.

Acknowledgement

Code architecture is originally inspired by pytorch-cyclegan and the first version of BasicSR.

Comments

Correct usage of lmdb

I used create_lmdb.py to create both my LR and HR datasets, and I was wondering how I should configure my options file. Do the settings differ from using HR/LR image folders?
bug

opened by N0manDemo 11
[Suggestion]: Relativistic GAN Type

I was reading about the GAN types (Vanilla, LSGAN, and WGAN-GP) already included in BasicSR, and I found a new type that may bring a sizable performance increase to the discriminators used in upscaling methods like ESRGAN and PPON.

https://arxiv.org/abs/1807.00734

This paper outlines the idea behind a relativistic discriminator and showcases new variants of existing GANs that were created to use this approach. There is also source code available: https://www.github.com/AlexiaJM/RelativisticGAN

The one that stood out to me was RaLSGAN.

It performs better than the other variants in most tests involving generating images that are 128x128 or less. When it comes to SGAN (Standard GAN), it outperforms this variant by a large margin.

Interested to hear your thoughts on this,

N0man
question

opened by N0manDemo 9
FutureWarning and UserWarning

D:\traiNNer\codes\models\base_model.py:921: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  self.grad_clip(
C:\Python39\lib\site-packages\torch\optim\lr_scheduler.py:129: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "

How can I fix this?

opened by cheuS1-n 5
TypeError: 'NoneType' object cannot be interpreted as an integer

Hello. I am training on Colab and I get the following error.

export CUDA_VISIBLE_DEVICES=0
20-12-30 03:59:48.658 - INFO:
  name: ftrainer
  use_tb_logger: False
  model: srragan
  scale: 8
  batch_multiplier: 1
  gpu_ids: [0]
  datasets:[
    train:[
      name: Dataset
      mode: LRHROTF
      dataroot_HR: ['/content/datasets/set0/train/hr', '/content/datasets/set1/train/hr', '/content/datasets/set2/train/hr']
      dataroot_LR: ['/content/datasets/set0/train/lr', '/content/datasets/set1/train/lr', '/content/datasets/set2/train/lr']
      subset_file: None
      use_shuffle: True
      n_workers: 4
      batch_size: 100
      HR_size: 128
      phase: train
      scale: 8
      data_type: img
      virtual_batch_size: 100
    ]
    val:[
      name: Validation
      mode: LRHROTF
      dataroot_HR: ['/content/datasets/set0/val/hr', '/content/datasets/set1/val/hr', '/content/datasets/set2/val/hr']
      dataroot_LR: ['/content/datasets/set0/val/lr', '/content/datasets/set1/val/lr', '/content/datasets/set2/val/lr']
      phase: val
      scale: 8
      data_type: img
    ]
  ]
  path:[
    root: /content/BasicSR/
    pretrain_model_G: ../experiments/pretrained_models/Restart.pth
    experiments_root: /content/BasicSR/experiments/ftrainer
    models: /content/BasicSR/experiments/ftrainer/models
    training_state: /content/BasicSR/experiments/ftrainer/training_state
    log: /content/BasicSR/experiments/ftrainer
    val_images: /content/BasicSR/experiments/ftrainer/val_images
  ]
  network_G:[
    which_model_G: RRDB_net
    norm_type: None
    mode: CNA
    nf: 64
    nb: 23
    in_nc: 3
    out_nc: 3
    gc: 32
    group: 1
    convtype: Conv2D
    net_act: leakyrelu
    scale: 8
  ]
  network_D:[
    which_model_D: discriminator_vgg
    norm_type: batch
    act_type: leakyrelu
    mode: CNA
    nf: 64
    in_nc: 3
  ]
  train:[
    lr_G: 0.0001
    lr_D: 0.0001
    use_frequency_separation: False
    lr_scheme: MultiStepLR
    lr_steps: [50000, 100000, 200000, 300000]
    lr_gamma: 0.5
    pixel_criterion: l1
    pixel_weight: 0.01
    feature_criterion: l1
    feature_weight: 1
    gan_type: vanilla
    gan_weight: 0.005
    manual_seed: 0
    niter: 500000.0
    val_freq: 100
    overwrite_val_imgs: None
    val_comparison: None
  ]
  logger:[
    print_freq: 100
    save_checkpoint_freq: 100.0
    backup_freq: 100
    overwrite_chkp: None
  ]
  is_train: True

20-12-30 03:59:48.658 - INFO: Random seed: 0
20-12-30 03:59:48.716 - INFO: Dataset [LRHRDataset - Dataset] is created.
20-12-30 03:59:48.716 - INFO: Number of train images: 1,307, iters: 14
20-12-30 03:59:48.716 - INFO: Total epochs needed: 35715 for iters 500,000
20-12-30 03:59:48.719 - INFO: Dataset [LRHRDataset - Validation] is created.
20-12-30 03:59:48.719 - INFO: Number of val images in [Validation]: 358
20-12-30 03:59:48.752 - INFO: AMP library available
Traceback (most recent call last):
  File "train.py", line 256, in <module>
    main()
  File "train.py", line 98, in main
    model = create_model(opt)
  File "/content/BasicSR/codes/models/__init__.py", line 26, in create_model
    m = M(opt)
  File "/content/BasicSR/codes/models/SRRaGAN_model.py", line 51, in __init__
    self.netG = networks.define_G(opt).to(self.device) # G
  File "/content/BasicSR/codes/models/networks.py", line 160, in define_G
    finalact=opt_net['finalact'], gaussian_noise=opt_net['gaussian'], plus=opt_net['plus'], nr=opt_net['nr'])
  File "/content/BasicSR/codes/models/modules/architectures/RRDBNet_arch.py", line 26, in __init__
    gaussian_noise=gaussian_noise, plus=plus) for _ in range(nb)]
  File "/content/BasicSR/codes/models/modules/architectures/RRDBNet_arch.py", line 26, in <listcomp>
    gaussian_noise=gaussian_noise, plus=plus) for _ in range(nb)]
  File "/content/BasicSR/codes/models/modules/architectures/RRDBNet_arch.py", line 86, in __init__
    gaussian_noise=gaussian_noise, plus=plus) for _ in range(nr)]
TypeError: 'NoneType' object cannot be interpreted as an integer

opened by keywae 5
Forcing image size to a multiple of 4 even when 'scale' is 1
The first error is that it forces the image size to a multiple of 4 even when the scale is 1.

Secondly, even though it has cropped/expanded the image, it still gives this error, as it does not scale the corresponding LR image.

LOGS:
The image size needs to be a multiple of 4. The loaded image size was (817, 398), so it was adjusted to (816, 400). This adjustment will be done to all images whose sizes are not multiples of 4.
The image size needs to be a multiple of 4. The loaded image size was (476, 485), so it was adjusted to (476, 484). This adjustment will be done to all images whose sizes are not multiples of 4.
Traceback (most recent call last):
  File "train.py", line 417, in <module>
    main()
  File "train.py", line 413, in main
    fit(model, opt, dataloaders, steps_states, data_params, loggers)
  File "train.py", line 215, in fit
    for n, train_data in enumerate(dataloaders['train'], start=1):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 73, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 73, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 400, 816] at entry 0 and [3, 560, 464] at entry 1
opened by mansum6 4
train_ppon.py import errors
I tried train_ppon.py and got an import error on

from models.modules.LPIPS import compute_dists as lpips

I tried commenting it out and the usage of lpips and there seemed to be other import errors.

I've not tried test_ppon.py, but maybe that has errors too.
opened by sean-horton 4

lmdb has no valid image file

I get this error:

  File "C:\ManduScale\Train\codes\train.py", line 500, in <module>
    main()
  File "C:\ManduScale\Train\codes\train.py", line 487, in main
    dataloaders, data_params = get_dataloaders(opt)
  File "C:\ManduScale\Train\codes\train.py", line 134, in get_dataloaders
    dataset = create_dataset(dataset_opt)
  File "C:\ManduScale\Train\codes\data\__init__.py", line 79, in create_dataset
    dataset = D(dataset_opt)
  File "C:\ManduScale\Train\codes\data\aligned_dataset.py", line 41, in __init__
    self.A_paths, self.B_paths = get_dataroots_paths(self.opt, strict=False, keys_ds=self.keys_ds)
  File "C:\ManduScale\Train\codes\data\base_dataset.py", line 235, in get_dataroots_paths
    paths_A, paths_B = read_dataroots(opt, keys_ds=keys_ds)
  File "C:\ManduScale\Train\codes\data\base_dataset.py", line 168, in read_dataroots
    paths_A, paths_B = paired_dataset_validation(A_images_paths, B_images_paths,
  File "C:\ManduScale\Train\codes\data\base_dataset.py", line 99, in paired_dataset_validation
    A_paths = get_image_paths(data_type, paths[0], max_dataset_size)  # get image paths
  File "C:\ManduScale\Train\codes\dataops\common.py", line 82, in get_image_paths
    paths = sorted(_get_paths_from_images(dataroot, max_dataset_size=max_dataset_size))
  File "C:\ManduScale\Train\codes\dataops\common.py", line 43, in _get_paths_from_images
    assert images, '{:s} has no valid image file'.format(path)
AssertionError: C:\ManduScale\OPScale\DataSet\FourthSet\LR1.lmdb has no valid image file

my config is

dataroot_HR: ['C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR',  
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    'C:\ManduScale\OPScale\DataSet\FourthSet\HR', 
    ]
    dataroot_LR: ['C:\ManduScale\OPScale\DataSet\FourthSet\LR1.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR2.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR3.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR4.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR5.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR6.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR7.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR8.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR9.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR10.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR11.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR12.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR13.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR14.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR15.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR16.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR17.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR18.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR19.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR20.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR22.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR23.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR24.lmdb',
    'C:\ManduScale\OPScale\DataSet\FourthSet\LR25.lmdb',
    ]

Can anyone help me fix this issue?

opened by MHketbi 3

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB

On Colab, restarting training results in the following error:

  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/BasicSR/codes/models/modules/architectures/block.py", line 428, in forward
    sampled_noise = self.noise.repeat(*x.size()).normal_() * scale
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.78 GiB total capacity; 14.32 GiB already allocated; 20.75 MiB free; 14.46 GiB reserved in total by PyTorch)

The only way to restart training was to reduce the batch size all the way from 64 to 3.

I've tried running the following commands to no avail: gc.collect() and torch.cuda.empty_cache()

There seems to be a resolution here: https://discuss.pytorch.org/t/how-can-we-release-gpu-memory-cache/14530/27

opened by mansum6 3
PPON Error when moving to Phase 2

I was training a model with PPON (192) + MultiScale + Diffaug, and I received the following error when moving to Phase 2. I have AMP disabled because my GPU doesn't support it. error.log

21-01-27 11:26:52.449 - INFO: Random seed: 0
21-01-27 11:26:52.647 - INFO: Dataset [LRHRDataset - DIV2K] is created.
21-01-27 11:26:52.647 - INFO: Number of train images: 37,933, iters: 2,371
21-01-27 11:26:52.647 - INFO: Total epochs needed: 43 for iters 100,000
21-01-27 11:26:52.648 - INFO: Dataset [LRHRDataset - val_set14_part] is created.
21-01-27 11:26:52.648 - INFO: Number of val images in [val_set14_part]: 1
21-01-27 11:26:52.650 - INFO: AMP library available
21-01-27 11:26:52.827 - INFO: Initialization method [kaiming]
21-01-27 11:26:54.127 - INFO: Initialization method [kaiming]
21-01-27 11:26:54.185 - INFO: Loading pretrained model for G [../experiments/pretrained_models/PPON_G.pth] ...
21-01-27 11:26:55.276 - INFO: Network G structure: DataParallel - PPON, with parameters: 17,267,657
21-01-27 11:26:55.277 - INFO: Network D structure: DataParallel - MultiscaleDiscriminator, with parameters: 8,296,899
21-01-27 11:26:55.277 - INFO: Model [PPONModel] is created.
21-01-27 11:26:55.277 - INFO: Start training from epoch: 0, iter: 0
21-01-27 11:26:55.991 - INFO: Switching to phase: p2, step: 1
Traceback (most recent call last):
  File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 382, in <module>
    main()
  File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 378, in main
    fit(model, opt, dataloaders, steps_states, data_params, loggers)
  File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 221, in fit
    model.optimize_parameters(virtual_step) # calculate loss functions, get gradients, update network weights
  File "/mnt/ext4-storage/Training/BasicSR/codes/models/ppon_model.py", line 199, in optimize_parameters
    l_g_total.backward()
AttributeError: 'float' object has no attribute 'backward'
bug

opened by N0manDemo 3
Huge cleanup, various fixes, improvements to doc strings, some to-do's, etc.

There's honestly too much to list of what was changed here. I probably should have done pull requests in more of a chunk/section rather than an overall blast of cleanup of all kinds of places.

I spent hours upon hours cleaning up files.

I added my fork to Codacy: https://app.codacy.com/gh/rlaPHOENiX/BasicSR/dashboard?branch=master and reduced the number of issues by 13% (from 2087 issues (~41%) to 1812 (32%)).

I didn't just do cleanups either. I did edit some functions, shorten them, remove unnecessary stuff, etc.

I even reduced code duplication in the train files, and moved the train code to a new class Trainer, which inherits from a new base class Runner that I hope to create a Tester class from in the future.

This will probably take a while to fully review, as there were so many general code-style syntax problems, which are now flooding the git diff and making it quite hard to see what I actually changed.

Regardless hopefully it comes in useful.

opened by rlaphoenix 3
Pixel Unshuffle is broken

Training a 1x model with Pixel Unshuffle (using the supplied pretrained model) yields this error:

[Python] RuntimeError: Given groups=1, weight of size [64, 48, 3, 3], expected input[1, 4, 297, 397] to have 48 channels, but got 4 channels instead
[ESRGAN] Upscaling Error: Index was outside the bounds of the array.
  at Cupscale.PreviewMerger.Merge()
  at Cupscale.Main.Upscale.<Run>d__8.MoveNext()
bug

opened by Kim2091 2
Have added basic config to output SR outputs for val sets to Tensorboard

This change will allow users to add the configuration variables to the logger section to define tb_log_generated: true and tb_log_lr: true which will then output the same validation images to tensorboard directly for users.

This was something I added purely for my own benefit when training pixel-based stuff, and it could produce HUGE outputs with loads of validation sets. However, there seemed to be no sane way to let users dictate which validation sets should be output. I thought maybe a sort of image_log_skip variable could let a user specify that they only want to write out, say, 1 in every 50 validation images.

Anyway, feel free to reject. It's something I did for BasicSR, and now that I am doing a little prototype with this newer version I wanted to be able to report all my stuff in TB, so make any alterations you want.

Thanks again for the great library.

opened by grofit 0
Fix bug in functional.erase (you can fix in Cutout)

The slice must be [i:h, j:w, :], not [i:i+h, j:j+w, :], because 'w' already contains 'j': it is calculated as w = mask_size + j in the Cutout transform. functional.erase is used by Cutout in transforms. I fixed erase, but the fix could be made in Cutout instead.
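The effect of the double offset can be seen in a small stand-alone sketch. It uses nested lists instead of image arrays, and the helper names are hypothetical rather than the repository's functional.erase. It also assumes that h is computed like w, that is, as an end coordinate h = mask_size + i.

```python
def erase_region(img, top, left, bottom, right, value=0):
    """Fill rows top:bottom and columns left:right with value (clamped)."""
    out = [row[:] for row in img]
    for r in range(top, min(bottom, len(out))):
        for c in range(left, min(right, len(out[r]))):
            out[r][c] = value
    return out


def count_erased(img, value=0):
    """Count pixels equal to the fill value."""
    return sum(p == value for row in img for p in row)


img = [[1] * 10 for _ in range(10)]
i, j, mask_size = 2, 3, 4
# End coordinates; h = mask_size + i is an assumption mirroring w = mask_size + j.
h, w = i + mask_size, j + mask_size

fixed = erase_region(img, i, j, h, w)          # rows i:h, columns j:w
buggy = erase_region(img, i, j, i + h, j + w)  # rows i:i+h, columns j:j+w

print(count_erased(fixed))  # 16, which is mask_size * mask_size
print(count_erased(buggy))  # 42, because the offsets were added twice
```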

opened by magorokhoov 1
`nearest_aligned` is not aligned
When using nearest_aligned the output is noticeably shifted down and to the right. This affects my models severely and causes noticeable warping in their output.

I used this code in augmentations.py to produce the output images:

if __name__ == '__main__':
    img = cv2.imread('test.png')
    img_A, _ = Scale(img=img, scale=4, algo=997, ds_kernel=None, img_type='cv2')
    cv2.imwrite('output.png', img_A)

original image

Output from nearest_aligned as per the above code:

Output from convert test.png -interpolate Average -filter point -resize 25% magick-nearest.png:

Explicitly sampling the top left corner closely matches the offset from nearest_aligned: convert test.png -define sample:offset=0%x0% -sample 25% magick-sampled-top-left.png
opened by awused 2
Use linear RGB downscaling for most/some downscaling operations.

Didn't include nearest_aligned since it shouldn't matter when there's no blending of pixels going on.

This is a pretty minimal, targeted change, and it's also really poorly tested since I've only run it locally myself without testing a wide range of settings or inputs. I am also certainly missing some places where downscaling can happen, but these seem to be the places used during OTF augmentations for my config and I didn't want to make a very invasive change to an unfamiliar Python codebase.

Probably not ready to merge, but probably usable by people who want to apply it locally.

opened by awused 1

Owner

GitHub

[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c

151 Dec 30, 2022

Image Restoration Toolbox (PyTorch). Training and testing codes for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR

2k Dec 31, 2022

Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

vid2vid Project | YouTube(short) | YouTube(full) | arXiv | Paper(full) Pytorch implementation for high-resolution (e.g., 2048x1024) photorealistic vid

8.1k Jan 1, 2023

SimDeblur is a simple framework for image and video deblurring, implemented by PyTorch

SimDeblur (Simple Deblurring) is an open source framework for image and video deblurring toolbox based on PyTorch, which contains most deep-learning based state-of-the-art deblurring algorithms. It is easy to implement your own image or video deblurring or other restoration algorithms.

220 Jan 7, 2023

A curated list of resources for Image and Video Deblurring

1.7k Jan 1, 2023

EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation

EFENet EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation Code is a bit messy now. I woud clean up soon. For training the EF

6 Oct 20, 2021

The official pytorch implemention of the CVPR paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution".

This is the official PyTorch implementation of TMNet in the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time VideoSuper-Resolu

95 Oct 24, 2022

A PyTorch Reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution

TecoGAN-PyTorch Introduction This is a PyTorch reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution (VSR). Please refer to

165 Dec 17, 2022

PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR)

This is a PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR), using subpixel convolution to optimize the inference speed of TecoGAN VSR model. Please refer to the official implementation ESPCN and TecoGAN for more information.

789 Jan 4, 2023

MMDetection3D is an open source object detection toolbox based on PyTorch

MMDetection3D is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the OpenMMLab project developed by MMLab.

3.2k Jan 5, 2023

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models. Developers can reproduce these SOTA methods and build their own methods.

405 Jan 4, 2023

MMFlow is an open source optical flow toolbox based on PyTorch

An open source object detection toolbox based on PyTorch

mmfewshot is an open source few shot learning toolbox based on PyTorch

Mmdetection3d Noted - MMDetection3D is an open source object detection toolbox based on PyTorch

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

Fast and Context-Aware Framework for Space-Time Video Super-Resolution (VCIP 2021)