Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

Overview

You Only Cut Once (YOCO)

YOCO is a simple method/strategy of performing augmentations, which enjoys the properties of parameter-free, easy usage, and boosting almost all augmentations for free (negligible computation & memory cost). We hope our study will attract the community’s attention in revisiting how to perform data augmentations.

You Only Cut Once: Boosting Data Augmentation with a Single Cut
Junlin Han, Pengfei Fang, Weihao Li, Jie Hong, Ali Armin, Ian Reid, Lars Petersson, Hongdong Li
DATA61-CSIRO and Australian National University and University of Adelaide
Preprint

@inproceedings{han2022yoco,
  title={You Only Cut Once: Boosting Data Augmentation with a Single Cut},
  author={Junlin Han and Pengfei Fang and Weihao Li and Jie Hong and Mohammad Ali Armin and and Ian Reid and Lars Petersson and Hongdong Li},
  booktitle={arXiv preprint arXiv:2201.12078},
  year={2022}
}

YOCO cuts one image into two equal pieces, either in the height or the width dimension. The same data augmentations are performed independently within each piece. Augmented pieces are then concatenated together to form one single augmented image.  

Results

Overall, YOCO benefits almost all augmentations in multiple vision tasks (classification, contrastive learning, object detection, instance segmentation, image deraining, image super-resolution). Please see our paper for more.

Easy usages

Applying YOCO is quite easy, here is a demo code of performing YOCO at the batch level.

***
images: images to be augmented, here is tensor with (b,c,h,w) shape
aug: composed augmentation operations
h: height of images
w: width of images
***

def YOCO(images, aug, h, w):
    images = torch.cat((aug(images[:, :, :, 0:int(w/2)]), aug(images[:, :, :, int(w/2):w])), dim=3) if \
    torch.rand(1) > 0.5 else torch.cat((aug(images[:, :, 0:int(h/2), :]), aug(images[:, :, int(h/2):h, :])), dim=2)
    return images
    
for i, (images, target) in enumerate(train_loader):    
    aug = torch.nn.Sequential(
      transforms.RandomHorizontalFlip(), )
    _, _, h, w = images.shape
    # perform augmentations with YOCO
    images = YOCO(images, aug, h, w) 

Prerequisites

This repo aims to be minimal modifications on official PyTorch ImageNet training code and MoCo. Following their instructions to install the environments and prepare the datasets.

timm is also required for ImageNet classification, simply run

pip install timm

Images augmented with YOCO

For each quadruplet, we show the original input image, augmented image from image-level augmentation, and two images from different cut dimensions produced by YOCO.

Contact

[email protected] or [email protected]

If you tried YOCO in other tasks/datasets/augmentations, please feel free to let me know the results. They will be collected and presented in this repo, regardless of positive or negative. Many thanks!

Acknowledgments

Our code is developed based on official PyTorch ImageNet training code and MoCo.

You might also like...
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

A Sklearn-like Framework for Hyperparameter Tuning and AutoML in Deep Learning projects. Finally have the right abstractions and design patterns to properly do AutoML. Let your pipeline steps have hyperparameter spaces. Enable checkpoints to cut duplicate calculations. Go from research to production environment easily. This is the implementation of our work Deep Extreme Cut (DEXTR), for object segmentation from extreme points.
This is the implementation of our work Deep Extreme Cut (DEXTR), for object segmentation from extreme points.

This is the implementation of our work Deep Extreme Cut (DEXTR), for object segmentation from extreme points.

Team nan solution repository for FPT data-centric competition. Data augmentation, Albumentation, Mosaic, Visualization, KNN application
Team nan solution repository for FPT data-centric competition. Data augmentation, Albumentation, Mosaic, Visualization, KNN application

FPT_data_centric_competition - Team nan solution repository for FPT data-centric competition. Data augmentation, Albumentation, Mosaic, Visualization, KNN application

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

 Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging
Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging This repository contains an implementation

Comments
  • YOCO generates a single image when we have single image

    YOCO generates a single image when we have single image

    Hi @JunlinHan

    I have a dataset contains single image and I am simply applying your YOCO technique to visualize image generated by YOCO. I just get a single output sometimes the output is same image as input and sometimes flip+cut. Is that correct?

    Code

    import torch
    from torchvision.utils import save_image
    import torchvision.transforms as transforms
    from torchvision import datasets
    from torchvision.transforms import ToTensor
    from torch.utils.data import DataLoader
    
    training_data = datasets.ImageFolder(root="/media/cvpr/CM_22/dataset/train", transform=ToTensor())
    train_loader = DataLoader(training_data, batch_size=64, shuffle=True)
    
    
    def YOCO(images, aug, h, w):
        images = torch.cat((aug(images[:, :, :, 0:int(w / 2)]), aug(images[:, :, :, int(w / 2):w])), dim=3) if \
            torch.rand(1) > 0.5 else torch.cat((aug(images[:, :, 0:int(h / 2), :]), aug(images[:, :, int(h / 2):h, :])),
                                               dim=2)
        return images
    
    
    for i, (images, target) in enumerate(train_loader):
        aug = torch.nn.Sequential(
            transforms.RandomHorizontalFlip(), )
        _, _, h, w = images.shape
        # perform augmentations with YOCO
        images = YOCO(images, aug, h, w)
        save_image(images, 'img' + str(i) + '.png')
    
    

    Input image: aeroplane_+_tiger_google_0017

    Output Image img0

    opened by khawar-islam 4
  • Did you split the training dataset into train and val?

    Did you split the training dataset into train and val?

    Hi, thanks for your inspiring work! I got a question here. Have you split the training dataset into train and val? From the code you shared, it seems that you directly used the test dataset for validation by which the model with the best performance is picked: val_loader = torch.utils.data.DataLoader( datasets.CIFAR10('../data', train=False, transform=transform_test), batch_size=args.batch_size, shuffle=True, num_workers=args.workers, pin_memory=True) I am not sure if it is ok and I am really confused, because I notice some researchers make the split while some others do not.

    opened by Fivethousand5k 4
  • Calculating accuracy on CIFAR-10 multiple times or Single time

    Calculating accuracy on CIFAR-10 multiple times or Single time

    Dear @JunlinHan,

    Did you run all experiments multiple times than average and put them into the table? Like training resnet18 fives times and then averaging the accuracy and putting it into the table. Why I am asking this question because every time when I am training resnet18 I got different accuracy and error rate (almost 1% fluctuation).

    Regards, Khawar

    opened by khawar-islam 1
  • Missing Test top-1 error rate

    Missing Test top-1 error rate

    Dear @JunlinHan

    Thank you for your work. I am working on your paper and reproducing the results given in the paper. I have trained a model based on your configuration for cifar-10 but I cannot find an error-rate file. How we can evaluate the top-1 error-rate? Like this

    print('Test\t  Prec@1: {top1.avg:.3f} (Err: {error:.3f} )\n'
              .format(top1=top1, error=100 - top1.avg))
    
    opened by khawar-islam 1
Owner
ANU/CSIRO/AIML/U Adelaide. Working on vision/graphics. Email: [email protected]
null
An integration of several popular automatic augmentation methods, including OHL (Online Hyper-Parameter Learning for Auto-Augmentation Strategy) and AWS (Improving Auto Augment via Augmentation Wise Weight Sharing) by Sensetime Research.

An integration of several popular automatic augmentation methods, including OHL (Online Hyper-Parameter Learning for Auto-Augmentation Strategy) and AWS (Improving Auto Augment via Augmentation Wise Weight Sharing) by Sensetime Research.

null 45 Dec 8, 2022
The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient.

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient (paper) @misc{zhang2021compress,

null 46 Dec 7, 2022
PyTorch implementation of the YOLO (You Only Look Once) v2

PyTorch implementation of the YOLO (You Only Look Once) v2 The YOLOv2 is one of the most popular one-stage object detector. This project adopts PyTorc

申瑞珉 (Ruimin Shen) 433 Nov 24, 2022
You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors

You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors In this paper, we propose a novel local descriptor-based fra

Haiping Wang 80 Dec 15, 2022
The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

Dinghan Shen 49 Dec 22, 2022
Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop) (Pronounced as "strog") Paper Arxiv Why it matters? Scene Text Recognition (STR) req

Rowel Atienza 152 Dec 28, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.6k Dec 31, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 20.6k Feb 13, 2021
Augmentation for Single-Image-Super-Resolution

SRAugmentation Augmentation for Single-Image-Super-Resolution Implimentation CutBlur Cutout CutMix Cutup CutMixup Blend RGBPermutation Identity OneOf

Yubo 6 Jun 27, 2022
Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification We provide the codes for repr

null 12 Dec 12, 2022