Learned image compression

Overview

PyTorch code for our recent work A Unified End-to-End Framework for Efficient Deep Image Compression.

We first release the code for Variational Image Compression with a Scale Hyperprior; we will later update it to the full implementation of our paper.

Content

Prerequisites

Install the libraries this repo depends on:

pip install -r requirements.txt

Data Preparation

We first need to prepare the training and validation data. The training data is from flickr.com; you can obtain it by following the description in CompressionData.

The validation data is the popular Kodak dataset:

bash data/download_kodak.sh
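
For reference, below is a minimal sketch of a training dataset that loads random 256×256 crops; the class name, file pattern, and crop size are assumptions for illustration, and the repo's own Datasets class is the reference implementation.

    import glob
    import os
    from PIL import Image
    from torch.utils.data import Dataset
    from torchvision import transforms

    class FlickrCrops(Dataset):
        """Hypothetical minimal training set: random 256x256 crops of Flickr images."""
        def __init__(self, root, image_size=256):
            self.paths = sorted(glob.glob(os.path.join(root, "*.png")))
            self.transform = transforms.Compose([
                transforms.RandomCrop(image_size, pad_if_needed=True),
                transforms.ToTensor(),  # pixels scaled to [0, 1]
            ])

        def __len__(self):
            return len(self.paths)

        def __getitem__(self, idx):
            return self.transform(Image.open(self.paths[idx]).convert("RGB"))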

Training

For high bit rates (train_lambda = 4096, 6144, 8192), out_channel_N is 192 and out_channel_M is 320 in 'config_high.json'. For low bit rates (train_lambda = 256, 512, 1024, 2048), out_channel_N is 128 and out_channel_M is 192 in 'config_low.json'.
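
For reference, a sketch of what the 8192 entry of 'config_high.json' might contain. Only these three fields are taken from this README and the issue reports, so the real file may contain additional keys:

    {
        "train_lambda": 8192,
        "out_channel_N": 192,
        "out_channel_M": 320
    }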

Details

PSNR experiments

For the high bit rate of 8192, we first train from scratch as follows:

CUDA_VISIBLE_DEVICES=0 python train.py --config examples/example/config_high.json -n baseline_8192 --train flicker_path --val kodak_path

For the other high bit rates (4096, 6144), we use the converged 8192 model as the pretrained model and set the learning rate to 1e-5. The number of training iterations is set to 500,000. A sketch of this warm start is shown below.
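
A minimal sketch of that warm start, assuming the repo's ImageCompressor from model.py and a hypothetical checkpoint path (train.py may wire this up differently):

    import torch
    from model import ImageCompressor  # the compressor class defined in model.py

    # Hypothetical warm start: load the converged 8192 model, then fine-tune at a lower LR.
    model = ImageCompressor(out_channel_N=192, out_channel_M=320).cuda()
    model.load_state_dict(torch.load("checkpoints/baseline_8192.pth"))  # hypothetical path
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)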

The low-bit-rate (256, 512, 1024, 2048) training process follows the same strategy. In all cases, train_lambda is the rate-distortion trade-off weight: the loss has the form train_lambda · D + bpp, so larger values yield higher bit rates, as sketched below.
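
A minimal sketch of that objective with an MSE distortion term; the exact loss normalization used in train.py is an assumption, and total_bits stands for the bit count estimated by the entropy model:

    import torch

    def rate_distortion_loss(x, x_hat, total_bits, train_lambda):
        """L = train_lambda * MSE + bpp; larger train_lambda -> higher bit rate."""
        mse = torch.mean((x_hat - x) ** 2)              # distortion on [0, 1] images
        num_pixels = x.size(0) * x.size(2) * x.size(3)  # batch * height * width
        bpp = total_bits / num_pixels                   # estimated bits per pixel
        return train_lambda * mse + bpp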

MS-SSIM experiments

You should change the distortion loss to (1 - MS_SSIM) and fine-tune the pretrained model optimized for PSNR to accelerate the training process. You can find more details in our released paper. The training strategy is otherwise similar.
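
A minimal sketch of that distortion swap, assuming the third-party pytorch-msssim package (pip install pytorch-msssim); the repo may compute MS-SSIM differently:

    import torch
    from pytorch_msssim import ms_ssim  # assumed third-party package

    def msssim_rd_loss(x, x_hat, total_bits, train_lambda):
        """Replace the MSE distortion with (1 - MS-SSIM)."""
        distortion = 1.0 - ms_ssim(x_hat, x, data_range=1.0)  # images in [0, 1]
        num_pixels = x.size(0) * x.size(2) * x.size(3)
        bpp = total_bits / num_pixels
        return train_lambda * distortion + bpp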

If you find our code helpful for your research, please cite our paper. Note that this code is for research purposes only.

@article{liu2020unified,
  title={A Unified End-to-End Framework for Efficient Deep Image Compression},
  author={Liu, Jiaheng and Lu, Guo and Hu, Zhihao and Xu, Dong},
  journal={arXiv preprint arXiv:2002.03370},
  year={2020}
}
Comments
• Training-data question: Decompressed Data Too Large

Hello, author! Your code has been very helpful to me, but your training data is really large, and downloading from Google Drive is too slow for users in China. So I uncommented the line transforms.RandomResizedCrop(self.image_size) in your dataset and used a 256 crop. However, when training on 121,326 images, after about 1,000 iterations (still in epoch 0), Pillow raised ValueError: Decompressed Data Too Large. After switching to a training set of 856 images, the problem no longer occurred.

    opened by SailVR 5
• Implementation Related Question

    Hello

Just wanted to know whether this implementation is a simple autoencoder-based one or not, since compressed_feature_renorm is directly fed into the encoder and we are not calculating any mean or standard deviation here.

    opened by salmanali96 3
• Trained Model

    Dear Jiaheng Liu,

I keep trying to reproduce the results, but the number of iterations is too big, so it takes a long time on a moderate GPU. Could you please share the trained model with me?

    Kind Regards, Khawar

    opened by khawar-islam 3
• Hello. Thanks to your contribution, I am getting a lot of help in studying.

    Hello. I'm getting a lot of help in studying.

    First of all, thank you very much.

I'm writing this because there's one thing I'm confused about.

    In 'compression/model.py', there is a function, feature_probs_based_sigma.

        def feature_probs_based_sigma(feature, sigma):
            mu = torch.zeros_like(sigma)
            sigma = sigma.clamp(1e-10, 1e10)
            gaussian = torch.distributions.laplace.Laplace(mu, sigma)
            probs = gaussian.cdf(feature + 0.5) - gaussian.cdf(feature - 0.5)
            total_bits = torch.sum(torch.clamp(-1.0 * torch.log(probs + 1e-10) / math.log(2.0), 0, 50))
            return total_bits, probs
    

There, 'gaussian' is defined via 'torch.distributions.laplace.Laplace'. I don't know why you didn't define 'gaussian' with 'torch.distributions.normal.Normal'. As far as I understand, the author's TensorFlow implementation uses a normal distribution.

    If you don't mind, could you answer me?

    Thank you again for your contribution.

    opened by herok97 2
• Use of exponential in Synthesis_prior_net

Your Synthesis_prior_net class's forward is as follows:

        def forward(self, x):
            x = self.relu1(self.deconv1(x))
            x = self.relu2(self.deconv2(x))
            return torch.exp(self.deconv3(x))
    

Why did you use the exponential function instead of ReLU? That does not seem to be mentioned in Ballé's paper or yours.

    opened by trougnouf 2
• Is the entropy model not implemented?

Hello, I noticed that your code is a PyTorch implementation of Ballé 2018's work, but it doesn't include the entropy-model code or the code for using z in AE/AD; in Ballé's work this is done in TensorFlow. Have you finished this part? Thanks for answering.

    opened by ywz978020607 2
• Poor results with MS-SSIM optimization

    Do you have any advice regarding MS-SSIM optimization? I get consistently poor results when optimizing directly for the MS-SSIM. Do you have to train longer, mix with other losses, or switch to MS-SSIM later in training in order to get the results described?

    opened by trougnouf 2
• Questions about the Laplace distribution

Hi Jiaheng, I see you use a Laplace distribution instead of the Gaussian in the original ICLR 2018 paper. Is it because Laplace is better than Gaussian for univariate distribution estimation? And is that why you use a GMM in your full model, which works better than Laplace?

    opened by yaoqi-zd 2
• possible full CPU consumption and a fix

Hi, thanks so much for sharing your code! I found that directly running train.py may cause excessive CPU consumption (~5000% on my machine), so I made some modifications to the original code to reduce the CPU burden (down to ~100% on my machine).

1. In train.py, set pin_memory=False:
    train_dataset = Datasets(train_data_dir, image_size)
    train_loader = DataLoader(dataset=train_dataset,
                                  batch_size=batch_size,
                                  shuffle=True,
                                  pin_memory=False, 
                                  num_workers=2)
    
2. In model.py, pre-allocate the GPU memory for the quant_noise data:
    class ImageCompressor(nn.Module):
        def __init__(self, out_channel_N=192, out_channel_M=320):
            super(ImageCompressor, self).__init__()
            self.Encoder = Analysis_net(out_channel_N=out_channel_N, out_channel_M=out_channel_M)
            self.Decoder = Synthesis_net(out_channel_N=out_channel_N, out_channel_M=out_channel_M)
            self.priorEncoder = Analysis_prior_net(out_channel_N=out_channel_N, out_channel_M=out_channel_M)
            self.priorDecoder = Synthesis_prior_net(out_channel_N=out_channel_N, out_channel_M=out_channel_M)
            self.bitEstimator_z = BitEstimator(out_channel_N)
            self.out_channel_N = out_channel_N
            self.out_channel_M = out_channel_M
          
            self.quant_noise_feature = torch.zeros(4, self.out_channel_M, 256 // 16, 256 // 16).cuda()
            self.quant_noise_z = torch.zeros(4, self.out_channel_N, 256 // 64, 256 // 64).cuda()
            self.quant_noise_feature = torch.nn.init.uniform_(torch.zeros_like(self.quant_noise_feature), -0.5, 0.5)
            self.quant_noise_z = torch.nn.init.uniform_(torch.zeros_like(self.quant_noise_z), -0.5, 0.5)
    
        def forward(self, input_image):            
            feature = self.Encoder(input_image)
            batch_size = feature.size()[0]
    
            z = self.priorEncoder(feature)
    
    opened by yaoqi-zd 2
• setting bit-per-pixel

Where can we set the bits per pixel? For example, in your article (Fig. 9), how did you set different bits per pixel? Should we train the model separately for different bit rates?

    opened by Alihjt 1
• Could you provide the pretrained model?

Hi, thanks a lot for your reimplementation of "Variational image compression with a scale hyperprior"; it really helps. I wonder if you could provide the pretrained model as well, because I really need it to start my training. Thanks again.

    opened by Qi-X 1
• about the bpp/psnr results

Hi, thanks for your great work. I tried to reproduce the paper's results. I used your code directly ("train_lambda": 8192, "out_channel_N": 192, "out_channel_M": 320) and trained the model with the provided dataset from Flickr. However, the validation result on Kodak seems bad: the MSE loss is about 1.2e-4 and the estimated bpp is around 1.8 (the model has already been trained for at least 100 epochs). I do not know what is wrong with my model. Is there anything I missed, such as data augmentation?

    opened by kv123654 0
• bit sequence for the features

Hi Jiaheng,

    Thanks a lot for the repo, it really helps a lot.

One thing I am interested in is generating the real bit sequence for the latent features and the latent variable z. I read the code in this repo, and it seems you only use bitestimator.py to estimate the entropy but do not generate the bit sequence. If people consider the transmission of images, the bit sequence is important... Did you implement that before, or are there any hints for it? Thanks a lot in advance!

    Best, Chenghong

    opened by aprilbian 0