Towards End-to-End Image Compression and Analysis with Transformers
Source code of our AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".
Usage
The code is run with Python 3.7, PyTorch 1.8.1, timm 0.4.9 and CompressAI 1.1.4.
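To confirm that an environment matches the versions listed above, a small check along the following lines may help. It is only a sketch: the helper names `installed_version` and `find_mismatches` are hypothetical and not part of this repository, and the PyPI distribution names `torch`, `timm` and `compressai` are assumed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is installed.
    import importlib_metadata as metadata

# Versions listed above, keyed by (assumed) PyPI distribution name.
EXPECTED = {"torch": "1.8.1", "timm": "0.4.9", "compressai": "1.1.4"}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs from `expected` to what was found."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # PyTorch builds often carry a local suffix such as "1.8.1+cu111".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


for name, found in find_mismatches(EXPECTED).items():
    print(f"{name}: expected {EXPECTED[name]}, found {found}")
```

Nothing is printed when all three packages match.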
Data preparation
Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for torchvision's datasets.ImageFolder, with the training and validation data expected in the train folder and val folder respectively:
/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
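Before launching a long run, it can be useful to check that the dataset folder really follows this layout. The following is a minimal sketch using only the standard library; the function names `summarize_split` and `check_imagenet_layout` are hypothetical and not part of this repository. It only counts files directly inside each class folder, which is a simplification of what datasets.ImageFolder accepts.

```python
from pathlib import Path


def summarize_split(split_dir):
    """Count the files in each class folder of one split (train/ or val/)."""
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split folder: {split_dir}")
    counts = {}
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


def check_imagenet_layout(root):
    """Return {split: {class_name: file_count}} for the train and val splits."""
    root = Path(root)
    return {split: summarize_split(root / split) for split in ("train", "val")}
```

For example, `check_imagenet_layout("/path/to/imagenet/")` raises `FileNotFoundError` if either split folder is absent, and otherwise returns the per-class file counts so that empty class folders are easy to spot.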
Pretrained model
The ./pretrained_model directory provides the pretrained model without compression.
- Test
Please adjust --data-path and run sh test.sh:
python main.py --eval --resume ./pretrain_s/checkpoint.pth --model pretrained_model --data-path /path/to/imagenet/ --output_dir ./eval
The ./pretrain_s/checkpoint.pth checkpoint can be downloaded from Baidu Netdisk, with access code aaai.
- Train
Please adjust --data-path and run sh train.sh:
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model pretrained_model --no-model-ema --clip-grad 1.0 --batch-size 128 --num_workers 16 --data-path /path/to/imagenet/ --output_dir ./ckp_pretrain
Full model
The ./full_model directory provides the full model with compression.
- Test
Please adjust --data-path and --resume, then run sh test.sh:
python main.py --eval --resume ./ckp_s_q1/checkpoint.pth --model full_model --no-pretrained --data-path /path/to/imagenet/ --output_dir ./eval
The ./ckp_s_q1/checkpoint.pth, ./ckp_s_q2/checkpoint.pth and ./ckp_s_q3/checkpoint.pth checkpoints can be downloaded from Baidu Netdisk, with access code aaai.
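Since three checkpoints are listed, one evaluation command per checkpoint is needed. As a rough sketch, the commands can be generated as follows; `build_eval_command` is a hypothetical helper, not part of this repository. It uses only the flags shown in the test command above and swaps the --resume path. The test command above does not pass --quality, so the sketch does not either; whether the other checkpoints need it is not stated here.

```python
import shlex


def build_eval_command(quality, data_path="/path/to/imagenet/", output_dir="./eval"):
    """Return the evaluation command for one checkpoint as an argument list."""
    if quality not in (1, 2, 3):
        raise ValueError("checkpoints are listed for quality 1, 2 and 3 only")
    return [
        "python", "main.py", "--eval",
        "--resume", f"./ckp_s_q{quality}/checkpoint.pth",
        "--model", "full_model", "--no-pretrained",
        "--data-path", data_path,
        "--output_dir", output_dir,
    ]


for q in (1, 2, 3):
    # shlex.quote keeps paths containing spaces safe when pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in build_eval_command(q)))
```

The argument list can also be passed directly to `subprocess.run` from the ./full_model directory instead of being printed.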
- Train
Please download ./pretrain_s/checkpoint.pth from Baidu Netdisk with access code aaai, then adjust --data-path and --quality. The quality levels and their corresponding alpha and beta values are:
quality | alpha | beta |
---|---|---|
1 | 0.1 | 0.001 |
2 | 0.3 | 0.003 |
3 | 0.6 | 0.006 |
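The table can be read as a lookup from the --quality value to a pair of weights. The sketch below shows one way such a lookup might be written; `QualityWeights`, `QUALITY_TABLE` and `weights_for_quality` are hypothetical names, and how the repository actually consumes alpha and beta is not described here.

```python
from typing import NamedTuple


class QualityWeights(NamedTuple):
    alpha: float
    beta: float


# Values copied from the table above.
QUALITY_TABLE = {
    1: QualityWeights(alpha=0.1, beta=0.001),
    2: QualityWeights(alpha=0.3, beta=0.003),
    3: QualityWeights(alpha=0.6, beta=0.006),
}


def weights_for_quality(quality):
    """Return the (alpha, beta) pair for a quality level, or raise ValueError."""
    try:
        return QUALITY_TABLE[quality]
    except KeyError:
        valid = sorted(QUALITY_TABLE)
        raise ValueError(f"quality must be one of {valid}, got {quality!r}") from None
```

In every row of the table, beta is one hundredth of alpha.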
Run sh train.sh:
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model full_model --batch-size 128 --num_workers 16 --clip-grad 1.0 --quality 1 --data-path /path/to/imagenet/ --output_dir ./ckp_full
Citation
@InProceedings{Bai2022AAAI,
title={Towards End-to-End Image Compression and Analysis with Transformers},
author={Bai, Yuanchao and Yang, Xu and Liu, Xianming and Jiang, Junjun and Wang, Yaowei and Ji, Xiangyang and Gao, Wen},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2022}
}