[IJCAI'21] Deep Automatic Natural Image Matting

Overview

This is the official repository of the paper Deep Automatic Natural Image Matting.

Introduction | Network | AIM-500 | Results | Statement


πŸ“† News

The training code, inference code and the pretrained models will be released soon.

[2021-07-16]: Published the validation dataset AIM-500. Please follow the readme.txt for details.

Introduction

Unlike previous methods, which focus only on images with salient opaque foregrounds such as humans and animals, in this paper we investigate the difficulties that arise when extending automatic matting methods to natural images with salient transparent/meticulous foregrounds or non-salient foregrounds.

To address the problem, we propose a novel end-to-end matting network, which can predict a generalized trimap for any image of the above types as a unified semantic representation. Simultaneously, the learned semantic features guide the matting network to focus on the transition areas via an attention mechanism.

We also construct a test set, AIM-500, containing 500 diverse natural images that cover all of these types along with manually labeled alpha mattes, making it feasible to benchmark the generalization ability of AIM models. Experimental results demonstrate that our network, trained on available composite matting datasets, outperforms existing methods both objectively and subjectively.

Network

Our proposed method consists of:

  • Improved Backbone for Matting: an advanced max-pooling version of ResNet-34, pretrained on ImageNet, which serves as the backbone of the matting network;

  • Unified Semantic Representation: a type-wise semantic representation to replace the traditional trimaps;

  • Guided Matting Process: an attention-based mechanism that guides the matting process by leveraging the learned semantic features from the semantic decoder to focus on extracting details only within the transition area (see the sketch below).
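As a rough illustration of the guided matting process, below is a minimal PyTorch sketch (module and parameter names are ours, not from the released code) in which the semantic decoder's features are collapsed into a spatial attention map that gates the matting decoder's features toward the transition area:

    import torch
    import torch.nn as nn

    class GuidedMattingFusion(nn.Module):
        # Hypothetical sketch: collapse semantic features into a per-pixel
        # attention map and use it to gate the matting features, so that
        # detail extraction concentrates on the transition area.
        def __init__(self, semantic_channels):
            super().__init__()
            self.attention = nn.Sequential(
                nn.Conv2d(semantic_channels, 1, kernel_size=1),
                nn.Sigmoid(),  # per-pixel weights in [0, 1]
            )

        def forward(self, semantic_feat, matting_feat):
            # both feature maps are assumed to share spatial size (H, W)
            attn = self.attention(semantic_feat)  # (B, 1, H, W)
            return matting_feat * attn            # broadcast over channels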

The backbone pretrained on ImageNet and the model pretrained on the synthetic matting dataset will be released soon.

Pretrained backbone: coming soon
Pretrained model: coming soon

AIM-500

We propose AIM-500 (Automatic Image Matting-500), the first natural image matting test set. It contains 500 high-resolution real-world natural images covering all three foreground types (SO: salient opaque, STM: salient transparent/meticulous, NS: non-salient) and many categories, along with manually labeled alpha mattes. Some examples and the number of images in each category are shown below. The AIM-500 dataset is now published and can be downloaded directly from this link. Please follow the readme.txt for more details.

Portrait | Animal | Transparent | Plant | Furniture | Toy | Fruit
100 | 200 | 34 | 75 | 45 | 36 | 10
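As a minimal sketch of iterating over the test set (the folder names original/ and mask/ below are our assumption; the authoritative layout is described in readme.txt):

    import os
    import cv2

    root = "AIM-500"  # unpacked dataset root (assumed)
    for name in sorted(os.listdir(os.path.join(root, "original"))):
        image = cv2.imread(os.path.join(root, "original", name))
        # alpha mattes assumed to be single-channel PNGs with matching names
        alpha_path = os.path.join(root, "mask", os.path.splitext(name)[0] + ".png")
        alpha = cv2.imread(alpha_path, cv2.IMREAD_GRAYSCALE)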

Results

We test our network on the different types of images in AIM-500 and compare it with previous SOTA methods; the results are shown below.
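The comparisons use the standard matting error metrics (SAD, MSE and MAD over the whole image and the transition region, plus connectivity and gradient errors). For reference, here is a minimal NumPy sketch of the first three, assuming alpha mattes scaled to [0, 1] and the common convention of reporting SAD in thousands:

    import numpy as np

    def matting_errors(pred, gt):
        # pred, gt: alpha mattes as float arrays in [0, 1]
        diff = np.abs(pred.astype(np.float64) - gt.astype(np.float64))
        sad = diff.sum() / 1000.0  # sum of absolute differences (x1e-3)
        mse = (diff ** 2).mean()   # mean squared error
        mad = diff.mean()          # mean absolute difference
        return sad, mse, mad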

Statement

If you are interested in our work, please consider citing the following:

@inproceedings{ijcai2021-danim,
  title     = {Deep Automatic Natural Image Matting},
  author    = {Li, Jizhizi and Zhang, Jing and Tao, Dacheng},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  year      = {2021},
}

This project is under the MIT license. For further questions, please contact [email protected].

Relevant Projects

End-to-end Animal Image Matting
Jizhizi Li, Jing Zhang, Stephen J. Maybank, Dacheng Tao

Comments
  • Some test results on images in the wild

    First: thanks for this amazing repository and research and for sharing it.

    So I tested some images with the code/pre-trained model you released. I share some results below. I got some very good results and some more disappointing ones (I am unreasonably demanding, sorry for that).

    One question: I get somewhat different results on the 3 default sample images from the repo (i.e. when running inference on 1.png, 2.png and 3.png). Is that due to differences between the model trained for the paper and the newly trained model?

    Another question: do you know why sometimes there is 100% background detection, e.g. with the last two images below?

    I am on CUDA 10.2 and PyTorch 1.9, FYI. Thanks!

    [five result screenshots attached]

    opened by Tetsujinfr 9
  • Impossible to obtain the same results with the pretrained model

    Hello,

    Thanks for the great work, I believe your research is really interesting!

    I'd like to report issues obtaining the results displayed on the README.

    First, when running in the same environment as described in the README (with torch=1.4.0), I run into the following exception:

    Traceback (most recent call last):
      File "core/test.py", line 321, in <module>
        load_model_and_deploy(args)
      File "core/test.py", line 298, in load_model_and_deploy
        ckpt = torch.load(args.model_path,map_location=torch.device('cpu'))
      File "/home/malrick/anaconda3/envs/aim-env/lib/python3.6/site-packages/torch/serialization.py", line 527, in load
        with _open_zipfile_reader(f) as opened_zipfile:
      File "/home/malrick/anaconda3/envs/aim-env/lib/python3.6/site-packages/torch/serialization.py", line 224, in __init__
        super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
    RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /opt/conda/conda-bld/pytorch_1579022034529/work/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /opt/conda/conda-bld/pytorch_1579022034529/work/caffe2/serialize/inline_container.cc:132)
    

    So it seems that the pre-trained model can't be read with such an old version of PyTorch. The export of the conda environment is available here: aim-env.txt (just replace .txt extension with .yml, GitHub doesn't support hosting yml files)
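    For reference, checkpoints saved with PyTorch >= 1.6 use a zipfile-based format that PyTorch 1.4 cannot read. One possible workaround, assuming access to a newer PyTorch, is to re-save the checkpoint in the legacy format:

        # run once with PyTorch >= 1.6 to convert the checkpoint to the
        # legacy format readable by older versions such as 1.4
        import torch
        ckpt = torch.load("model.pth", map_location="cpu")  # path is a placeholder
        torch.save(ckpt, "model_legacy.pth", _use_new_zipfile_serialization=False)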


    I then ran the script with the same environment that @JizhiziLi described in #7 (PyTorch 1.7.1, and python 3.7.7). Conda export is available here: aim-env-2.txt

    I have set the parameter test_choice=Hybrid in core/scripts/test_samples.sh, and the ratios global_ratio=1/4 and local_ratio=1/2 in core/test.py.

    Also, as my RTX3090 is not supported by these older versions of PyTorch, I used CPU for inference.

    To do so, I had to adjust certain parts of the code, first at line 292:

     	if torch.cuda.device_count()==0:
    		print(f'Running on CPU...')
    		args.cuda = False
    		ckpt = torch.load(args.model_path,map_location=torch.device('cpu'))
    	else:
    		print(f'Running on GPU with CUDA as {args.cuda}...')
    		ckpt = torch.load(args.model_path)
    

    to:

    
    	if not args.cuda or torch.cuda.device_count()==0:
    		print(f'Running on CPU...')
    		args.cuda = False
    		ckpt = torch.load(args.model_path,map_location=torch.device('cpu'))
    	else:
    		print(f'Running on GPU with CUDA as {args.cuda}...')
    		ckpt = torch.load(args.model_path)
    

    and line 42:

    	tensor_img = torch.from_numpy(scale_img.astype(np.float32)[:, :, :]).permute(2, 0, 1).cuda()
    

    to

    	tensor_img = torch.from_numpy(scale_img.astype(np.float32)[:, :, :]).permute(2, 0, 1)
    	if args.cuda:
    		tensor_img = tensor_img.cuda()
    

    Removing the --cuda parameter in scripts/test_samples.sh then allowed me to run inference on CPU.

    The results I'm obtaining are the same as in issue #7.

    [output mattes for 1.png, 2.png and 3.png attached]

    I also ran inference similarly on the AIM-500 dataset, and I obtained the following results in logs/test_logs/DEBUG.log:

    INFO:root:Testing numbers: 500
    INFO:root:SAD: 70.16285879981845
    INFO:root:MSE: 0.033924558738880374
    INFO:root:MAD: 0.041689636977158
    INFO:root:SAD TRIMAP: 48.64482158505621
    INFO:root:MSE TRIMAP: 0.09556289661012443
    INFO:root:MAD TRIMAP: 0.13346166896791076
    INFO:root:SAD FG: 17.866437117307104
    INFO:root:SAD BG: 3.651600097455164
    INFO:root:CONN: 67.30147147906628
    INFO:root:GRAD: 58.72859108352662

    The artifacts in the outputs are really surprising, and they seem to be introduced at quite a low resolution.

    Hope this helps, and that we can find a way to get this great repository to work!

    opened by Malrick 3
  • Training code

    Thank you for the great new model! It produces really cool results.

    It would be very interesting to play with the model and train it on my own datasets.

    Could you please give some estimate of when you will release the training code, so I don't refresh the page several times a day :)

    opened by baleksey 3
  • Number of iterations for each epochs

    How many iterations does it run in each epoch? From this line: https://github.com/JizhiziLi/AIM/blob/master/core/train.py#L81, it looks like it does only 4 iterations per epoch. Does this approach train the model on all the images in one epoch?
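    For reference, a standard PyTorch DataLoader makes one full pass over the dataset per epoch, so the iteration count follows from dataset size and batch size (the numbers below are hypothetical):

        import math

        num_samples = 10553  # hypothetical dataset size
        batch_size = 16      # hypothetical batch size
        # iterations per epoch with drop_last=False
        print(math.ceil(num_samples / batch_size))  # 660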

    opened by anilsathyan7 1
  • Training code

    Hi, Jizhizi Li πŸ€— πŸ€— πŸ€—. Many thanks for this amazing repo! It works very nicely for my images with clothes! πŸ₯³ πŸ‘ŒπŸ‘‹

    But it seems to me that the results can be further improved by additional training on individual cases. πŸ‘€

    Some results: [three result images attached]

    Do you have any plans to release the training code? I have already seen the previous question on this topic, but I can't wait. 😈 😈 😈 Thank you for the answer! Have a very very nice day πŸ€“πŸ˜„πŸ˜„

    opened by Kakoedlinnoeslovo 1
  • Question about the results for DIM+Trimap

    I've tested my trimap-based model on AIM-500. However, the results are much better than those reported in your paper. It is reasonable that trimap-based models have an advantage over trimap-free ones, but I wonder whether you perhaps did not fuse the foreground and background labels provided by the trimap during testing.

    opened by XavierCHEN34 1
  • USR type json file does not match DIM, HAttMatting dataset file name

    Hi @JizhiziLi, after downloading the json file, the filenames for the DIM and HAttMatting datasets are different from those in the original datasets. Could you please provide a filename-matching file? Thank you!

    opened by zoezhou1999 0
  • Wrong SAD result

    I tested the AIM-500 dataset with the model you provided and found that the SAD was only 47. Can you check that the model you uploaded to GitHub is correct? In addition, did you do any other processing on the DIM dataset? In your dim_hatt_am2k_type.json file, the keys for the DIM dataset are the numbers 1, 2, 3, 4, ..., but the names in the DIM dataset are different. How did you rename them? I used the training_fg_names.txt file to rename them and retrained the model according to the steps you provided on GitHub, and the SAD is only 55.

    opened by gaka2012 1
  • Rotated predictions

    Hi, I've been testing your model quite a bit and have noticed that some vertical images tend to rotate to the horizontal position. I wondered why, and whether there is a way of turning it off, since they rotate randomly clockwise or counterclockwise. An alternative would be to only analyse images that are in the horizontal position, but I'm not sure whether that would have an impact on performance.
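    A common cause of such rotations (an assumption, not confirmed in this thread) is EXIF orientation metadata being ignored by the image loader; normalizing the orientation before inference may help:

        from PIL import Image, ImageOps

        img = Image.open("input.jpg")       # path is a placeholder
        img = ImageOps.exif_transpose(img)  # bake EXIF orientation into the pixels
        img.save("input_upright.jpg")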

    opened by Wojak27 0
  • DUTS SOD accuracy and loss values

    I trained the model with the DUTS dataset for 100 epochs (the default), but the loss and accuracy seem poor:

    INFO:root:AIM-Epoch[100/100](1/660) Lr:0.00010000 Loss:0.82403 Global:0.66851 Local:0.11801 Fusion-alpha:0.03752 Speed:4.63853s/iter Exa(h:m:s):00:50:56
    INFO:root:AIM-Epoch[100/100](2/660) Lr:0.00010000 Loss:0.83574 Global:0.67843 Local:0.10950 Fusion-alpha:0.04781 Speed:3.84507s/iter Exa(h:m:s):00:42:10
    INFO:root:AIM-Epoch[100/100](3/660) Lr:0.00010000 Loss:0.78562 Global:0.64933 Local:0.11512 Fusion-alpha:0.02116 Speed:3.55392s/iter Exa(h:m:s):00:38:54
    INFO:root:AIM-Epoch[100/100](4/660) Lr:0.00010000 Loss:0.80663 Global:0.65181 Local:0.12981 Fusion-alpha:0.02502 Speed:3.40979s/iter Exa(h:m:s):00:37:16
    INFO:root:Checkpoint saved to models/trained/aim_transfer_duts/ckpt_epoch100.pth
    

    If our dataset contains only SO images, how long should we train the network, i.e. if we are not planning to train further on synthetic datasets that contain STM and NS images? Is there a benchmark/comparison for the DUTS dataset against other models? Also, is the ResNet backbone frozen during training? (A generic way to check this is sketched below.)
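    For reference, whether the backbone is frozen can be checked from the requires_grad flags of its parameters (a generic sketch, not code from this repo):

        import torch.nn as nn

        def count_params(module: nn.Module):
            # split parameter counts into frozen vs. trainable
            frozen = sum(p.numel() for p in module.parameters() if not p.requires_grad)
            trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
            return frozen, trainable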

    opened by anilsathyan7 0