[IJCAI'21] Deep Automatic Natural Image Matting

Overview

This is the official repository of the paper Deep Automatic Natural Image Matting.

Introduction | Network | AIM-500 | Results | Statement


πŸ“† News

The training code, inference code and the pretrained models will be released soon.

[2021-07-16]: Published the validation dataset AIM-500. Please follow the readme.txt for details.

Introduction

Unlike previous methods, which focus only on images with salient opaque foregrounds such as humans and animals, in this paper we investigate the difficulties that arise when extending automatic matting methods to natural images with salient transparent/meticulous foregrounds or non-salient foregrounds.

To address the problem, we propose a novel end-to-end matting network, which can predict a generalized trimap for any image of the above types as a unified semantic representation. Simultaneously, the learned semantic features guide the matting network to focus on the transition areas via an attention mechanism.

We also construct a test set, AIM-500, containing 500 diverse natural images that cover all of these types along with manually labeled alpha mattes, making it feasible to benchmark the generalization ability of AIM models. Experimental results demonstrate that our network, trained on available composite matting datasets, outperforms existing methods both objectively and subjectively.

Network

Our proposed method consists of:

  • Improved Backbone for Matting: an advanced max-pooling version of ResNet-34, pretrained on ImageNet, which serves as the backbone of the matting network;

  • Unified Semantic Representation: a type-wise semantic representation to replace the traditional trimaps;

  • Guided Matting Process: an attention-based mechanism that guides the matting process by leveraging the learned semantic features from the semantic decoder to focus on extracting details only within the transition area (see the sketch below).
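As a rough illustration of the guided matting process, below is a minimal PyTorch sketch (module and parameter names are ours, not from the released code) in which the semantic decoder's features are collapsed into a spatial attention map that gates the matting decoder's features toward the transition area:

    import torch
    import torch.nn as nn

    class GuidedMattingFusion(nn.Module):
        # Hypothetical sketch: collapse semantic features into a per-pixel
        # attention map and use it to gate the matting features, so that
        # detail extraction concentrates on the transition area.
        def __init__(self, semantic_channels):
            super().__init__()
            self.attention = nn.Sequential(
                nn.Conv2d(semantic_channels, 1, kernel_size=1),
                nn.Sigmoid(),  # per-pixel weights in [0, 1]
            )

        def forward(self, semantic_feat, matting_feat):
            # both feature maps are assumed to share spatial size (H, W)
            attn = self.attention(semantic_feat)  # (B, 1, H, W)
            return matting_feat * attn            # broadcast over channels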

The backbone pretrained on ImageNet and the model pretrained on the synthetic matting dataset will be released soon.

Pretrained backbone: coming soon
Pretrained model: coming soon

AIM-500

We propose AIM-500 (Automatic Image Matting-500), the first natural image matting test set. It contains 500 high-resolution real-world natural images covering all three foreground types (SO: salient opaque, STM: salient transparent/meticulous, NS: non-salient) and many categories, along with manually labeled alpha mattes. Some examples and the number of images in each category are shown below. The AIM-500 dataset is now published and can be downloaded directly from this link. Please follow the readme.txt for more details.

Portrait | Animal | Transparent | Plant | Furniture | Toy | Fruit
100 | 200 | 34 | 75 | 45 | 36 | 10
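As a minimal sketch of iterating over the test set (the folder names original/ and mask/ below are our assumption; the authoritative layout is described in readme.txt):

    import os
    import cv2

    root = "AIM-500"  # unpacked dataset root (assumed)
    for name in sorted(os.listdir(os.path.join(root, "original"))):
        image = cv2.imread(os.path.join(root, "original", name))
        # alpha mattes assumed to be single-channel PNGs with matching names
        alpha_path = os.path.join(root, "mask", os.path.splitext(name)[0] + ".png")
        alpha = cv2.imread(alpha_path, cv2.IMREAD_GRAYSCALE)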

Results

We test our network on the different types of images in AIM-500 and compare it with previous SOTA methods; the results are shown below.
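The comparisons use the standard matting error metrics (SAD, MSE and MAD over the whole image and the transition region, plus connectivity and gradient errors). For reference, here is a minimal NumPy sketch of the first three, assuming alpha mattes scaled to [0, 1] and the common convention of reporting SAD in thousands:

    import numpy as np

    def matting_errors(pred, gt):
        # pred, gt: alpha mattes as float arrays in [0, 1]
        diff = np.abs(pred.astype(np.float64) - gt.astype(np.float64))
        sad = diff.sum() / 1000.0  # sum of absolute differences (x1e-3)
        mse = (diff ** 2).mean()   # mean squared error
        mad = diff.mean()          # mean absolute difference
        return sad, mse, mad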

Statement

If you are interested in our work, please consider citing the following:

@inproceedings{ijcai2021-danim,
  title     = {Deep Automatic Natural Image Matting},
  author    = {Li, Jizhizi and Zhang, Jing and Tao, Dacheng},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  year      = {2021},
}

This project is under the MIT license. For further questions, please contact [email protected].

Relevant Projects

End-to-end Animal Image Matting
Jizhizi Li, Jing Zhang, Stephen J. Maybank, Dacheng Tao

Comments
  • Some test results on images in the wild

    First: thanks for this amazing repository and research and for sharing it.

    So I tested some images with the code/pre-trained model you released. I share some results below. I got some very good results and some more disappointing ones (I am unreasonably demanding, sorry for that).

    One question: I get somewhat different results on the 3 default sample images from the repo (i.e. when running inference on 1.png, 2.png and 3.png). Is that due to differences between the model trained for the paper and the newly trained model?

    Another question: do you know why sometimes there is 100% background detection, e.g. with the last two images below?

    I am on CUDA 10.2 and PyTorch 1.9, FYI. Thanks!

    [five result screenshots attached]

    opened by Tetsujinfr 9
  • Impossible to obtain the same results with the pretrained model

    Hello,

    Thanks for the great work, I believe your research is really interesting!

    I'd like to report issues obtaining the results displayed on the README.

    First, when running in the same environment as described in the README (with torch=1.4.0), I run into the following exception:

    Traceback (most recent call last):
      File "core/test.py", line 321, in <module>
        load_model_and_deploy(args)
      File "core/test.py", line 298, in load_model_and_deploy
        ckpt = torch.load(args.model_path,map_location=torch.device('cpu'))
      File "/home/malrick/anaconda3/envs/aim-env/lib/python3.6/site-packages/torch/serialization.py", line 527, in load
        with _open_zipfile_reader(f) as opened_zipfile:
      File "/home/malrick/anaconda3/envs/aim-env/lib/python3.6/site-packages/torch/serialization.py", line 224, in __init__
        super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
    RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /opt/conda/conda-bld/pytorch_1579022034529/work/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /opt/conda/conda-bld/pytorch_1579022034529/work/caffe2/serialize/inline_container.cc:132)
    

    So it seems that the pre-trained model can't be read with such an old version of PyTorch. The export of the conda environment is available here: aim-env.txt (just replace .txt extension with .yml, GitHub doesn't support hosting yml files)
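    For reference, checkpoints saved with PyTorch >= 1.6 use a zipfile-based format that PyTorch 1.4 cannot read. One possible workaround, assuming access to a newer PyTorch, is to re-save the checkpoint in the legacy format:

        # run once with PyTorch >= 1.6 to convert the checkpoint to the
        # legacy format readable by older versions such as 1.4
        import torch
        ckpt = torch.load("model.pth", map_location="cpu")  # path is a placeholder
        torch.save(ckpt, "model_legacy.pth", _use_new_zipfile_serialization=False)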


    I then ran the script with the same environment that @JizhiziLi described in #7 (PyTorch 1.7.1, and python 3.7.7). Conda export is available here: aim-env-2.txt

    I have set the parameter test_choice=Hybrid in core/scripts/test_samples.sh, and the ratios global_ratio=1/4 and local_ratio=1/2 in core/test.py.

    Also, as my RTX3090 is not supported by these older versions of PyTorch, I used CPU for inference.

    To do so, I had to adjust certain parts of the code, first at line 292:

     	if torch.cuda.device_count()==0:
    		print(f'Running on CPU...')
    		args.cuda = False
    		ckpt = torch.load(args.model_path,map_location=torch.device('cpu'))
    	else:
    		print(f'Running on GPU with CUDA as {args.cuda}...')
    		ckpt = torch.load(args.model_path)
    

    to:

    
    	if not args.cuda or torch.cuda.device_count()==0:
    		print(f'Running on CPU...')
    		args.cuda = False
    		ckpt = torch.load(args.model_path,map_location=torch.device('cpu'))
    	else:
    		print(f'Running on GPU with CUDA as {args.cuda}...')
    		ckpt = torch.load(args.model_path)
    

    and line 42:

    	tensor_img = torch.from_numpy(scale_img.astype(np.float32)[:, :, :]).permute(2, 0, 1).cuda()
    

    to

    	tensor_img = torch.from_numpy(scale_img.astype(np.float32)[:, :, :]).permute(2, 0, 1)
    	if args.cuda:
    		tensor_img = tensor_img.cuda()
    

    Removing the --cuda parameter in scripts/test_samples.sh then allowed me to run inference on CPU.

    The results I'm obtaining are the same as in issue #7.

    [output mattes for 1.png, 2.png and 3.png attached]

    I also ran inference similarly on the AIM-500 dataset, and I obtained the following results in logs/test_logs/DEBUG.log:

    INFO:root:Testing numbers: 500
    INFO:root:SAD: 70.16285879981845
    INFO:root:MSE: 0.033924558738880374
    INFO:root:MAD: 0.041689636977158
    INFO:root:SAD TRIMAP: 48.64482158505621
    INFO:root:MSE TRIMAP: 0.09556289661012443
    INFO:root:MAD TRIMAP: 0.13346166896791076
    INFO:root:SAD FG: 17.866437117307104
    INFO:root:SAD BG: 3.651600097455164
    INFO:root:CONN: 67.30147147906628
    INFO:root:GRAD: 58.72859108352662

    The artifacts in the outputs are really surprising, and they seem to be introduced at quite a low resolution.

    Hope this helps, and that we can find a way to get this great repository to work!

    opened by Malrick 3
  • Training code

    Thank you for the great new model! It produces really cool results.

    It would be very interesting to play with the model and train it on my own datasets.

    Could you please give some estimate of when you will release the training code, so I don't refresh the page several times a day :)

    opened by baleksey 3
  • Number of iterations for each epochs

    How many iterations does it run in each epoch? From this line: https://github.com/JizhiziLi/AIM/blob/master/core/train.py#L81, it looks like it does only 4 iterations per epoch. Does this approach train the model on all the images in one epoch?
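    For reference, a standard PyTorch DataLoader makes one full pass over the dataset per epoch, so the iteration count follows from dataset size and batch size (the numbers below are hypothetical):

        import math

        num_samples = 10553  # hypothetical dataset size
        batch_size = 16      # hypothetical batch size
        # iterations per epoch with drop_last=False
        print(math.ceil(num_samples / batch_size))  # 660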

    opened by anilsathyan7 1
  • Training code

    Hi, Jizhizi Li πŸ€— πŸ€— πŸ€—. Many thanks for this amazing repo! It works very nicely for my images with clothes! πŸ₯³ πŸ‘ŒπŸ‘‹

    But it seems to me that the results can be further improved by additional training on individual cases. πŸ‘€

    Some results: [three result images attached]

    Do you have any plans to release the training code? I have already seen the previous question on this topic, but I can't wait. 😈 😈 😈 Thank you for the answer! Have a very very nice day πŸ€“πŸ˜„πŸ˜„

    opened by Kakoedlinnoeslovo 1
  • Question about the results for DIM+Trimap

    I've tested my trimap-based model on AIM-500. However, the results are much better than those reported in your paper. It is reasonable that trimap-based models have an advantage over trimap-free ones, but I wonder whether you perhaps did not fuse the foreground and background labels provided by the trimap during testing.

    opened by XavierCHEN34 1
  • USR type json file does not match DIM, HAttMatting dataset file name

    Hi @JizhiziLi, after downloading the json file, the filenames for the DIM and HAttMatting datasets are different from those in the original datasets. Could you please provide a filename-matching file? Thank you!

    opened by zoezhou1999 0
  • Wrong SAD result

    I tested the AIM-500 dataset with the model you provided and found that the SAD was only 47. Can you check that the model you uploaded to GitHub is correct? In addition, did you do any other processing on the DIM dataset? In your dim_hatt_am2k_type.json file, the keys for the DIM dataset are the numbers 1, 2, 3, 4, ..., but the names in the DIM dataset are different. How did you rename them? I used the training_fg_names.txt file to rename them and retrained the model according to the steps you provided on GitHub, and the SAD is only 55.

    opened by gaka2012 1
  • Rotated predictions

    Hi, I've been testing your model quite a bit and have noticed that some vertical images tend to rotate to the horizontal position. I wondered why, and whether there is a way of turning it off, since they rotate randomly clockwise or counterclockwise. An alternative would be to only analyse images that are in the horizontal position, but I'm not sure whether that would have an impact on performance.
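    A common cause of such rotations (an assumption, not confirmed in this thread) is EXIF orientation metadata being ignored by the image loader; normalizing the orientation before inference may help:

        from PIL import Image, ImageOps

        img = Image.open("input.jpg")       # path is a placeholder
        img = ImageOps.exif_transpose(img)  # bake EXIF orientation into the pixels
        img.save("input_upright.jpg")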

    opened by Wojak27 0
  • DUTS SOD accuracy and loss values

    I trained the model with the DUTS dataset for 100 epochs (the default), but the loss and accuracy seem poor:

    INFO:root:AIM-Epoch[100/100](1/660) Lr:0.00010000 Loss:0.82403 Global:0.66851 Local:0.11801 Fusion-alpha:0.03752 Speed:4.63853s/iter Exa(h:m:s):00:50:56
    INFO:root:AIM-Epoch[100/100](2/660) Lr:0.00010000 Loss:0.83574 Global:0.67843 Local:0.10950 Fusion-alpha:0.04781 Speed:3.84507s/iter Exa(h:m:s):00:42:10
    INFO:root:AIM-Epoch[100/100](3/660) Lr:0.00010000 Loss:0.78562 Global:0.64933 Local:0.11512 Fusion-alpha:0.02116 Speed:3.55392s/iter Exa(h:m:s):00:38:54
    INFO:root:AIM-Epoch[100/100](4/660) Lr:0.00010000 Loss:0.80663 Global:0.65181 Local:0.12981 Fusion-alpha:0.02502 Speed:3.40979s/iter Exa(h:m:s):00:37:16
    INFO:root:Checkpoint saved to models/trained/aim_transfer_duts/ckpt_epoch100.pth
    

    If our dataset contains only SO images, how long should we train the network, i.e. if we are not planning to train further on synthetic datasets that contain STM and NS images? Is there a benchmark/comparison for the DUTS dataset against other models? Also, is the ResNet backbone frozen during training? (A generic way to check this is sketched below.)
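    For reference, whether the backbone is frozen can be checked from the requires_grad flags of its parameters (a generic sketch, not code from this repo):

        import torch.nn as nn

        def count_params(module: nn.Module):
            # split parameter counts into frozen vs. trainable
            frozen = sum(p.numel() for p in module.parameters() if not p.requires_grad)
            trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
            return frozen, trainable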

    opened by anilsathyan7 0