Official repository of Semantic Image Matting

Semantic Image Matting


This is the official repository of Semantic Image Matting (CVPR 2021).

Overview

[Figure: framework]

Natural image matting separates the foreground from the background where pixels have fractional occupancy, which can be caused by highly transparent objects, complex foregrounds (e.g., nets or trees), and/or objects containing very fine details (e.g., hair). Although the conventional matting formulation can be applied to all of the above cases, no previous work has attempted to reason about the underlying causes of matting that arise from different foreground semantics.
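For reference, the conventional matting formulation mentioned above is usually written as the compositing equation below. The symbols (observed color I_i, foreground F_i, background B_i, and alpha alpha_i at pixel i) are standard notation and are not taken from this README.

```latex
I_i = \alpha_i F_i + (1 - \alpha_i) B_i, \qquad \alpha_i \in [0, 1]
```

Fractional occupancy corresponds to pixels where 0 < alpha_i < 1, which is where transparency, net-like structures, and fine details such as hair appear.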

We show how to obtain better alpha mattes by incorporating semantic classification of matting regions into our framework. Specifically, we consider and learn 20 classes of matting patterns, and propose extending the conventional trimap to a semantic trimap. The proposed semantic trimap can be obtained automatically through patch structure analysis within trimap regions. Meanwhile, we learn a multi-class discriminator to regularize the alpha prediction at the semantic level, and content-sensitive weights to balance the different regularization losses.

Dataset

Download our semantic image matting dataset (SIMD) here. SIMD is composed of self-collected images and a subset of Adobe images. To obtain the complete dataset, please first contact Brian Price ([email protected]) for the Adobe Image Matting dataset, then follow the instructions within SIMD.zip.

Requirements

The code is tested in the following environment:

  • Python 3.7
  • PyTorch 1.9.0
  • CUDA 10.2 & cuDNN 7.6.5
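To compare a local setup against the tested environment above, a small check along the following lines may help. This is a sketch, not part of the repository: the names TESTED, parse_version, matches, and report_environment are hypothetical, and the cuDNN version is left out for brevity.

```python
import itertools
import sys

# Versions listed in the Requirements section above.
TESTED = {"python": (3, 7), "torch": (1, 9, 0), "cuda": (10, 2)}


def parse_version(text):
    """Turn a string such as '1.9.0+cu102' into (1, 9, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def matches(tested, found):
    """True when the found version starts with the tested version prefix."""
    return tuple(found[:len(tested)]) == tuple(tested)


def report_environment():
    """Map each component to (found_version_or_None, matches_tested)."""
    found = {"python": tuple(sys.version_info[:3])}
    try:
        import torch  # optional: only inspected when it is installed
    except ImportError:
        torch = None
    if torch is not None:
        found["torch"] = parse_version(torch.__version__)
        if torch.version.cuda:  # None on CPU-only builds
            found["cuda"] = parse_version(torch.version.cuda)
    return {
        name: (found.get(name), matches(tested, found.get(name, ())))
        for name, tested in TESTED.items()
    }
```

Calling report_environment() returns a dictionary such as {"python": ((3, 7, 12), True), ...}. A mismatch does not necessarily mean the code will fail, only that the combination is untested.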

Performance

Some pretrained models are listed below with their performance.

Methods                | SAD  | MSE | Grad | Conn | Link
SIMD                   | 27.9 | 4.7 | 11.6 | 20.8 | model
Composition-1K (paper) | 28.0 | 5.8 | 10.8 | 24.8 | -
Composition-1K (repo)  | 27.7 | 5.6 | 10.7 | 24.4 | model
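As a rough illustration of the first two columns, SAD and MSE are commonly computed over the unknown region of the trimap. The sketch below follows that convention under stated assumptions: the unknown region is labeled 128, alpha values lie in [0, 1], and SAD is divided by 1000. The function name sad_and_mse is hypothetical, the repository's own evaluation code may differ, and Grad and Conn are not covered. Plain lists are used to keep the example dependency-free; in practice array operations would be the usual choice.

```python
def sad_and_mse(pred, gt, trimap, unknown_value=128):
    """Compute SAD and MSE between two alpha mattes inside the unknown region.

    pred, gt: 2-D lists of alpha values in [0, 1].
    trimap:   2-D list of labels; pixels equal to unknown_value are evaluated.
    Returns (sad, mse), with SAD divided by 1000 (an assumed convention).
    """
    abs_sum = 0.0
    sq_sum = 0.0
    count = 0
    for pred_row, gt_row, tri_row in zip(pred, gt, trimap):
        for p, g, t in zip(pred_row, gt_row, tri_row):
            if t != unknown_value:
                continue  # known foreground/background pixels are skipped
            diff = p - g
            abs_sum += abs(diff)
            sq_sum += diff * diff
            count += 1
    sad = abs_sum / 1000.0
    mse = sq_sum / count if count else 0.0
    return sad, mse
```

Published tables often scale MSE by 1000 as well; whether the numbers above use that scaling is not stated in this README.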

Run

Download the model and put it under checkpoints/DIM or checkpoints/Adobe in the root directory. Download the classifier here and put it under checkpoints. Then run inference and evaluation with:

python scripts/main.py -c config/CONFIG.yaml
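Before launching the command above, it can help to confirm that the expected files are in place. The following is a minimal sketch under assumptions: EXPECTED, missing_paths, and inference_command are hypothetical names, and the exact checkpoint file names are not given in this README, so only directories and the entry script are checked.

```python
from pathlib import Path

# Hypothetical layout check; add the checkpoint folder for the model you use
# (checkpoints/DIM or checkpoints/Adobe).
EXPECTED = ["scripts/main.py", "config", "checkpoints"]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


def inference_command(config_name):
    """Build the documented command line for a given config file name."""
    return ["python", "scripts/main.py", "-c", f"config/{config_name}"]
```

For example, missing_paths(".") run from the repository root returns an empty list when everything is present, and inference_command("CONFIG.yaml") yields the argument list that can be passed to subprocess.run.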

Results

[Figure: example 1]

[Figure: example 2]

Reference

If you find our work useful in your research, please consider citing:

@inproceedings{sun2021sim,
  author    = {Yanan Sun and Chi-Keung Tang and Yu-Wing Tai},
  title     = {Semantic Image Matting},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
}

Acknowledgment

This repo borrows code from several repos, such as GCA and FBA.

Comments
  • Double concatenation of input image?

    https://github.com/nowsyn/SIM/blob/master/networks/model.py#L153

    conv_out[-6]=conv_out[0]=https://github.com/nowsyn/SIM/blob/8a9f12ff92b2127c3988f799ba8878817e3f486c/networks/util.py#L176

    So conv_out[-6][first 3 channels] is the input image. Am I correct, or am I missing something?

    opened by mhashas 1
  • Question about threshold of 50 pixels

    Hi,

    Awesome work introducing different forms of trimaps into training! After reading through the code, there is one point I'm not sure about. Section 3.1.2, 'Simulating User Scribbles', says 'To avoid the points being too close to each other, we set a threshold of 50 pixels between each two points'. However, in class Genclickmap in dataloader/Test_dataset/data_generator.py, there seems to be no strict constraint on the distances between the randomly selected fg and bg points. I am not sure whether I have missed something and hope someone can help. Thanks in advance.

    opened by WANGEOGEO 1
  • about SAD of your dataset and weight

    I think I have correctly generated the test set with your code, but I got 29.58599 SAD with your weights from https://drive.google.com/file/d/1kmhQtO-6wXTxgHtQCLRj3xPOEtXyzTXC/view?usp=sharing and https://drive.google.com/file/d/12JCGqDylBXJpgDhj4hg_JZYdbHlX8TKe/view?usp=sharing. b548f33ab9c559926d5f68088a00443a_2011_001608 Fig. 8 row 1 in your paper.

    opened by Serge-weihao 1
  • Classifier is corrupted

    Dear author,

    Thank you for your awesome work. I am facing one issue: whenever I try to download the classifier, I get an error stating that the file is corrupted. Could you please help me with this? Looking forward to hearing from you.

    Best regards, Dane

    opened by Deep190 1
  • Some thoughts about the project

    1. I didn't find the training code. Will it be released in the future?
    2. Is this project aimed at matting specific objects (only people, fire, trees, ...)?
    3. For general object matting, is there a good implementation you can recommend, for example for an image like this one?
    opened by zhanghongyong123456 0
  • UnpicklingError: invalid load key, '<'.

    Hi, here's my Kaggle. After I corrected my sim.toml, I ran into the following error message:

    !python main.py -c config/sim.toml

    2021-11-23 00:23:44,177-main.py:33-INFO-=> creating classifier 'resnet50'
    2021-11-23 00:23:44,634-main.py:36-INFO-=> loading checkpoint 'checkpoints/classifier_resnet50_best.pth.tar'
    2021-11-23 00:23:44,743-main.py:40-INFO-=> loaded checkpoint 'checkpoints/classifier_resnet50_best.pth.tar'
    modifying input layer to accept 11 channels
    2021-11-23 00:23:45,403-main.py:63-INFO-Pretrain: no checkpoint found at 'None'
    2021-11-23 00:23:45,404-main.py:66-INFO-Resume: loading 'checkpoints/SIM/ckpt_best.pth'
    Traceback (most recent call last):
      File "main.py", line 337, in <module>
        main()
      File "main.py", line 327, in main
        model = build_sim_model(cfg.model, logger)
      File "main.py", line 67, in build_sim_model
        ckpt = torch.load(args.resume_checkpoint, map_location=torch.device('cpu'))
      File "/opt/conda/lib/python3.7/site-packages/torch/serialization.py", line 608, in load
        return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
      File "/opt/conda/lib/python3.7/site-packages/torch/serialization.py", line 777, in _legacy_load
        magic_number = pickle_module.load(f, **pickle_load_args)
    _pickle.UnpicklingError: invalid load key, '<'.

    Any help?

    opened by rkuo2000 0
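The UnpicklingError reported in the comments above says that the checkpoint file begins with the character '<'. A likely cause, though this is an assumption rather than something confirmed in the thread, is that an HTML page (for example a download confirmation page) was saved in place of the weights. The sketch below inspects the first bytes of a file to tell these cases apart; sniff_checkpoint is a hypothetical helper, not part of the repository.

```python
def sniff_checkpoint(path):
    """Guess what kind of file `path` is from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    if not head:
        return "empty"
    if head.lstrip().startswith(b"<"):
        # Markup such as <!DOCTYPE html> or <html>: probably a saved web page.
        return "html"
    if head.startswith(b"PK"):
        # Zip archive, the default torch.save format since PyTorch 1.6.
        return "zip"
    if head[:1] == b"\x80":
        # Pickle protocol marker, as written by the legacy torch.save format.
        return "pickle"
    return "unknown"
```

If sniff_checkpoint("checkpoints/SIM/ckpt_best.pth") returns "html" or "empty", downloading the file again (and checking its size) is probably the first thing to try. The same check may apply to the "Classifier is corrupted" report.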
[IJCAI'21] Deep Automatic Natural Image Matting

Deep Automatic Natural Image Matting [IJCAI-21] This is the official repository of the paper Deep Automatic Natural Image Matting. Introduction | Netw

Jizhizi_Li 316 Jan 6, 2023
PyMatting: A Python Library for Alpha Matting

Given an input image and a hand-drawn trimap (top row), alpha matting estimates the alpha channel of a foreground object which can then be composed onto a different background (bottom row).

PyMatting 1.4k Dec 30, 2022
Github project for Attention-guided Temporal Coherent Video Object Matting.

Attention-guided Temporal Coherent Video Object Matting This is the Github project for our paper Attention-guided Temporal Coherent Video Object Matti

null 71 Dec 19, 2022
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Peter Lin 6.5k Jan 4, 2023
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

flow-dev 2 Aug 21, 2022
MODNet: Trimap-Free Portrait Matting in Real Time

MODNet is a model for real-time portrait matting with only RGB image input.

Zhanghan Ke 2.8k Dec 30, 2022
Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires captur

Peter Lin 6.1k Jan 3, 2023
Video Matting Refinement For Python

Video-matting refinement Library (use pip to install) scikit-image numpy av matplotlib Run Static background python path_to_video.mp4 Moving backgroun

null 3 Jan 11, 2022
Rethinking Portrait Matting with Privacy Preserving

Rethinking Portrait Matting with Privacy Preserving This is the official repository of the paper Rethinking Portrait Matting with Privacy Preserving.

null 184 Jan 3, 2023
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022
Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

Syed Waqas Zamir 906 Dec 30, 2022
This repository contains several image-to-image translation models, which were tested for RGB to NIR image generation. The models are Pix2Pix, Pix2PixHD, CycleGAN and PointWise.

RGB2NIR_Experimental This repository contains several image-to-image translation models, which were tested for RGB to NIR image generation. The models

null 5 Jan 4, 2023
Official code release for: EditGAN: High-Precision Semantic Image Editing

null 565 Jan 5, 2023
Sound-guided Semantic Image Manipulation - Official Pytorch Code (CVPR 2022)

Sound-guided Semantic Image Manipulation (CVPR2022) Official Pytorch Implementation Sound-guided Semantic Image Manipulation IEEE/CVF Conference on

CVLAB 58 Dec 28, 2022
A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

null 224 Jan 4, 2023
Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

null 32 Sep 21, 2022
Build upon neural radiance fields to create a scene-specific implicit 3D semantic representation, Semantic-NeRF

Semantic-NeRF: Semantic Neural Radiance Fields Project Page | Video | Paper | Data In-Place Scene Labelling and Understanding with Implicit Scene Repr

Shuaifeng Zhi 243 Jan 7, 2023
Official repository for Few-shot Image Generation via Cross-domain Correspondence (CVPR '21)

Few-shot Image Generation via Cross-domain Correspondence Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zh

Utkarsh Ojha 251 Dec 11, 2022