Official repository for the paper F, B, Alpha Matting

Overview

The paper and project are under heavy revision for peer-reviewed publication, so I am not yet able to release the training code.
Marco Forte¹, François Pitié¹

¹ Trinity College Dublin

Requirements

GPU memory >= 11 GB is required for inference on the Adobe Composition-1K test set, and more generally for resolutions above 1920x1080.

Packages:

  • torch >= 1.4
  • numpy
  • opencv-python

Additional packages for the Jupyter notebook

  • matplotlib
  • gdown (to download model inside notebook)

Models

These models have been trained on the Adobe Image Matting Dataset. They are covered by the Adobe Deep Image Matting Dataset License Agreement, so they can only be used and distributed for non-commercial purposes.
More results from this model are available on alphamatting.com, on the videomatting.com benchmark, and in the supplementary materials PDF.

Model Name     File Size   SAD    MSE   Grad   Conn
FBA (Table 4)  139 MB      26.4   5.4   10.6   21.5

Prediction

We provide a script, demo.py, and a Jupyter notebook, both of which produce the foreground, background, and alpha predictions of our model. The test-time augmentation code will be made available soon.
In the torchscript notebook we show how to convert the model to TorchScript; a rough sketch of that conversion follows below.
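As an illustration only (the notebook itself is the authoritative reference), a torch.jit trace might look like the sketch below. The single 11-channel example input is an assumption and may not match the model's actual forward() signature; Args and build_model follow the Colab snippet quoted in the comments further down this page.

    # Hedged sketch of TorchScript export; see the repo's torchscript
    # notebook for the real conversion.
    import torch
    from networks.models import build_model  # module from this repo

    class Args:
        encoder = 'resnet50_GN_WS'
        decoder = 'fba_decoder'
        weights = 'FBA.pth'

    model = build_model(Args()).eval()
    # Assumed input: one 11-channel tensor (image + trimap encoding);
    # the real forward() may instead take several tensors.
    example = torch.randn(1, 11, 288, 288)
    traced = torch.jit.trace(model, example)
    traced.save('fba_traced.pt')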

In this video I demonstrate how to create a trimap in Pinta/Paint.NET.

Training

Training code is not released at this time; it may be released upon acceptance of the paper. Here are the key takeaways from our work regarding training (an illustrative loss sketch follows the list).

  • Use a batch size of 1, and use Group Normalisation and Weight Standardisation in your network.
  • Train with clipping of the alpha prediction instead of a sigmoid.
  • The L1 alpha loss, compositional loss, and Laplacian loss are beneficial; a gradient loss is not needed.
  • For foreground prediction, we extend the foreground to the entire image and define the loss on the entire image, or at least on the unknown region. We found this better than defining it solely where alpha > 0. See: Code for foreground extension.
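Since the training code is unreleased, the following is only an illustrative sketch of the loss terms named above (L1 alpha, compositional, and a Laplacian pyramid loss, with clipping instead of a sigmoid). The function names, per-level weights, and pyramid depth are assumptions, not the authors' code.

    import torch
    import torch.nn.functional as F

    def _gauss_kernel(channels, device):
        # 5x5 binomial kernel, one copy per channel for a depthwise conv.
        k = torch.tensor([1., 4., 6., 4., 1.], device=device)
        k = (k[:, None] * k[None, :]) / 256.0
        return k.expand(channels, 1, 5, 5).contiguous()

    def _blur(x, k):
        x = F.pad(x, (2, 2, 2, 2), mode='reflect')
        return F.conv2d(x, k, groups=x.shape[1])

    def laplacian_pyramid_loss(pred, gt, levels=5):
        # L1 over band-pass (signal minus blur) residuals at several scales.
        k = _gauss_kernel(pred.shape[1], pred.device)
        loss = 0.0
        for i in range(levels):
            bp, bg = _blur(pred, k), _blur(gt, k)
            loss = loss + (2 ** i) * F.l1_loss(pred - bp, gt - bg)
            pred, gt = bp[..., ::2, ::2], bg[..., ::2, ::2]
        return loss

    def matting_loss(raw_alpha, alpha_gt, fg_gt, bg_gt, image):
        alpha = raw_alpha.clamp(0, 1)    # clip instead of sigmoid
        l1 = F.l1_loss(alpha, alpha_gt)  # L1 alpha loss
        comp = F.l1_loss(alpha * fg_gt + (1 - alpha) * bg_gt, image)
        return l1 + comp + laplacian_pyramid_loss(alpha, alpha_gt)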

Citation

@article{forte2020fbamatting,
  title   = {F, B, Alpha Matting},
  author  = {Marco Forte and François Pitié},
  journal = {CoRR},
  volume  = {abs/2003.07711},
  year    = {2020},
}

Related works of ours

  • 99% accurate interactive object selection with just a few clicks: PDF, Code
Comments
  • No train code

    Hi~ Thank you for the wonderful work. I found your work on alphamatting.com, and I am very glad that you open-sourced your code so quickly. However, I cannot find the training code in the repository. Would you also open-source the training code?

    opened by xup16 7
  • Artifacts when applying to green screen removal

    Hello!

    I am trying to apply the FBA Matting method to remove the green screen from images (portraits of people shot on a green screen). Overall it works well, but it leaves some green pixels mixed with the hair (I attach an example here).

    I can see 2 possible reasons for this issue:

    1. As the result heavily depends on the quality of the trimaps, better trimaps are needed. I am currently generating trimaps using DeepLabV3. I tried to adjust some parameters (conf_threshold, dilation); it got better but is not perfect yet. Do you know if there are any other parameters that should be tuned, or other models I could try for trimap creation?
    2. This model was trained on the Adobe Image Matting Dataset, which does not contain portraits with a green screen. Do you think it would make a big difference if I retrained this model on my custom dataset of green-screen portraits?

    Thank you in advance for any help! I would truly appreciate it!

    opened by TatianaSnauwaert 6
  • Pyramid Laplacian loss

    Thanks for the great work! Your results are really impressive. I noticed that you suggested using a Laplacian loss for training. May I ask for its implementation details?

    opened by yucornetto 6
  • Questions about training and test details

    Hi, thank you for sharing your code and paper!

    Recently I have been reproducing your work, and I have some questions about the training and test details.

    1. In the paper, it is said that the RAdam optimizer is used with momentum 0.9 and weight decay 0.0001. But I didn't find a “momentum” parameter in the official RAdam optimizer code. Did you modify the official code, or did you just set beta1 to 0.9?

    2. About the weight decay, there are two descriptions in the paper: a) weight decay 0.0001 in the RAdam optimizer, b) weight decay of 0.005 and 1e-5 applied to convolutional weights and GN parameters. How did you set them in your code? I tried setting the weight decay in the optimizer to 0.0001 and adding an L2 loss on conv weights and GN weights & biases with weights 0.005 and 1e-5, because I think an L2 loss here is equivalent to weight decay. Is that the same as yours?

    3. The input resolution for training patches. Training patches of size 640, 480, and 320 are randomly cropped during training. After that, did you resize them to a fixed size? If not, how did you train the model with batch size 6?

    4. The input channels for testing. Section 3.6 says, “During inference, the full-resolution input images and trimaps are concatenated as 4-channel input and fed into the network.” But in the code an 11-channel input is used. Is that a typo?

    5. About the re-estimated foreground. I tried to re-estimate the training foreground images, but only succeeded for 411 of the 431 fg images; 20 of them failed to optimize. Did you have the same problem? Also, when calculating the alpha * foreground error during testing, are the ground-truth foreground images re-estimated by closed form?

    Thank you !

    opened by xymsh 6
  • About the number of input channels

    In paper section 3.1:

    First, we increase the number of input channels from 3 to 9 to allow for the extra trimap. We encode the trimap using Gaussian blurs of the definite foreground and background masks at three different scales (in a similar way to the method of [19] in interactive segmentation). This encoding differs from existing approaches in deep image matting, as they usually encode the trimap as a single channel with value 1 if foreground, 0.5 for unknown and 0 for background.

    I know there are 7 output channels (alpha = 1, F = 3, B = 3), but why are there 9 input channels? In your code I saw that the inputs are the image and the trimap, which would give 4 input channels. So why does section 3.1 of the paper say 9 input channels?
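    For illustration only, the multi-scale encoding described in the quoted passage could look like the sketch below (the sigma values and the function name are assumptions, not the repo's actual transform). The 6 blurred channels, plus the 2-channel trimap and the 3 image channels, account for the 11-channel input seen in the code.

    import cv2
    import numpy as np

    def encode_trimap(trimap):
        # Gaussian blurs of the definite fg/bg masks at three scales;
        # the sigmas here are assumed, the repo's transform is authoritative.
        fg = (trimap == 1.0).astype(np.float32)
        bg = (trimap == 0.0).astype(np.float32)
        chans = [cv2.GaussianBlur(m, (0, 0), s)
                 for m in (fg, bg) for s in (2, 8, 16)]
        return np.stack(chans, axis=-1)  # H x W x 6 extra channels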

    opened by leijue222 4
  • has the granting of your recent patent with Adobe effectively killed open source image matting?

    https://patents.google.com/patent/US11004208B2/en

    The patent seems to cover the idea of user interaction to create a binary matte -> trimap -> final matte, and every variation of that.

    opened by 21pl 3
  • Loss function and fba_fusion()

    In fba_decoder there is an fba_fusion() function. Is the loss calculated on the output of fba_fusion(), or on the output of conv_up4() followed by the clamp and sigmoid?
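    For context (my reading, not an authoritative answer): a fusion step of this kind typically re-estimates alpha from the predicted F and B by solving the compositing equation I = alpha*F + (1 - alpha)*B for alpha in least squares over the colour channels:

    \alpha^{*} = \operatorname{clamp}_{[0,1]}\left( \frac{(I - B)\cdot(F - B)}{\lVert F - B \rVert^{2}} \right)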

    opened by wchstrife 3
  • About the color-spill problem and Fig.2. in your paper

    Hi, I have read your paper on arXiv. In the paper, the color-spill problem is raised and an example is given in Fig. 2. I tried to reproduce your result in Fig. 2 using the same foreground from the Composition-1k dataset, but the composite image is totally different from yours. Besides, I tried to use the closed-form method to estimate a new foreground, but got a different one; my new fg is black in the areas where alpha equals zero.

    So how did you composite the original foreground onto the background? And how did you re-estimate the new foreground? Can you please share your code?

    Thanks!

    opened by xymsh 3
  • double concatenation of input image?

    At https://github.com/MarcoForte/FBA_Matting/blob/master/networks/models.py#L208, conv_out[-6] is conv_out[0], which per https://github.com/MarcoForte/FBA_Matting/blob/master/networks/models.py#L99 is the actual input image, so we are basically concatenating the image twice.

    Am I correct, or am I missing something?

    opened by mhashas 2
  • script for foreground estimation

    I used the Code for foreground extension for foreground prediction on the Adobe dataset, but when processing some pictures I run out of memory.

    I noticed that you mentioned a custom multi-level solver written in C. Could you please provide this code?

    opened by wchstrife 2
  • TTA details

    The effect of your proposed model and training method is really amazing. “This, combined with longer training at 45 epochs and TTA, has a bigger impact than the choices of loss functions”. It seems TTA also plays an important role, so I would like to know the details of the TTA, such as example code or related information. Looking forward to your reply!
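    As a generic illustration only (not necessarily the paper's scheme), flip-based TTA averages the prediction over mirrored inputs:

    import torch

    def predict_tta(model, x):
        # Generic horizontal-flip TTA sketch; the paper's actual TTA
        # (scales, further flips, etc.) is unreleased and may differ.
        with torch.no_grad():
            out = model(x)
            out_flipped = model(torch.flip(x, dims=[-1]))
        return 0.5 * (out + torch.flip(out_flipped, dims=[-1]))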

    opened by uptodiff 2
  • AttributeError: 'Args' object has no attribute 'seek'. in notebook

    Using FBA Matting.ipynb in Google Colab, with the source code below,

    class Args:
      encoder = 'resnet50_GN_WS'
      decoder = 'fba_decoder'
      weights = 'FBA.pth'
    args=Args()
    try:
        model = build_model(args)
    except:
        !gdown  https://drive.google.com/uc?id=1T_oiKDE_biWf2kqexMEN7ObWqtXAzbB1
        model = build_model(args)
    

    the following error occurred.

    modifying input layer to accept 11 channels
    Downloading...
    From: https://drive.google.com/uc?id=1T_oiKDE_biWf2kqexMEN7ObWqtXAzbB1
    To: /content/drive/MyDrive/FBA Matting/FBA.pth
    100% 139M/139M [00:01<00:00, 81.2MB/s]
    modifying input layer to accept 11 channels
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/torch/serialization.py in _check_seekable(f)
        307     try:
    --> 308         f.seek(f.tell())
        309         return True
    
    AttributeError: 'Args' object has no attribute 'seek'
    
    During handling of the above exception, another exception occurred:
    
    AttributeError                            Traceback (most recent call last)
    14 frames
    AttributeError: 'Args' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
    
    During handling of the above exception, another exception occurred:
    
    AttributeError                            Traceback (most recent call last)
    AttributeError: 'Args' object has no attribute 'seek'
    
    During handling of the above exception, another exception occurred:
    
    AttributeError                            Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/torch/serialization.py in raise_err_msg(patterns, e)
        302                                 + " Please pre-load the data into a buffer like io.BytesIO and"
        303                                 + " try to load from it instead.")
    --> 304                 raise type(e)(msg)
        305         raise e
        306 
    
    AttributeError: 'Args' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
    

    Please tell me the cause of the error and how to fix it.

    opened by Ishihara-Masabumi 0
  • RAdam not converging

    Hello, thank you for your great work. I am trying to write the training code. I started by implementing the alpha loss, compositional loss, and regression loss, and I used the RAdam optimizer, RAdam(group_weight(net, 1e-5, 1e-5, 0.0001)), as you mentioned. To test convergence and check that everything is right in my code, I took only 2 images with hair from the Adobe dataset and started training on them. But it seems the loss is not converging, whereas with the classical Adam optimizer I got convergence in a few iterations. The group_weight function seems to deal with the parameters of the ResNet encoder, so I think I should use a pretrained ResNet. Could you share the pretrained ResNet you used and, if possible, the code to load it into the matting module? Thanks in advance.
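    For reference, a parameter-grouping helper of the kind mentioned usually applies weight decay only to conv/linear kernels and exempts normalisation parameters and biases. A rough sketch (the real group_weight's argument meanings are assumptions here):

    import torch.nn as nn

    def group_weight_sketch(module: nn.Module, lr_conv, lr_norm, wd):
        # Conv/linear weights get weight decay; 1-D parameters
        # (norm weights, biases) do not.
        decay, no_decay = [], []
        for p in module.parameters():
            if not p.requires_grad:
                continue
            (no_decay if p.ndim <= 1 else decay).append(p)
        return [{'params': decay, 'lr': lr_conv, 'weight_decay': wd},
                {'params': no_decay, 'lr': lr_norm, 'weight_decay': 0.0}]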

    opened by ilyaskhanbey 1
  • Random foreground composition vs solving

    In the paper you wrote:

    To further increase the dataset diversity, we randomly composite a new foreground object with 50% probability, as in [34].

    Could you please provide more details on how exactly you did this? Did you composite the foregrounds before or after solving for the colors in the areas where alpha is 0? And did you do something (maybe averaging?) to preserve correct foreground colors during composition where both fgs' alpha is 0?

    I noticed that if I composite solved foregrounds, the "background" fg colors dominate where alpha is 0.

    Here is an example of compositing 2 foregrounds in different orders; look at the orange/green background. (Screenshot 2021-04-06 at 09:25:33.)

    And here is what I get if I solve the combined fg one more time (though, as I see it, double solving could lead to incorrect colors in semi-transparent regions). (Screenshot 2021-04-06 at 09:28:51.)

    opened by shkarupa-alex 1
  • Possible non-fatal issue in decoder

    As far as I understand, at https://github.com/MarcoForte/FBA_Matting/blob/master/networks/models.py#L230 the ResNet backbone returns these feature maps: [original_image, conv_bn_relu out, layer1 out, layer2 out, layer3 out, layer4 out].

    In the decoder, https://github.com/MarcoForte/FBA_Matting/blob/master/networks/models.py#L350, you concatenate (x, conv_out[-6][:, :3], img, two_chan_trimap).

    But conv_out[-6][:, :3] is the same as img. Are you sure the image should be concatenated twice?

    opened by shkarupa-alex 1
  • Is alpha loss calculated for all the pixels or just for certain pixels that are unknown in trimap

    Hi, in the repo https://github.com/huochaitiantang/pytorch-deep-image-matting/

    the alpha prediction loss is calculated only for pixels that are unknown in the trimap:

    wi = torch.zeros(trimap.shape)
    wi[trimap == 128] = 1.
    t_wi = wi.cuda()
    unknown_region_size = t_wi.sum()

    While training, are you doing something similar, or is the loss computed over all pixels?

    Sorry, I couldn't find this in the paper or the code.
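    For reference, the quoted weighting corresponds to an L1 alpha loss restricted to the unknown trimap region; a minimal sketch:

    import torch

    def masked_alpha_loss(alpha_pred, alpha_gt, trimap):
        # L1 loss over pixels marked unknown (value 128) in the trimap.
        wi = (trimap == 128).float()
        n = wi.sum().clamp(min=1.0)  # avoid division by zero
        return (wi * (alpha_pred - alpha_gt).abs()).sum() / n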

    opened by kartikwar 1
Owner
Marco Forte
Twitter @mearcoforte