Indices Matter: Learning to Index for Deep Image Matting

Hao Lu

Last update: Nov 26, 2022

Related tags

Deep Learning indexnet_matting

Overview

IndexNet Matting

This repository includes the official implementation of IndexNet Matting for deep image matting, presented in our paper:

Indices Matter: Learning to Index for Deep Image Matting

Proc. IEEE/CVF International Conference on Computer Vision (ICCV), 2019

Hao Lu¹, Yutong Dai¹, Chunhua Shen¹, Songcen Xu²

¹The University of Adelaide, Australia

²Noah's Ark Lab, Huawei Technologies

Updates

8 June 2020: The journal version of this work has been accepted to TPAMI! We further report many interesting results on other dense prediction tasks and extend our insights on generic upsampling operators.
4 April 2020: Training code is released!
16 Aug 2019: The supplementary material is finalized and released!
5 Aug 2019: Inference code of IndexNet Matting is released!

Highlights

Simple and effective: IndexNet Matting only deals with the upsampling stage but exhibits at least 16.1% relative improvements, compared to the Deep Matting baseline;
Memory-efficient: IndexNet Matting builds upon MobileNetV2. It can process an image with a resolution up to 1980x1080 on a single GTX 1070;
Easy to use: This framework also includes our re-implementation of Deep Matting and the pretrained model presented in the Adobe's CVPR17 paper.

Installation

Our code has been tested on Python 3.6.8/3.7.2 and PyTorch 0.4.1/1.1.0. Please follow the official instructions to configure your environment. See other required packages in requirements.txt.

A Quick Demo

We have included our pretrained model in ./pretrained and several images and trimaps from the Adobe Image Dataset in ./examples. Run the following command for a quick demonstration of IndexNet Matting. The inferred alpha mattes are in the folder ./examples/mattes.

python scripts/demo.py

Prepare Your Data

Please contact Brian Price ([email protected]) requesting for the Adobe Image Matting dataset;
Composite the dataset using provided foreground images, alpha mattes, and background images from the COCO and Pascal VOC datasets. I slightly modified the provided compositon_code.py to improve the efficiency, included in the scripts folder. Note that, since the image resolution is quite high, the dataset will be over 100 GB after composition.
The final path structure used in my code looks like this:

$PATH_TO_DATASET/Combined_Dataset
├──── Training_set
│    ├──── alpha (431 images)
│    ├──── fg (431 images)
│    └──── merged (43100 images)
├──── Test_set
│    ├──── alpha (50 images)
│    ├──── fg (50 images)
│    ├──── merged (1000 images)
│    └──── trimaps (1000 images)

Inference

Run the following command to do inference of IndexNet Matting/Deep Matting on the Adobe Image Matting dataset:

python scripts/demo_indexnet_matting.py

python scripts/demo_deep_matting.py

Please note that:

DATA_DIR should be modified to your dataset directory;
Images used in Deep Matting has been downsampled by 1/2 to enable the GPU inference. To reproduce the full-resolution results, the inference can be executed on CPU, which takes about 2 days.

Here is the results of IndexNet Matting and our reproduced results of Deep Matting on the Adobe Image Dataset:

Methods	Remark	#Param.	GFLOPs	SAD	MSE	Grad	Conn	Model
Deep Matting	Paper	--	--	54.6	0.017	36.7	55.3	--
Deep Matting	Re-implementation	130.55M	32.34	55.8	0.018	34.6	56.8	Google Drive (522MB)
IndexNet Matting	Ours	8.15M	6.30	45.8	0.013	25.9	43.7	Included

The original paper reported that there were 491 images, but the released dataset only includes 431 images. Among missing images, 38 of them were said double counted, and the other 24 of them were not released. As a result, we at least use 4.87% fewer training data than the original paper. Thus, the small differerce in performance should be normal.
The evaluation code (Matlab code implemented by the Deep Image Matting's author) placed in the ./evaluation_code folder is used to report the final performance for a fair comparion. We have also implemented a python version. The numerial difference is subtle.

Training

Run the following command to train IndexNet Matting:

sh train.sh

--data-dir should be modified to your dataset directory.
I was able to train the model on a single GTX 1080ti (12 GB). The training takes about 5 days. The current bottleneck appears to be the dataloader.

Citation

If you find this work or code useful for your research, please cite:

@inproceedings{hao2019indexnet,
  title={Indices Matter: Learning to Index for Deep Image Matting},
  author={Lu, Hao and Dai, Yutong and Shen, Chunhua and Xu, Songcen},
  booktitle={Proc. IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2019}
}

@article{hao2020indexnet,
  title={Index Networks},
  author={Lu, Hao and Dai, Yutong and Shen, Chunhua and Xu, Songcen},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2020}
}

Permission and Disclaimer

This code is only for non-commercial purposes. As covered by the ADOBE IMAGE DATASET LICENSE AGREEMENT, the trained models included in this repository can only be used/distributed for non-commercial purposes. Anyone who violates this rule will be at his/her own risk.

Comments

Reproducing results
I tried to reproduce the results on the Adobe 1k Dataset and got exactly the same numbers when using the pretrained model. Very good job with that :)

I also tried to train the model from scratch, but did not succeed yet. Do you have any tips?

What I got so far:

What it should look like:

As you can see, your model produces much sharper results.

My training procedure:

Find all x/y coordinate pairs which have alpha values between 0 and 1.

Pick a random x/y pair

Crop a 320x320, 480x480 or 640x640 foreground/alpha image centered on that pixel pair

Resize them to 320x320

Generate randomly dilated trimap from alpha using distance fields

Choose a random background image and resize it to 320x320

Blend foreground and background image

Train network with Adam optimizer, learning rate 0.01 and otherwise default parameters for 90000 batches of size 16, decaying the learning rate by a factor of 10 at the 60000th and 78000th batch

L1 loss on alpha and compositional L1 loss, both on unknown image region only

Model:

net = hlmobilenetv2( pretrained=True, freeze_bn=True, output_stride=32, apply_aspp=True, conv_operator='std_conv', decoder='indexnet', decoder_kernel_size=5, indexnet='depthwise', index_mode='m2o', use_nonlinear=True, use_context=True )

I've also tried:

training for 200000 batches instead of just 90000

L2 loss instead of L1 loss

only alpha loss

only compositional loss

pretrained=False

freeze_bn=True

I am not sure about first cropping and then resizing, as described in Deep Image Matting, because every batch it produces a few trimaps which have 100% unknown region. Also, it is impossible to crop a 640x640 image from some alpha mattes because they don't have unknown pixels to center the cropped region on.
opened by 983 20
How to generate Trimap?

Hi, Thanks for the implementation, results are really good, and look similar to the paper indeed.

Any suggestion on how to generate Trimap masks? Also, is there a plan to release training code?

opened by ofirkris 6
about the evaluation code

@poppinace hi Thank you for your great work You mentioned you have implemented a python version of the evaluation_code. But I can't find it in the folder. Would you mind to tell me where is it?

opened by wrrJasmine 5
Number of parameters inconsistent with the paper?

Hello! Thanks for your sharing this awesome repository! When I try your get_model_summary function in the demo file, with the default model (m2o DIN nonlinear+context) I find the parameter size is 5,953,515. According to your paper, the parameters should be about 8.15M, right? And I notice that when setting use_nonlinear=False, the parameters and GFLOPs are the same with the data on your paper (with or without context). Do you have a clue about this discrepancy?

Here's what the summary function returns: Total Parameters: 5,953,515

Total Multiply Adds (For Convolution and Linear Layers only): 5.189814209938049 GFLOPs

Number of Layers Conv2d : 109 layers BatchNorm2d : 88 layers ReLU6 : 71 layers DepthwiseM2OIndexBlock : 5 layers InvertedResidual : 17 layers _ASPPModule : 4 layers AdaptiveAvgPool2d : 1 layers Dropout : 1 layers ASPP : 1 layers IndexedUpsamlping : 7 layers

opened by kungtalon 4
Train Own Dataset

hello,

i want to train my own dataset to generate indexnet_matting.pth.tar .

what do i have to do ? i see first trimaps are needed, do you have a complete matting project that generates trimaps then runs training ?

or how do i first generate these trimaps ? Thanks

opened by cmdrootaccess 3
training from scratch
Thanks for great work~ I am trying to training your model from scratch(only use pretrained mobilenet weights). I encountered two problems:

The first problem is shown by the red arrow: the alpha value of the unkonwn region is not large enough.

The second problem is shown by the blue arrow: outside the unknown region, there are always white scattered dots.

For the first problem, I think my training epochs is not enough (just trained to the 6th epoch). For the second problem, I am very confused and have no ideas. Do you have any suggestions on these two problems?
opened by liminn 3
Questions about B11 model

Thank you for your great work!

I have two questions regarding your B11 model (Unet). (1) It seems that your code uses both conv and pooling for downsampling, which maybe a typo? So which downsampling module do you use? (2) Is the crop size 320 or 321 in your training of B11?

opened by kfeng123 2
How to solve this error? When I run this net to predict an image，I have GPU，But I want not to use it.

RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu Traceback: File "/data/User/ch/CNN/view/image_app.py", line 80, in matting_view alpha = matting.predict(image, trimap) File "/root/anaconda3/lib/python3.7/site-packages/graphics/function/Matting.py", line 81, in predict outputs = self.net(inputs).squeeze().cpu().data.numpy() File "/root/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/root/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward "them on device: {}".format(self.src_device_obj, t.device))

opened by CachCheng 2
errors when use model.train()

Traceback (most recent call last): File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1741, in main() File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1735, in main globals = debugger.run(setup['file'], None, None, is_module) File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1135, in run pydev_imports.execfile(file, globals, locals) # execute the script File "/Applications/PyCharm CE.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile exec(compile(contents+"\n", file, 'exec'), glob, loc) File "/Users/CNN/IndexNet-master/algorithm/demo.py", line 36, in detector.train() File "/Users/CNN/IndexNet-master/algorithm/train/train.py", line 100, in train outputs = self.net(image).squeeze().cpu().data.numpy() File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call result = self.forward(*input, **kwargs) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 140, in forward return self.module(*inputs, **kwargs) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call result = self.forward(*input, **kwargs) File "/Users/CNN/IndexNet-master/algorithm/network/hlmobilenetv2.py", line 1134, in forward l = self.dconv_pp(l7) # 160x10x10 File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call result = self.forward(*input, **kwargs) File "/Users/CNN/IndexNet-master/algorithm/network/hlaspp.py", line 139, in forward x5 = self.global_avg_pool(x) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call result = self.forward(*input, **kwargs) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward input = module(input) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call result = self.forward(*input, **kwargs) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 83, in forward exponential_average_factor, self.eps) File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1693, in batch_norm raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size)) ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

opened by CachCheng 2
The file pretrained/indexnet_matting.pth.tar is corrupted

The file indexnet_matting.pth.tar downloaded from github is not correct which means I cannot unzip the tar file. Could you kindly reupdate your pretrained model of the indexnet as the pth.tar file is corrupted right now?

opened by tianmingdu 2
Correct the relative path and switch to the CPU.

Question: when I run demo.py https://github.com/poppinace/indexnet_matting/blob/867720dee1bc74212ec77bf2ad1c43af3bbc1c8f/scripts/demo.py#L45 Error will be reported when loading the model. Reason The relative path of pretrained and example is wrong.

opened by leijue222 1
How can I train the DIM?

Hello, thank you for your great job. I wonder how can I train DIM, since I can only find the train.sh for IndexNet Matting. Can you provide me with the sh file for DIM? Thank you.

opened by caihaunqai 0
freeze_bn seems to be an invalid option

Dear author,

I am trying to read and reproduce your codes, but I found some possible issue with batch normalization.

In current code, you define a freeze_bn() function to change all batch normalization layers to eval mode, like

https://github.com/poppinace/indexnet_matting/blob/4beb06a47db2eecca87b8003a11f0b268506cea3/scripts/hlmobilenetv2.py#L824

But you neither rewrite the train function of nn.Module nor call this function every time before the training cycle.

This means when train() function call net.train(), these BN layers becomes training mode again and freeze_bn actually takes no effect and all training is conducted under with BN enabled. Is this right?

opened by wangruohui 3
Possible reasons for different performance between DIM Re-implementation and original implementation

Thanks for your great work! When I'm trying to reproduce the result of DIM(Deep Image Matting), I found that using vgg model without BN and setting batch-size to 1 will give the same or even better performance, comparing to the results reported by the original DIM paper.

So I think that fewer training data is not supposed to be the reason for different performance.

opened by VitoChien 1
error in loading pre-trained model to train on multi-gpu
I am trying to load the pretrained model to fine tune on multi-gpu setting. However, I am getting an error message . Here is my code:

checkpoint = torch.load(args.restore_from) pretrained_dict = OrderedDict() for key, value in checkpoint['state_dict'].items(): if 'module' in key: key = key[7:] pretrained_dict[key] = value net.load_state_dict(pretrained_dict)

The error message:

RuntimeError: Error(s) in loading state_dict for hlMobileNetV2UNetDecoderIndexLearning: Missing key(s) in state_dict: "layer0.1._tmp_running_mean", "layer0.1._tmp_running_var", "layer0.1._running_iter", "layer1.0.conv.1._tmp_running_mean", "layer1.0.conv.1._tmp_running_var", "layer1.0.conv.1._running_iter", "layer1.0.conv.4._tmp_running_mean", "layer1.0.conv.4._tmp_running_var", "layer1.0.conv.4._running_iter", "layer2.0.conv.1._tmp_running_mean", "layer2.0.conv.1._tmp_running_var", "layer2.0.conv.1._running_iter", "layer2.0.conv.4._tmp_running_mean", "layer2.0.conv.4._tmp_running_var", "layer2.0.conv.4._running_iter", "layer2.0.conv.7._tmp_running_mean", "layer2.0.conv.7._tmp_running_var", "layer2.0.conv.7._running_iter", "layer2.1.conv.1._tmp_running_mean", "layer2.1.conv.1._tmp_running_var", "layer2.1.conv.1._running_iter", "layer2.1.conv.4._tmp_running_mean", "layer2.1.conv.4._tmp_running_var", "layer2.1.conv.4._running_iter", "layer2.1.conv.7._tmp_running_mean", "layer2.1.conv.7._tmp_running_var", "layer2.1.conv.7._running_iter", "layer3.0.conv.1._tmp_running_mean", "layer3.0.conv.1._tmp_running_var", "layer3.0.conv.1._running_iter", "layer3.0.conv.4._tmp_running_mean", "layer3.0.conv.4._tmp_running_var", "layer3.0.conv.4._running_iter", "layer3.0.conv.7._tmp_running_mean", "layer3.0.conv.7._tmp_running_var", "layer3.0.conv.7._running_iter", "layer3.1.conv.1._tmp_running_mean", "layer3.1.conv.1._tmp_running_var", "layer3.1.conv.1._running_iter", "layer3.1.conv.4._tmp_running_mean", "layer3.1.conv.4._tmp_running_var", "layer3.1.conv.4._running_iter", "layer3.1.conv.7._tmp_running_mean", "layer3.1.conv.7._tmp_running_var", "layer3.1.conv.7._running_iter", "layer3.2.conv.1._tmp_running_mean", "layer3.2.conv.1._tmp_running_var", "layer3.2.conv.1._running_iter", "layer3.2.conv.4._tmp_running_mean", "layer3.2.conv.4._tmp_running_var", "layer3.2.conv.4._running_iter", "layer3.2.conv.7._tmp_running_mean", "layer3.2.conv.7._tmp_running_var", "layer3.2.conv.7._running_iter", "layer4.0.conv.1._tmp_running_mean", "layer4.0.conv.1._tmp_running_var", "layer4.0.conv.1._running_iter", "layer4.0.conv.4._tmp_running_mean", "layer4.0.conv.4._tmp_running_var", "layer4.0.conv.4._running_iter", "layer4.0.conv.7._tmp_running_mean", "layer4.0.conv.7._tmp_running_var", "layer4.0.conv.7._running_iter", "layer4.1.conv.1._tmp_running_mean", "layer4.1.conv.1._tmp_running_var", "layer4.1.conv.1._running_iter", "layer4.1.conv.4._tmp_running_mean", "layer4.1.conv.4._tmp_running_var", "layer4.1.conv.4._running_iter", "layer4.1.conv.7._tmp_running_mean", "layer4.1.conv.7._tmp_running_var", "layer4.1.conv.7._running_iter", "layer4.2.conv.1._tmp_running_mean", "layer4.2.conv.1._tmp_running_var", "layer4.2.conv.1._running_iter", "layer4.2.conv.4._tmp_running_mean", "layer4.2.conv.4._tmp_running_var", "layer4.2.conv.4._running_iter", "layer4.2.conv.7._tmp_running_mean", "layer4.2.conv.7._tmp_running_var", "layer4.2.conv.7._running_iter", "layer4.3.conv.1._tmp_running_mean", "layer4.3.conv.1._tmp_running_var", "layer4.3.conv.1._running_iter", "layer4.3.conv.4._tmp_running_mean", "layer4.3.conv.4._tmp_running_var", "layer4.3.conv.4._running_iter", "layer4.3.conv.7._tmp_running_mean", "layer4.3.conv.7._tmp_running_var", "layer4.3.conv.7._running_iter", "layer5.0.conv.1._tmp_running_mean", "layer5.0.conv.1._tmp_running_var", "layer5.0.conv.1._running_iter", "layer5.0.conv.4._tmp_running_mean", "layer5.0.conv.4._tmp_running_var", "layer5.0.conv.4._running_iter", "layer5.0.conv.7._tmp_running_mean", "layer5.0.conv.7._tmp_running_var", "layer5.0.conv.7._running_iter", "layer5.1.conv.1._tmp_running_mean", "layer5.1.conv.1._tmp_running_var", "layer5.1.conv.1._running_iter", "layer5.1.conv.4._tmp_running_mean", "layer5.1.conv.4._tmp_running_var", "layer5.1.conv.4._running_iter", "layer5.1.conv.7._tmp_running_mean", "layer5.1.conv.7._tmp_running_var", "layer5.1.conv.7._running_iter", "layer5.2.conv.1._tmp_running_mean", "layer5.2.conv.1._tmp_running_var", "layer5.2.conv.1._running_iter", "layer5.2.conv.4._tmp_running_mean", "layer5.2.conv.4._tmp_running_var", "layer5.2.conv.4._running_iter", "layer5.2.conv.7._tmp_running_mean", "layer5.2.conv.7._tmp_running_var", "layer5.2.conv.7._running_iter", "layer6.0.conv.1._tmp_running_mean", "layer6.0.conv.1._tmp_running_var", "layer6.0.conv.1._running_iter", "layer6.0.conv.4._tmp_running_mean", "layer6.0.conv.4._tmp_running_var", "layer6.0.conv.4._running_iter", "layer6.0.conv.7._tmp_running_mean", "layer6.0.conv.7._tmp_running_var", "layer6.0.conv.7._running_iter", "layer6.1.conv.1._tmp_running_mean", "layer6.1.conv.1._tmp_running_var", "layer6.1.conv.1._running_iter", "layer6.1.conv.4._tmp_running_mean", "layer6.1.conv.4._tmp_running_var", "layer6.1.conv.4._running_iter", "layer6.1.conv.7._tmp_running_mean", "layer6.1.conv.7._tmp_running_var", "layer6.1.conv.7._running_iter", "layer6.2.conv.1._tmp_running_mean", "layer6.2.conv.1._tmp_running_var", "layer6.2.conv.1._running_iter", "layer6.2.conv.4._tmp_running_mean", "layer6.2.conv.4._tmp_running_var", "layer6.2.conv.4._running_iter", "layer6.2.conv.7._tmp_running_mean", "layer6.2.conv.7._tmp_running_var", "layer6.2.conv.7._running_iter", "layer7.0.conv.1._tmp_running_mean", "layer7.0.conv.1._tmp_running_var", "layer7.0.conv.1._running_iter", "layer7.0.conv.4._tmp_running_mean", "layer7.0.conv.4._tmp_running_var", "layer7.0.conv.4._running_iter", "layer7.0.conv.7._tmp_running_mean", "layer7.0.conv.7._tmp_running_var", "layer7.0.conv.7._running_iter", "index0.indexnet1.1._tmp_running_mean", "index0.indexnet1.1._tmp_running_var", "index0.indexnet1.1._running_iter", "index0.indexnet2.1._tmp_running_mean", "index0.indexnet2.1._tmp_running_var", "index0.indexnet2.1._running_iter", "index0.indexnet3.1._tmp_running_mean", "index0.indexnet3.1._tmp_running_var", "index0.indexnet3.1._running_iter", "index0.indexnet4.1._tmp_running_mean", "index0.indexnet4.1._tmp_running_var", "index0.indexnet4.1._running_iter", "index2.indexnet1.1._tmp_running_mean", "index2.indexnet1.1._tmp_running_var", "index2.indexnet1.1._running_iter", "index2.indexnet2.1._tmp_running_mean", "index2.indexnet2.1._tmp_running_var", "index2.indexnet2.1._running_iter", "index2.indexnet3.1._tmp_running_mean", "index2.indexnet3.1._tmp_running_var", "index2.indexnet3.1._running_iter", "index2.indexnet4.1._tmp_running_mean", "index2.indexnet4.1._tmp_running_var", "index2.indexnet4.1._running_iter", "index3.indexnet1.1._tmp_running_mean", "index3.indexnet1.1._tmp_running_var", "index3.indexnet1.1._running_iter", "index3.indexnet2.1._tmp_running_mean", "index3.indexnet2.1._tmp_running_var", "index3.indexnet2.1._running_iter", "index3.indexnet3.1._tmp_running_mean", "index3.indexnet3.1._tmp_running_var", "index3.indexnet3.1._running_iter", "index3.indexnet4.1._tmp_running_mean", "index3.indexnet4.1._tmp_running_var", "index3.indexnet4.1._running_iter", "index4.indexnet1.1._tmp_running_mean", "index4.indexnet1.1._tmp_running_var", "index4.indexnet1.1._running_iter", "index4.indexnet2.1._tmp_running_mean", "index4.indexnet2.1._tmp_running_var", "index4.indexnet2.1._running_iter", "index4.indexnet3.1._tmp_running_mean", "index4.indexnet3.1._tmp_running_var", "index4.indexnet3.1._running_iter", "index4.indexnet4.1._tmp_running_mean", "index4.indexnet4.1._tmp_running_var", "index4.indexnet4.1._running_iter", "index6.indexnet1.1._tmp_running_mean", "index6.indexnet1.1._tmp_running_var", "index6.indexnet1.1._running_iter", "index6.indexnet2.1._tmp_running_mean", "index6.indexnet2.1._tmp_running_var", "index6.indexnet2.1._running_iter", "index6.indexnet3.1._tmp_running_mean", "index6.indexnet3.1._tmp_running_var", "index6.indexnet3.1._running_iter", "index6.indexnet4.1._tmp_running_mean", "index6.indexnet4.1._tmp_running_var", "index6.indexnet4.1._running_iter", "dconv_pp.aspp1.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp1.atrous_conv.1._tmp_running_var", "dconv_pp.aspp1.atrous_conv.1._running_iter", "dconv_pp.aspp2.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp2.atrous_conv.1._tmp_running_var", "dconv_pp.aspp2.atrous_conv.1._running_iter", "dconv_pp.aspp2.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp2.atrous_conv.4._tmp_running_var", "dconv_pp.aspp2.atrous_conv.4._running_iter", "dconv_pp.aspp3.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp3.atrous_conv.1._tmp_running_var", "dconv_pp.aspp3.atrous_conv.1._running_iter", "dconv_pp.aspp3.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp3.atrous_conv.4._tmp_running_var", "dconv_pp.aspp3.atrous_conv.4._running_iter", "dconv_pp.aspp4.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp4.atrous_conv.1._tmp_running_var", "dconv_pp.aspp4.atrous_conv.1._running_iter", "dconv_pp.aspp4.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp4.atrous_conv.4._tmp_running_var", "dconv_pp.aspp4.atrous_conv.4._running_iter", "dconv_pp.global_avg_pool.2._tmp_running_mean", "dconv_pp.global_avg_pool.2._tmp_running_var", "dconv_pp.global_avg_pool.2._running_iter", "dconv_pp.bottleneck_conv.1._tmp_running_mean", "dconv_pp.bottleneck_conv.1._tmp_running_var", "dconv_pp.bottleneck_conv.1._running_iter", "decoder_layer6.dconv.1._tmp_running_mean", "decoder_layer6.dconv.1._tmp_running_var", "decoder_layer6.dconv.1._running_iter", "decoder_layer5.dconv.1._tmp_running_mean", "decoder_layer5.dconv.1._tmp_running_var", "decoder_layer5.dconv.1._running_iter", "decoder_layer4.dconv.1._tmp_running_mean", "decoder_layer4.dconv.1._tmp_running_var", "decoder_layer4.dconv.1._running_iter", "decoder_layer3.dconv.1._tmp_running_mean", "decoder_layer3.dconv.1._tmp_running_var", "decoder_layer3.dconv.1._running_iter", "decoder_layer2.dconv.1._tmp_running_mean", "decoder_layer2.dconv.1._tmp_running_var", "decoder_layer2.dconv.1._running_iter", "decoder_layer1.dconv.1._tmp_running_mean", "decoder_layer1.dconv.1._tmp_running_var", "decoder_layer1.dconv.1._running_iter", "decoder_layer0.dconv.1._tmp_running_mean", "decoder_layer0.dconv.1._tmp_running_var", "decoder_layer0.dconv.1._running_iter", "pred.0.1._tmp_running_mean", "pred.0.1._tmp_running_var", "pred.0.1._running_iter".

Any idea on how to resolve this? Thanks
opened by yxt132 3

Owner

Hao Lu

I am currently an Associate Professor with Huazhong University of Science and Technology, China.

GitHub

Only a Matter of Style: Age Transformation Using a Style-Based Regression Model

Only a Matter of Style: Age Transformation Using a Style-Based Regression Model The task of age transformation illustrates the change of an individual

444 Dec 30, 2022

[IJCAI'21] Deep Automatic Natural Image Matting

Deep Automatic Natural Image Matting [IJCAI-21] This is the official repository of the paper Deep Automatic Natural Image Matting. Introduction | Netw

316 Jan 6, 2023

Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

19 Jun 7, 2022

A library for building and serving multi-node distributed faiss indices.

About Distributed faiss index service. A lightweight library that lets you work with FAISS indexes which don't fit into a single server memory. It fol

170 Dec 30, 2022

On-device speech-to-index engine powered by deep learning.

30 Nov 24, 2022

PyMatting: A Python Library for Alpha Matting

Given an input image and a hand-drawn trimap (top row), alpha matting estimates the alpha channel of a foreground object which can then be composed onto a different background (bottom row).

1.4k Dec 30, 2022

Github project for Attention-guided Temporal Coherent Video Object Matting.

Attention-guided Temporal Coherent Video Object Matting This is the Github project for our paper Attention-guided Temporal Coherent Video Object Matti

71 Dec 19, 2022

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

6.5k Jan 4, 2023

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

2 Aug 21, 2022

MODNet: Trimap-Free Portrait Matting in Real Time

MODNet is a model for real-time portrait matting with only RGB image input.

2.8k Dec 30, 2022

Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires captur

6.1k Jan 3, 2023

Video Matting Refinement For Python

Video-matting refinement Library (use pip to install) scikit-image numpy av matplotlib Run Static background python path_to_video.mp4 Moving backgroun

3 Jan 11, 2022

Rethinking Portrait Matting with Privacy Preserving

Rethinking Portrait Matting with Privacy Preserving This is the official repository of the paper Rethinking Portrait Matting with Privacy Preserving.

184 Jan 3, 2023

Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

Deep Image Search - AI-Based Image Search Engine Deep Image Search is an AI-based image search engine that includes deep transfer learning features Ex

139 Jan 1, 2023

A python script to lookup Passport Index Dataset

visa-cli A python script to lookup Passport Index Dataset Installation pip install visa-cli Usage usage: visa-cli [-h] [-d DESTINATION_COUNTRY] [-f]

16 Oct 18, 2022

Hand gesture recognition based whiteboard that allows you to write on live webcam. This is the first version and has features like 4 different colors, eraser and a recording option that records your session and saves it in a "recordings" folder. Use index finger to draw and two or more fingers to move around and select items. Future version will contain more functionalities like changeable thickness, color palette, integration with zoom and google meet etc.

hand-write Hand gesture recognition based whiteboard that allows you to write on live webcam. This is the first version and has features like 4 differ

27 Dec 16, 2022

This is a virtual picture dragging application. Users may virtually slide photos across the screen. The distance between the index and middle fingers determines the movement. Smaller distances indicate click and motion, whereas bigger distances indicate only hand movement.

Virtual_Image_Dragger This is a virtual picture dragging application. Users may virtually slide photos across the screen. The distance between the ind

17 Dec 17, 2022

A set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI.

Overview This is a set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI. Make TFRecords To run t

8 Nov 1, 2022

Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases.

Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases. Ivy wraps the functional APIs of existing frameworks. Framework-agnostic functions, libraries and layers can then be written using Ivy, with simultaneous support for all frameworks. Ivy currently supports Jax, TensorFlow, PyTorch, MXNet and Numpy. Check out the docs for more info!

8.2k Jan 2, 2023

Indices Matter: Learning to Index for Deep Image Matting

Related tags

Overview

IndexNet Matting

Updates

Highlights

Installation

A Quick Demo

Prepare Your Data

Inference

Training

Citation

Permission and Disclaimer

Comments

What I got so far:

What it should look like:

Here's what the summary function returns: Total Parameters: 5,953,515

Total Multiply Adds (For Convolution and Linear Layers only): 5.189814209938049 GFLOPs

Owner

Hao Lu

Only a Matter of Style: Age Transformation Using a Style-Based Regression Model

[IJCAI'21] Deep Automatic Natural Image Matting

Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

A library for building and serving multi-node distributed faiss indices.

On-device speech-to-index engine powered by deep learning.

PyMatting: A Python Library for Alpha Matting

Github project for Attention-guided Temporal Coherent Video Object Matting.

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

MODNet: Trimap-Free Portrait Matting in Real Time

Real-Time High-Resolution Background Matting

Video Matting Refinement For Python

Rethinking Portrait Matting with Privacy Preserving

Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

A python script to lookup Passport Index Dataset

This is a virtual picture dragging application. Users may virtually slide photos across the screen. The distance between the index and middle fingers determines the movement. Smaller distances indicate click and motion, whereas bigger distances indicate only hand movement.

A set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI.

Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases.