TensorFlow Implementation of Pixel Transposed Convolutional Networks (PixelTCN and PixelTCL)

Overview

Pixel Transposed Convolutional Networks

Created by Hongyang Gao, Hao Yuan, Zhengyang Wang and Shuiwang Ji at Texas A&M University.

Introduction

The pixel transposed convolutional layer (PixelTCL) is a more effective way to perform up-sampling than the standard transposed convolutional layer.

Detailed information about PixelTCL is provided in the [arXiv tech report](https://arxiv.org/abs/1705.06820).

Citation

If you use this code, please cite our paper:

@article{gao2017pixel,
  title={Pixel Transposed Convolutional Networks},
  author={Hongyang Gao and Hao Yuan and Zhengyang Wang and Shuiwang Ji},
  journal={arXiv preprint arXiv:1705.06820},
  year={2017}
}

Results

Semantic segmentation


Comparison of semantic segmentation results. The first and second rows are input images and ground-truth labels, respectively. The third and fourth rows are the results of using regular transposed convolution and our proposed pixel transposed convolution, respectively.

Generate real images (VAE)


Sample face images generated by VAEs trained on the CelebA dataset. The first two rows are images generated by a standard VAE with transposed convolutional layers for up-sampling. The last two rows are images generated by the same VAE model, but using PixelTCL for up-sampling in the generator network.

System requirements

Programming language

Python 3.5+

Python Packages

tensorflow (CPU) or tensorflow-gpu (GPU), numpy, h5py, progressbar, PIL, scipy
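
These packages can typically be installed with pip; note that on PyPI the PIL import is provided by the Pillow package, and the exact progress bar package name (progressbar vs. progressbar2) may depend on your environment:

pip install numpy h5py progressbar Pillow scipy tensorflow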

Prepare data

In this project, we provide a set of sample datasets for training, validation, and testing. If you want to train on other data such as PASCAL, prepare the h5 files as required; utils/h5_utils.py can be used to generate them.
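
For reference, the sketch below shows how such an h5 file might be generated with h5py. The dataset keys ('X' for images, 'Y' for labels) and the shapes are illustrative assumptions here, not the repository's confirmed layout; check utils/h5_utils.py for the exact format the data loader expects.

    import h5py
    import numpy as np

    # Hypothetical example: 10 RGB images of size 256x256 with per-pixel labels.
    # The keys 'X' and 'Y' are assumptions; match whatever utils/h5_utils.py produces.
    images = np.random.rand(10, 256, 256, 3).astype(np.float32)  # NHWC, as TensorFlow expects
    labels = np.random.randint(0, 2, size=(10, 256, 256)).astype(np.uint8)

    with h5py.File('training.h5', 'w') as f:
        f.create_dataset('X', data=images)
        f.create_dataset('Y', data=labels)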

Configure the network

All network hyperparameters are configured in main.py.

Training

max_step: total number of training steps

test_step: interval (in steps) at which to run a mini test or validation

save_step: interval (in steps) at which to save the model

summary_step: interval (in steps) at which to save the summary

Data

data_dir: data directory

train_data: h5 file for training

valid_data: h5 file for validation

test_data: h5 file for testing

batch: batch size

channel: input image channel number

height, width: height and width of input image

Debug

logdir: where to store logs

modeldir: where to store saved models

sampledir: where to store predicted samples; please add a / at the end for convenience

model_name: the name prefix of saved models

reload_step: the step from which to resume training

test_step: which checkpoint step to use for testing or prediction

random_seed: random seed for TensorFlow

Network architecture

network_depth: depth of the U-Net, including the bottom layer

class_num: number of output classes, usually the number of classes plus one for background

start_channel_num: number of output channels for the first convolutional layer

conv_name: which convolutional layer to use in the decoder. We provide conv2d for the standard convolutional layer and ipixel_cl for the input pixel convolutional layer proposed in our paper.

deconv_name: which up-sampling layer to use in the decoder. We provide deconv for the standard transposed convolutional layer, ipixel_dcl for the input pixel transposed convolutional layer, and pixel_dcl for the pixel transposed convolutional layer proposed in our paper.
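
For orientation, these options are defined with tf.app.flags in main.py (TensorFlow 1.x). A minimal sketch of the architecture-related entries might look like the following; the values shown are illustrative, not the repository defaults:

    import tensorflow as tf

    flags = tf.app.flags
    # network architecture (illustrative values; see main.py for the real defaults)
    flags.DEFINE_integer('network_depth', 5, 'network depth for U-Net')
    flags.DEFINE_integer('class_num', 21, 'output class number (classes + background)')
    flags.DEFINE_integer('start_channel_num', 64, 'output channels of the first conv layer')
    flags.DEFINE_string('conv_name', 'conv2d', 'conv2d or ipixel_cl')
    flags.DEFINE_string('deconv_name', 'pixel_dcl', 'deconv, ipixel_dcl, or pixel_dcl')
    conf = flags.FLAGS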

Training and Testing

Start training

After configuring the network, we can start training. Run

python main.py

The training of a U-Net for semantic segmentation will start.

Training process visualization

We employ TensorBoard to visualize the training process:

tensorboard --logdir=logdir/

The segmentation results, including training and validation accuracies, and the prediction outputs are all available in TensorBoard.

Testing and prediction

Select a good checkpoint to test your model, based on validation performance or other measures.

Set test_step in main.py to the checkpoint step you want to test, then run

python main.py --action=test

The final output includes accuracy and mean_iou.

If you want to make some predictions, run

python main.py --action=predict

The predicted segmentation results, rendered in color, will be saved to the sampledir set in main.py.

Use PixelDCL in other models

If you want to use the pixel transposed convolutional layer in other models, just copy the file

utils/pixel_dcn.py

and use it in your model:


from pixel_dcn import pixel_dcl, ipixel_dcl, ipixel_cl


outputs = pixel_dcl(inputs, out_num, kernel_size, scope)

Currently, this version only supports up-sampling by a factor of 2, e.g., from 2x2 to 4x4. We may provide a more flexible version in the future.
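
As a concrete usage sketch (TensorFlow 1.x graph mode; the input shape here is made up for illustration, and the full signature follows pixel_dcn.py):

    import tensorflow as tf
    from pixel_dcn import pixel_dcl

    # Hypothetical 8x8 feature map with 64 channels, batch size 4, NHWC layout.
    inputs = tf.placeholder(tf.float32, [4, 8, 8, 64])

    # Up-samples by a factor of 2: the output has shape (4, 16, 16, 32).
    # Full signature: pixel_dcl(inputs, out_num, kernel_size, scope,
    #                           activation_fn=tf.nn.relu, d_format='NHWC')
    outputs = pixel_dcl(inputs, 32, [3, 3], scope='pixel_dcl_up1')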

Comments
  • Is there randomness when predicting the Segmentation Mask

    Hi @HongyangGao , Help me please!

    So I trained a fire segmentation network (the configuration is pasted below); the loss decreased from 0.72 to 0.005, which seems good enough to expect something cool from --option=predict.

    However, every time I run main.py --option=predict I get different predictions, even though the test step (i.e., the weights) is the same. [Prediction images from the two runs were attached here: 2, 2_cv, img__2, fire117_gt.]

    If you compare them, Run 1 does well at separating fire, but when I run a second time I get the results of Run 2, which is not what I want.

    My question is why I am getting two different results for the same input. I also saw that a random_seed is used in the configure() method; is this variable the cause?

    This is my configuration for both Run 1 and Run 2:

    # training
    flags = tf.app.flags
    flags.DEFINE_integer('max_step', 60000, '# of step for training')
    flags.DEFINE_integer('test_interval', 100, '# of interval to test a model')
    flags.DEFINE_integer('save_interval', 100, '# of interval to save model')
    flags.DEFINE_integer('summary_interval', 100, '# of step to save summary')
    flags.DEFINE_float('learning_rate', 1e-4, 'learning rate')
    # data
    flags.DEFINE_string('data_dir', r'E:\dataset\patchBasedFireDataset\BowDataset/', 'Name of data directory')
    flags.DEFINE_string('train_data', 'BowDatasettraining.h5', 'Training data')
    flags.DEFINE_string('valid_data', 'BowDatasetvalidation.h5', 'Validation data')
    flags.DEFINE_string('test_data', 'BowDatasettesting.h5', 'Testing data')
    flags.DEFINE_string('data_type', '2D', '2D data or 3D data')
    flags.DEFINE_integer('batch', 1, 'batch size')
    flags.DEFINE_integer('channel', 3, 'channel size')
    flags.DEFINE_integer('depth', 1, 'depth size')
    flags.DEFINE_integer('height', 256, 'height size')
    flags.DEFINE_integer('width', 256, 'width size')
    # Debug
    flags.DEFINE_string('logdir', './logdir', 'Log dir')
    flags.DEFINE_string('modeldir', './modeldir', 'Model dir')
    flags.DEFINE_string('sampledir', './samples/', 'Sample directory')
    flags.DEFINE_string('model_name', 'model', 'Model file name')
    flags.DEFINE_integer('reload_step', 95000, 'Reload step to continue training')
    flags.DEFINE_integer('test_step', 300000, 'Test or predict model at this step')
    flags.DEFINE_integer('random_seed', int(time.time()), 'random seed')
    # network architecture
    flags.DEFINE_integer('network_depth', 5, 'network depth for U-Net')
    flags.DEFINE_integer('class_num', 2, 'output class number')
    flags.DEFINE_integer('start_channel_num', 16, 'start number of outputs for the first conv layer')

    Thank you for your help, I do appreciate it!

    opened by Jumabek 8
  • Differences between PixelTCN and PixelShuffle?

    PixelShuffle refers to "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network" by Shi et al. (2016). Thanks in advance.

    opened by HolmesShuan 4
  • What should the loss look like?

    For a fire segmentation problem with only 2 classes, I am getting the following loss with batch_size=1 [loss plot attached as figure_1]. Is that normal? The reason for my suspicion is that I get very different results from neighboring checkpoints; for example, the weights at iteration 1540 give very different outputs from those at 1500.

    [Output images for fire117 at iterations 1540 and 1500 were attached here.]

    opened by Jumabek 4
  • 2D training example is not working?

    So I am trying to do 2D training using the provided example, but I am afraid I am doing it wrong. Could you please check my configuration?

    This is what my configuration looks like:

    # training
    flags = tf.app.flags
    flags.DEFINE_integer('max_step', 60, '# of step for training')
    flags.DEFINE_integer('test_interval', 100, '# of interval to test a model')
    flags.DEFINE_integer('save_interval', 2, '# of interval to save model')
    flags.DEFINE_integer('summary_interval', 100, '# of step to save summary')
    flags.DEFINE_float('learning_rate', 1e-3, 'learning rate')
    # data
    flags.DEFINE_string('data_dir', './dataset/', 'Name of data directory')
    flags.DEFINE_string('train_data', 'training.h5', 'Training data')
    flags.DEFINE_string('valid_data', 'validation.h5', 'Validation data')
    flags.DEFINE_string('test_data', 'testing.h5', 'Testing data')
    flags.DEFINE_string('data_type', '2D', '2D data or 3D data')
    flags.DEFINE_integer('batch', 2, 'batch size')
    flags.DEFINE_integer('channel', 3, 'channel size')
    flags.DEFINE_integer('depth', 1, 'depth size')
    flags.DEFINE_integer('height', 256, 'height size')
    flags.DEFINE_integer('width', 256, 'width size')
    # Debug
    flags.DEFINE_string('logdir', './logdir', 'Log dir')
    flags.DEFINE_string('modeldir', './modeldir', 'Model dir')
    flags.DEFINE_string('sampledir', './samples/', 'Sample directory')
    flags.DEFINE_string('model_name', 'model', 'Model file name')
    flags.DEFINE_integer('reload_step', 0, 'Reload step to continue training')
    flags.DEFINE_integer('test_step', 60, 'Test or predict model at this step')
    flags.DEFINE_integer('random_seed', int(time.time()), 'random seed')
    # network architecture
    flags.DEFINE_integer('network_depth', 5, 'network depth for U-Net')
    flags.DEFINE_integer('class_num', 2, 'output class number')
    flags.DEFINE_integer('start_channel_num', 3, 'start number of outputs for the first conv layer')

    This is the training log:

    ----training loss 0.439823
    ----training loss 0.685655
    ---->saving 2
    ----training loss 0.716789
    ----training loss 0.712096
    ---->saving 4
    ----training loss 0.683432
    ----training loss 0.523199
    ---->saving 6
    ----training loss 0.49661
    ----training loss 0.695421
    ---->saving 8
    ----training loss 0.356408
    ----training loss 0.653522
    ---->saving 10
    ----training loss 0.483054
    ----training loss 0.420301
    ---->saving 12
    ----training loss 0.442111
    ----training loss 0.685215
    ---->saving 14
    ----training loss 0.356957
    ----training loss 0.671164
    ---->saving 16
    ----training loss 0.594261
    ----training loss 0.472929
    ---->saving 18
    ----training loss 0.531491
    ----training loss 0.481012
    ---->saving 20
    ----training loss 0.348508
    ----training loss 0.599136
    ---->saving 22
    ----training loss 0.505479
    ----training loss 0.665116
    ---->saving 24
    ----training loss 0.703314
    ----training loss 0.580268
    ---->saving 26
    ----training loss 0.465344
    ----training loss 0.482133
    ---->saving 28
    ----training loss 0.333282
    ----training loss 0.454517
    ---->saving 30
    ----training loss 0.475901
    ----training loss 0.517551
    ---->saving 32
    ----training loss 0.410749
    ----training loss 0.619331
    ---->saving 34
    ----training loss 0.488531
    ----training loss 0.296446
    ---->saving 36
    ----training loss 0.642942
    ----training loss 0.59157
    ---->saving 38
    ----training loss 0.449556
    ----training loss 0.448444
    ---->saving 40
    ----training loss 0.421643
    ----training loss 0.516128
    ---->saving 42
    ----training loss 0.445137
    ----training loss 0.632879
    ---->saving 44
    ----training loss 0.631831
    ----training loss 0.427582
    ---->saving 46
    ----training loss 0.468836
    ----training loss 0.417944
    ---->saving 48
    ----training loss 0.505529
    ----training loss 0.662951
    ---->saving 50
    ----training loss 0.604587
    ----training loss 0.330602
    ---->saving 52
    ----training loss 0.444277
    ----training loss 0.393231
    ---->saving 54
    ----training loss 0.291731
    ----training loss 0.460036
    ---->saving 56
    ----training loss 0.550641
    ----training loss 0.62895
    ---->saving 58
    ----training loss 0.47633
    ----training loss 0.411844
    ---->saving 60
    
    opened by Jumabek 3
  • Questions about kernel_size

    Thanks for your contribution. I want to use pixel_dcl(inputs, out_num, kernel_size, scope, activation_fn=tf.nn.relu, d_format='NHWC') from pixel_dcn.py as the deconvolution layer. Can you explain how I should decide the kernel_size?

    opened by daixiaogang 3
  • How to obtain 3D example data

    Hi @HongyangGao, could you please share the code for obtaining the 3D data from the 10 VOC example images?

    I want to make a fire segmentation app using your source code, because other code does not support the Windows platform.

    But since I am new to segmentation, I am wondering whether I should use 2D or 3D data and networks.

    If not code, any reference articles would be fine too.

    Thank you, Jumabek

    opened by Jumabek 2
  • Error when saving predictions of PixelDCN

    Hi, thanks for the amazing work!

    I am trying the prediction example on the source code and, as usual, I have an error which I cannot solve (at least not yet). Could you please help me debug this?

    The error occurs in the following line: https://github.com/HongyangGao/PixelDCN/blob/master/network.py#L280

    Thank you.

    And this is the call stack:

    ---->predicting
    inputs.shape:  (1, 16, 256, 256, 1)
    ----->saving predictions
    Traceback (most recent call last):
      File "main.py", line 28, in <module>
        tf.app.run()
      File "C:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "main.py", line 22, in main
        getattr(model, args.option)()
      File "E:\code\FireSegmentation\PixelDCN\network.py", line 287, in predict
        str(index*prediction.shape[0]+i)+'.png')
      File "E:\code\FireSegmentation\PixelDCN\utils\img_utils.py", line 44, in imsave
        if k < 21:
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    
    opened by Jumabek 2
  • parameter question

    Hello, I want to replace transposed convolution with the pixel transposed convolution that you share. I used pixel_dcl from your pixel_dcn directly, but there is a parameter scope that I don't know what to fill in. What does this parameter relate to, and how should I fill it in?

    opened by watchmexiang 1
  • keep NHWC format in file

    The model reads the h5 file this way, as mentioned in my last pull request #42. Since the model reads the h5 file in NHWC format, as TensorFlow does, the width-height tuple should be inverted during h5 dataset creation. Now it works as expected, superseding my previous fix for non-square pictures.

    opened by FiLeonard 1
  • Update h5_util.py

    Columns and rows are switched by the image-to-NumPy conversion, and I ran into some errors. Since TensorFlow reads data as NHWC: if the h5 file should be kept in NHWC format, then instead of my changes the tuple should be flipped during h5 dataset creation; if the tuple should be read as (height, width), it has to be flipped in the resize operation.

    opened by FiLeonard 0
  • Cannot infer num from shape

    The code in question:

        with tf.variable_scope("upsampling_logits"):
            encoder_output_dcl = pixel_dcl(encoder_output, 256, [2, 2], scope='pixel_dcl_1',
                                           activation_fn=tf.nn.relu, d_format='NHWC')
            # net = pixel_dcl(encoder_output_dcl, 256, [2, 2], scope='pixel_dcl_2')
            net = tf.image.resize_bilinear(encoder_output_dcl, low_level_features_size,
                                           name='upsample_1')

    Training is normal, but eval raises an error.

    opened by serendipity999 0