Code release for Hu et al., Segmentation from Natural Language Expressions, ECCV 2016

Overview

Segmentation from Natural Language Expressions

This repository contains the code for the following paper:

  • R. Hu, M. Rohrbach, T. Darrell, Segmentation from Natural Language Expressions. In ECCV, 2016. (PDF)
@inproceedings{hu2016segmentation,
  title={Segmentation from Natural Language Expressions},
  author={Hu, Ronghang and Rohrbach, Marcus and Darrell, Trevor},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2016}
}

Project Page: http://ronghanghu.com/text_objseg

Installation

  1. Install Google TensorFlow (v1.0.0 or higher) following the official installation instructions.
  2. Download this repository or clone with Git, and then cd into the root directory of the repository.
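In shell form, the two steps above might look like the following (the exact pip package name and cloning over HTTPS are assumptions; any TensorFlow release at or above v1.0.0 should work):

    # install TensorFlow v1.0.0 or higher (package name / version pin are an assumption)
    pip install "tensorflow-gpu>=1.0.0"   # or "tensorflow>=1.0.0" for CPU-only
    # clone this repository and cd into its root directory
    git clone https://github.com/ronghanghu/text_objseg.git
    cd text_objseg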

Demo

  1. Download the trained models:
    exp-referit/tfmodel/download_trained_models.sh.
  2. Run the language-based segmentation model demo in ./demo/text_objseg_demo.ipynb with Jupyter Notebook (IPython Notebook).
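Put together, the demo steps above amount to roughly the following commands run from the repository root (invoking the download script with bash, and having Jupyter already installed, are assumptions):

    # download the trained models
    bash exp-referit/tfmodel/download_trained_models.sh
    # open the demo notebook
    jupyter notebook demo/text_objseg_demo.ipynb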


Training and evaluation on ReferIt Dataset

Download dataset and VGG network

  1. Download the ReferIt dataset:
    exp-referit/referit-dataset/download_referit_dataset.sh.
  2. Download VGG-16 network parameters trained on ImageNet (1000 classes):
    models/convert_caffemodel/params/download_vgg_params.sh.
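For example, both downloads can be run from the repository root as follows (invoking the scripts with bash is an assumption; they may also be executable directly):

    # fetch the ReferIt dataset
    bash exp-referit/referit-dataset/download_referit_dataset.sh
    # fetch the ImageNet-trained VGG-16 parameters
    bash models/convert_caffemodel/params/download_vgg_params.sh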

Training

  1. You may need to add the repository root directory to Python's module path: export PYTHONPATH=.:$PYTHONPATH.
  2. Build training batches for bounding boxes:
    python exp-referit/build_training_batches_det.py.
  3. Build training batches for segmentation:
    python exp-referit/build_training_batches_seg.py.
  4. Select the GPU you want to use during training:
    export GPU_ID=<gpu id>. Use 0 for <gpu id> if you only have one GPU on your machine.
  5. Train the language-based bounding box localization model:
    python exp-referit/exp_train_referit_det.py $GPU_ID.
  6. Train the low resolution language-based segmentation model (from the previous bounding box localization model):
    python exp-referit/init_referit_seg_lowres_from_det.py && python exp-referit/exp_train_referit_seg_lowres.py $GPU_ID.
  7. Train the high resolution language-based segmentation model (from the previous low resolution segmentation model):
    python exp-referit/init_referit_seg_highres_from_lowres.py && python exp-referit/exp_train_referit_seg_highres.py $GPU_ID.
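For reference, steps 1-7 chained together look roughly like the shell session below (GPU id 0 is only an example; adjust it to your machine):

    export PYTHONPATH=.:$PYTHONPATH
    export GPU_ID=0
    # build training batches for bounding boxes and segmentation
    python exp-referit/build_training_batches_det.py
    python exp-referit/build_training_batches_seg.py
    # train: bounding box localization -> low-res segmentation -> high-res segmentation
    python exp-referit/exp_train_referit_det.py $GPU_ID
    python exp-referit/init_referit_seg_lowres_from_det.py
    python exp-referit/exp_train_referit_seg_lowres.py $GPU_ID
    python exp-referit/init_referit_seg_highres_from_lowres.py
    python exp-referit/exp_train_referit_seg_highres.py $GPU_ID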

Alternatively, you may skip the training procedure and download the trained models directly:
exp-referit/tfmodel/download_trained_models.sh.

Evaluation

  1. Select the GPU you want to use during testing: export GPU_ID=<gpu id>. Use 0 for <gpu id> if you only have one GPU on your machine. If you have not already done so, add the repository root directory to Python's module path: export PYTHONPATH=.:$PYTHONPATH.
  2. Run evaluation for the high resolution language-based segmentation model:
    python exp-referit/exp_test_referit_seg.py $GPU_ID
    This should reproduce the results in the paper.
  3. You may also evaluate the language-based bounding box localization model:
    python exp-referit/exp_test_referit_det.py $GPU_ID
    The results can be compared to this paper.
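A minimal evaluation session, assuming training has finished (or the trained models have been downloaded), might look like:

    export PYTHONPATH=.:$PYTHONPATH
    export GPU_ID=0
    # evaluate the high resolution segmentation model (should reproduce the paper's results)
    python exp-referit/exp_test_referit_seg.py $GPU_ID
    # optionally, evaluate the bounding box localization model
    python exp-referit/exp_test_referit_det.py $GPU_ID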
Comments
  • Understanding of spatial feature map

    https://github.com/ronghanghu/text_objseg/blob/be664d66c72319363691628958727124ea38cad8/models/processing_tools.py#L6

    Hi, may I ask why the number of spatial channels is eight rather than two? This seems to mismatch the claim in the paper:

    ...we obtain a w×h×(D_im +2) representation containing local image descriptors and spatial coordinates...

    Could you help clarify this, or am I missing something? Thank you.

    opened by czhang0528 0
  • about LSTM

    Hi ronghang, I am reading your excellent work "Segmentation from Natural Language Expressions", and I am wondering about the D_text = 1000 dimensional hidden state h obtained from the LSTM. It seems that you encode the word vectors with the LSTM, but is there a physical interpretation of the output h, and how does it help the segmentation task?

    Thank you very much for your time and consideration. I look forward to hearing from you soon.

    opened by YsSue 0
  • Step 8

    Hey ronghang, this is Ravi. I am really amazed by the work you are doing, especially this segmentation model using NLP, and I have a query about it. I am trying to train the low resolution model using your code and followed the instructions in the README. Since I just want to train the segmentation model, do I have to first train the bounding box model in step 7 as well, or is it possible to train the segmentation model without it? If you have time, please reply. Thanks

    opened by ravichoudhary123 0
  • Low accuracy of own trained model

    Hi ronghang

    I am trying to train the high resolution model using your code. I followed all the instructions in the README and did not change any parameters in the code, but the performance of the trained model is extremely low, only about 4.5% overall IoU. Is it possible that you updated the code afterward but the modified code was not uploaded to GitHub?

    Thanks

    opened by yichunk 8