Caffe implementation of Hu et al., Segmentation from Natural Language Expressions

Last update: Jul 27, 2021

Overview

Segmentation from Natural Language Expressions

This repository contains the Caffe reimplementation of the following paper:

R. Hu, M. Rohrbach, T. Darrell, "Segmentation from Natural Language Expressions," arXiv:1603.06180, 2016.

@article{hu2016segmentation,
  title={Segmentation from Natural Language Expressions},
  author={Hu, Ronghang and Rohrbach, Marcus and Darrell, Trevor},
  journal={arXiv preprint arXiv:1603.06180},
  year={2016}
}

Project Page: http://ronghanghu.com/text_objseg

Installation

Install Caffe, including its Python interface (pycaffe), following the Caffe installation instructions.
Download this repository or clone with Git, and then cd into the root directory of the repository.

Training and evaluation on ReferIt Dataset

Download dataset and VGG network

Download ReferIt dataset:

./referit/referit-dataset/download_referit_dataset.sh

Download the caffemodel containing the VGG-16 network parameters trained on the 1000 ImageNet classes.

Training

You may need to add the repository root directory to Python's module path:

export PYTHONPATH=/path/to/text_objseg_caffe/:$PYTHONPATH
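If you would rather not change the shell environment, a similar effect can be obtained from inside Python. The sketch below is only illustrative: `add_repo_to_path` is a hypothetical helper that is not part of this repository, and `/path/to/text_objseg_caffe` is the same placeholder used in the command above.

```python
import os
import sys


def add_repo_to_path(repo_root):
    """Prepend repo_root to sys.path once (hypothetical helper, not in the repo)."""
    repo_root = os.path.abspath(repo_root)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# Placeholder path: replace it with the location of your own checkout.
root = add_repo_to_path("/path/to/text_objseg_caffe")
print(root in sys.path)
```

Unlike the `export` line, this only affects the Python process that runs it, so it would have to be executed before the repository modules are imported.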

Build training batches for bounding boxes:

python referit/build_training_batches_det.py

Build training batches for segmentation:

python referit/build_training_batches_seg.py

Configure the config.py file in the directory det_model and train the language-based bounding box localization model:

python det_model/train_det_model.py

Configure the config.py file in the directory seg_low_res_model and train the low resolution language-based segmentation model (from the previous bounding box localization model):

python seg_low_res_model/train_low_res_model.py

Configure the config.py file in the directory seg_model and train the high resolution language-based segmentation model (from the previous low resolution segmentation model):

python seg_model/train_seg_model.py
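The training steps above have to run in this order, because each model starts from the result of the previous step. A minimal sketch of a driver script is shown below. The script paths are the ones listed above; `build_commands` and `run_pipeline` are hypothetical names that do not exist in the repository, and the sketch assumes that each `config.py` has already been edited and that `python` resolves to the interpreter with pycaffe installed.

```python
import subprocess

# Script paths as listed in the steps above, in the required order.
TRAINING_STEPS = [
    "referit/build_training_batches_det.py",
    "referit/build_training_batches_seg.py",
    "det_model/train_det_model.py",
    "seg_low_res_model/train_low_res_model.py",
    "seg_model/train_seg_model.py",
]


def build_commands(python="python"):
    """Return one command (as an argument list) per training step."""
    return [[python, script] for script in TRAINING_STEPS]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands. Run from the repository root.
run_pipeline(dry_run=True)
```

Calling `run_pipeline(dry_run=False)` from the repository root would execute the steps one after another and stop at the first failure.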

Evaluation

You may need to add the repository root directory to Python's module path:

export PYTHONPATH=/path/to/text_objseg_caffe/:$PYTHONPATH

Configure the test_config.py file in the directory seg_model and run evaluation for the high resolution language-based segmentation model:

python seg_model/test_seg_model.py

This should reproduce the results in the paper. You may also evaluate the language-based bounding box localization model:

python det_model/test_det_model.py

The results can be compared to this paper.
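This README does not say which metrics the test scripts print. Segmentation quality is commonly scored by the intersection-over-union (IoU) between the predicted and the ground-truth mask, so the sketch below shows that computation on a toy example. It is an illustration only, not the repository's evaluation code, and `mask_iou` is a hypothetical name.

```python
import numpy as np


def mask_iou(pred, gt):
    """Intersection-over-union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Convention chosen for this sketch: two empty masks match perfectly.
        return 1.0
    return float(np.logical_and(pred, gt).sum()) / float(union)


pred = np.array([[1, 1, 0], [0, 1, 0]])
gt = np.array([[1, 0, 0], [0, 1, 1]])
print(mask_iou(pred, gt))  # 2 shared pixels out of 4 in the union
```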

Demo

There is a demo that you can try! Run the demo in ./demo/text_objseg_demo.ipynb with Jupyter Notebook (IPython Notebook).

Comments

L2Normalize Layer?

Hi @Seth-Park. Thank you for the code. I was wondering which Caffe version you are using. The network structure uses an L2Normalize layer, which I cannot find in BVLC Caffe. Thanks!

opened by chenxi116 1

Cannot copy param 0 weights from layer 'fc6'

When I run train_det_model.py, I get this error: Cannot copy param 0 weights from layer 'fc6'; shape mismatch. Source param shape is 1024 512 3 3 (4718592); target param shape is 4096 25088 (102760448). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer. How can I solve this problem?
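The two shapes quoted in this error message can be checked by hand. The sketch below only reproduces the arithmetic. The interpretation is an assumption based on the shapes alone: the target looks like a fully connected fc6 over a 512 x 7 x 7 feature map, while the source looks like a 3 x 3 convolutional fc6 with 1024 outputs, which would suggest that the loaded caffemodel is a different VGG-16 variant from the one the network definition expects.

```python
from functools import reduce
from operator import mul


def num_params(shape):
    """Number of weights in a parameter blob of the given shape."""
    return reduce(mul, shape, 1)


source_shape = (1024, 512, 3, 3)  # shape reported for the saved caffemodel
target_shape = (4096, 25088)      # shape reported for the network definition

print(num_params(source_shape))  # 4718592, as in the error message
print(num_params(target_shape))  # 102760448, as in the error message
print(512 * 7 * 7)               # 25088, the input size of the target fc6
```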

opened by WangLanxiao 0

Unknown layer type: L2Normalize

Hi Seth: Thank you for your great work. When I train the detection model, I get the following error:

I0322 11:15:13.847193 15945 net.cpp:129] Top shape: 50 1000 (50000)
I0322 11:15:13.847195 15945 net.cpp:137] Memory required for data: 5770873800
I0322 11:15:13.847198 15945 layer_factory.hpp:77] Creating layer img_l2norm
F0322 11:15:13.847214 15945 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: L2Normalize (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, Tile, WindowData)
*** Check failure stack trace: ***
Aborted (core dumped)

I am using the official version of Caffe, which I compiled successfully with pycaffe and Anaconda, and I followed your steps, but I still get the error. Is there anything I am missing? My system is Ubuntu 16.04 LTS with NVIDIA CUDA 8.0 and a Titan Xp. I would appreciate any help.
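For context, the layer name `img_l2norm` and the type `L2Normalize` suggest that the missing layer rescales feature vectors to unit Euclidean length. The NumPy sketch below illustrates that operation only. The normalization axis and the `eps` guard are assumptions, `l2_normalize` is a hypothetical name, and the code is not a replacement for the Caffe layer, which still has to be compiled into the Caffe build.

```python
import numpy as np


def l2_normalize(x, axis=1, eps=1e-12):
    """Scale x to unit L2 norm along `axis` (axis and eps are assumptions)."""
    norm = np.sqrt(np.sum(np.square(x), axis=axis, keepdims=True))
    # eps avoids division by zero for all-zero vectors.
    return x / np.maximum(norm, eps)


features = np.array([[3.0, 4.0], [0.0, 2.0]])
normalized = l2_normalize(features)
print(normalized)
```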

opened by derkbreeze 4

Image input channel order

Hi @Seth-Park . It seems from https://github.com/Seth-Park/text_objseg_caffe/blob/master/seg_model/referit_data_provider.py#L47 that in the data-providing Python layer, the image is in RGB order. However, shouldn't image input to Caffe be in BGR order? Is there any channel-switching step that I missed? Thank you very much!

opened by chenxi116 2
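For reference, switching between RGB and BGR amounts to reversing the channel axis. The sketch below shows this for an image stored as a height x width x 3 array. Whether such a step is needed here depends on how the pretrained weights were produced, which this page does not state; `rgb_to_bgr` is a hypothetical helper and not a function from the repository.

```python
import numpy as np


def rgb_to_bgr(image):
    """Reverse the last (channel) axis of an H x W x 3 array."""
    return image[..., ::-1]


rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[..., 0] = 255  # pure red when channels are read in RGB order
bgr = rgb_to_bgr(rgb)
print(bgr[0, 0])  # red now sits in the last channel
```

Applying the same function twice returns the original channel order, so it also converts BGR back to RGB.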
