Tree LSTM implementation in PyTorch

Overview

Tree-Structured Long Short-Term Memory Networks

This is a PyTorch implementation of Tree-LSTM as described in the paper Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks by Kai Sheng Tai, Richard Socher, and Christopher Manning. On the semantic similarity task using the SICK dataset, this implementation reaches:

  • Pearson's coefficient: 0.8492 and MSE: 0.2842 using hyperparameters --lr 0.010 --wd 0.0001 --optim adagrad --batchsize 25
  • Pearson's coefficient: 0.8674 and MSE: 0.2536 using hyperparameters --lr 0.025 --wd 0.0001 --optim adagrad --batchsize 25 --freeze_embed
  • Pearson's coefficient: 0.8676 and MSE: 0.2532 are the numbers reported in the original paper.
  • Known differences include the way the gradients are accumulated (normalized by batchsize or not).

Requirements

  • Python (tested on 3.6.5, should work on >=2.7)
  • Java >= 8 (for Stanford CoreNLP utilities)
  • Other dependencies are in requirements.txt. Note: this currently works with PyTorch 0.4.0; switch to the pytorch-v0.3.1 branch if you want to use PyTorch 0.3.1.

Usage

Before delving into how to run the code, here is a quick overview of the contents:

  • Use the script fetch_and_preprocess.sh to download the SICK dataset, the Stanford Parser and Stanford POS Tagger, and GloVe word vectors (Common Crawl 840B -- warning: this is a 2GB download!), and to preprocess the data, i.e. generate dependency parses using the Stanford Neural Network Dependency Parser.
  • main.py does the actual heavy lifting of training the model and testing it on the SICK dataset. For a list of all command-line arguments, have a look at config.py.
    • The first run caches GloVe embeddings for words in the SICK vocabulary. Later runs only read in this cache.
    • Logs and model checkpoints are saved to the checkpoints/ directory with the name specified by the command line argument --expname.

Next, here are the different ways to run the code to train a TreeLSTM model.

Local Python Environment

If you have a working Python3 environment, simply run the following sequence of steps:

- bash fetch_and_preprocess.sh
- pip install -r requirements.txt
- python main.py

Pure Docker Environment

If you want to use a Docker container, simply follow these steps:

- docker build -t treelstm .
- docker run -it treelstm bash
- bash fetch_and_preprocess.sh
- python main.py

Local Filesystem + Docker Environment

If you want to use a Docker container, but want to persist data and checkpoints in your local filesystem, simply follow these steps:

- bash fetch_and_preprocess.sh
- docker build -t treelstm .
- docker run -it --mount type=bind,source="$(pwd)",target="/root/treelstm.pytorch" treelstm bash
- python main.py

NOTE: Setting the environment variable OMP_NUM_THREADS=1 usually gives a speedup on the CPU. Use it like OMP_NUM_THREADS=1 python main.py. To run on a GPU, set the CUDA_VISIBLE_DEVICES environment variable instead. Usually, CUDA does not give much of a speedup here, since we are operating at a batchsize of 1.
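
The same thread limit can also be set from inside Python via PyTorch's own API; a minimal sketch (equivalent in effect to OMP_NUM_THREADS=1 for intra-op parallelism):

    import torch

    # Restrict PyTorch's intra-op parallelism to a single thread, which
    # usually helps when traversing trees one sample at a time on the CPU.
    torch.set_num_threads(1)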

Notes

  • (Apr 02, 2018) Added Dockerfile
  • (Apr 02, 2018) Now works on PyTorch 0.3.1 and Python 3.6, removed dependency on Python 2.7
  • (Nov 28, 2017) Added frozen embeddings, closed gap to paper.
  • (Nov 08, 2017) Refactored model to get 1.5x - 2x speedup.
  • (Oct 23, 2017) Now works with PyTorch 0.2.0.
  • (May 04, 2017) Added support for sparse tensors. Using the --sparse argument will enable sparse gradient updates for nn.Embedding, potentially reducing memory usage.
    • There are a couple of caveats, however: weight decay will not work in conjunction with sparsity, and results from the original paper might not be reproduced using sparse embeddings (see the sketch below).
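
Below is a minimal sketch of what the --sparse flag enables, not the repo's actual code; the names and dimensions are illustrative (the SICK vocabulary has 2412 words, and the GloVe Common Crawl 840B vectors are 300-dimensional):

    import torch
    import torch.nn as nn

    # sparse=True makes nn.Embedding produce sparse gradients.
    emb = nn.Embedding(num_embeddings=2412, embedding_dim=300, sparse=True)

    # Adagrad accepts sparse gradients, but weight decay must stay at 0:
    # PyTorch raises an error if weight_decay is combined with sparse gradients.
    optimizer = torch.optim.Adagrad(emb.parameters(), lr=0.025, weight_decay=0)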

Acknowledgements

Shout-out to Kai Sheng Tai for the original LuaTorch implementation, and to the PyTorch team for the fun library.

Contact

Riddhiman Dasgupta

This is my first PyTorch-based implementation, and it might contain bugs. Please let me know if you find any!

License

MIT

Comments
  • No such file or directory: data/sick/train/a.toks

    Hi, I need help! IOError: [Errno 2] No such file or directory: 'data/sick/train/a.toks' is thrown when I run python main.py with default parameters.

    Where does this file a.toks come from?

    opened by adrianhust 7
  • map_label_to_target should init zero tensor

    Your map_label_to_target for the SICK dataset initializes a random (uninitialized) tensor:

    def map_label_to_target(label,num_classes):
        target = torch.Tensor(1,num_classes) # this is not zero tensor
        ceil = int(math.ceil(label))
        floor = int(math.floor(label))
        if ceil==floor:
            target[0][floor-1] = 1
        else:
            target[0][floor-1] = ceil - label
            target[0][ceil-1] = label - floor
        return target
    

    However, in the original treelstm, the author initializes a zero tensor:

    local targets = torch.zeros(batch_size, self.num_classes)
    for j = 1, batch_size do
      local sim = dataset.labels[indices[i + j - 1]] * (self.num_classes - 1) + 1
      local ceil, floor = math.ceil(sim), math.floor(sim)
      if ceil == floor then
        targets[{j, floor}] = 1
      else
        targets[{j, floor}] = ceil - sim
        targets[{j, ceil}] = sim - floor
      end
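
    A fixed version (a minimal sketch mirroring the Lua reference above) would start from a zero tensor:

    import math
    import torch

    def map_label_to_target(label, num_classes):
        # Zero-initialize so classes that are not explicitly set carry no random values.
        target = torch.zeros(1, num_classes)
        ceil = int(math.ceil(label))
        floor = int(math.floor(label))
        if ceil == floor:
            target[0][floor - 1] = 1
        else:
            target[0][floor - 1] = ceil - label
            target[0][ceil - 1] = label - floor
        return target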
    
    opened by ttpro1995 5
  • Matrix problem

      File ".../treelstm.pytorch/model.py", line 36, in node_forward
        u = F.tanh(self.ux(inputs)+self.uh(child_h_sum))
      File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 206, in __call__
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/linear.py", line 54, in forward
        return self._backend.Linear.apply(input, self.weight, self.bias)
      File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/linear.py", line 12, in forward
        output.addmm_(0, 1, input, weight.t())
    RuntimeError: matrix and matrix expected at 
    

    I think you may have missed the unsqueeze operation:

            i = F.sigmoid(self.ix(inputs)+self.ih(child_h_sum.unsqueeze(0)))
            o = F.sigmoid(self.ox(inputs)+self.oh(child_h_sum.unsqueeze(0)))
            u = F.tanh(self.ux(inputs)+self.uh(child_h_sum.unsqueeze(0)))
    
    opened by gujiuxiang 3
  • can not find packages

    lib\CollapseUnaryTransformer.java:3: error: package edu.stanford.nlp.ling does not exist
    import edu.stanford.nlp.ling.Label;
    ^
    lib\CollapseUnaryTransformer.java:4: error: package edu.stanford.nlp.trees does not exist
    import edu.stanford.nlp.trees.Tree;
    ^
    lib\CollapseUnaryTransformer.java:5: error: package edu.stanford.nlp.trees does not exist
    import edu.stanford.nlp.trees.TreeTransformer;
    ^
    lib\CollapseUnaryTransformer.java:6: error: package edu.stanford.nlp.util does not exist
    import edu.stanford.nlp.util.Generics;

    ... What can I do?

    opened by venusafroid 2
  • how to run it in GPU???

    I ran pip install -r req....

    but I couldn't run it with python main.py --cuda

    The traceback says: AssertionError: Torch not compiled with CUDA enabled

    opened by herbertchen1 2
  • Can the

    I ran the sentiment model successfully. My GPUs are two 1080 Tis, and I get 14% utilization on gpu0. Is there a way to run it on multiple GPUs? I implemented a model in TensorFlow Fold, but it seems that it can't support multi-GPU.

    opened by KazuhiraDZ 2
  • Sizes do not match

    When I run python main.py, I get the following error message:

    Namespace(batchsize=25, cuda=True, data='data/sick/', epochs=15, expname='test', glove='data/glove/', hidden_dim=50, input_dim=150, lr=0.01, mem_dim=75, num_classes=5, optim='adagrad', save='checkpoints/', seed=123, sparse=False, wd=0.0001)
    ==> SICK vocabulary size : 2412
    ==> Size of train data   : 4500
    ==> Size of dev data     : 500
    ==> Size of test data    : 4927
    Traceback (most recent call last):
      File "main.py", line 157, in <module>
        main()
      File "main.py", line 126, in main
        model.childsumtreelstm.emb.state_dict()['weight'].copy_(emb)
    RuntimeError: sizes do not match at /py/conda-bld/pytorch_1493676237139/work/torch/lib/THC/THCTensorCopy.cu:31

    The platform is Arch Linux with CUDA 8.0.

    I would appreciate any reply.

    opened by Jarvx 2
  • How to make it with dynamic batching?

    This implementation can only process one sample at a time, so performance is limited and GPU utilization is low. Is there a possibility of making treelstm support dynamic batching so that the GPU can be fully utilized?

    opened by xuehy 2
  • ChildSumTreeLSTM: fx and fh linear layers are declared but never used

    Lines 21, 22:

    self.fx = nn.Linear(self.in_dim,self.mem_dim)
    self.fh = nn.Linear(self.mem_dim,self.mem_dim)
    

    But they are never used.

    I think you intended to use them in lines 38, 39 (perhaps a typo of ix for fx):

    fx = F.torch.unsqueeze(self.ix(inputs),1)
    f = F.torch.cat([self.ih(child_hi)+fx for child_hi in child_h], 0)
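
    The presumably intended version (a sketch, assuming the typo diagnosis above is correct) routes the forget gate through fx and fh instead:

    fx = F.torch.unsqueeze(self.fx(inputs), 1)
    f = F.torch.cat([self.fh(child_hi) + fx for child_hi in child_h], 0)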
    
    
    opened by ttpro1995 2
  • classpath error

    My current environment: Windows 10, Python 3.6, PyTorch 0.4, PyCharm IDE. I try to run preprocess-sick.py and get an error: cannot find or load the class. Then I copied the java command to a Windows cmd window, and the same error was raised. Error code lines:

        cmd = ('java -cp %s DependencyParse -tokpath %s -parentpath %s -relpath %s %s < %s'
               % (cp, tokpath, parentpath, relpath, tokenize_flag, filepath))
        os.system(cmd)
    
    
    opened by dgai91 1
  • Fix RuntimeError: a leaf Variable that requires grad has been used in…

    I just tried to run the training locally, and hit the following error when not freezing the embedding:

    Traceback (most recent call last):
      File "main.py", line 189, in <module>
        main()
      File "main.py", line 138, in main
        emb)
      File "/Users/Jizg/git/treelstm.pytorch/treelstm/model.py", line 76, in __init__
        self.emb.weight.copy_(init_emb)
    RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.

    I updated the main script a little bit and it seems to be OK now.

    opened by jizg 1
  • IndexError: index 54 is out of bounds for dimension 0 with size 54

    tree.state = self.node_forward(inputs[tree.idx], child_c, child_h)
    

    len(inputs) == 54, tree.idx == 54

    https://github.com/dasguptar/treelstm.pytorch/blob/228a314add09fc7f39ea752aa7b1fcf756cfe277/treelstm/dataset.py#L70

    More information:
    
    inputs[tree.idx] tensor([[ 3.7410e-02,  5.7619e-02,  3.3822e-01,  ..., -3.5774e-02,
             -7.8579e-02,  1.0644e-02],
            [-2.5287e-02, -2.5835e-01, -7.5715e-02,  ...,  1.2864e-01,
              1.3856e-01,  3.3581e-01],
            [-5.4430e-02, -1.6442e-01, -6.7605e-02,  ...,  1.7388e-01,
             -3.9886e-01, -1.3006e-02],
            ...,
            [-2.5433e-02, -8.0709e-02,  6.2163e-01,  ...,  2.7345e-01,
             -5.6782e-02,  1.8956e-01],
            [-2.4587e-01,  8.9087e-03, -1.5240e-03,  ..., -3.2474e-01,
              1.1630e-02, -1.3252e-01],
            [ 4.9405e-04, -3.5795e-01, -2.2226e-01,  ..., -9.1428e-02,
              2.2649e-01, -2.0806e-01]], device='cuda:0',
           grad_fn=<EmbeddingBackward>)
    Traceback (most recent call last):
      File "main.py", line 185, in <module>
        main()
      File "main.py", line 155, in main
        train_loss = trainer.train(train_dataset)
      File "/home/qingdujun/public/runtime/models/treelstm.pytorch/treelstm/trainer.py", line 29, in train
        output = self.model(linput, rtree, rinput)
      File "/home/qingdujun/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/qingdujun/public/runtime/models/treelstm.pytorch/treelstm/model.py", line 90, in forward
        rstate, rhidden = self.childsumtreelstm(rtree, rinputs)
      File "/home/qingdujun/Applications/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/qingdujun/public/runtime/models/treelstm.pytorch/treelstm/model.py", line 38, in forward
        self.forward(tree.children[idx], inputs)
      File "/home/qingdujun/public/runtime/models/treelstm.pytorch/treelstm/model.py", line 38, in forward
        self.forward(tree.children[idx], inputs)
      File "/home/qingdujun/public/runtime/models/treelstm.pytorch/treelstm/model.py", line 38, in forward
        self.forward(tree.children[idx], inputs)
      [Previous line repeated 10 more times]
      File "/home/qingdujun/public/runtime/models/treelstm.pytorch/treelstm/model.py", line 48, in forward
        tree.state = self.node_forward(inputs[tree.idx], child_c, child_h)
    IndexError: index 54 is out of bounds for dimension 0 with size 54
    
    opened by qingdujun 0
  • Error while Compiling

    Ubuntu 18.04, Java 11.0.3. Running (as part of fetch_and_preprocess.sh): javac -cp $CLASSPATH lib/*.java -Xlint:unchecked

    lib/CollapseUnaryTransformer.java:17: error: error while writing CollapseUnaryTransformer: /home/eduard_ergenzinger/treelstm.pytorch/lib/CollapseUnaryTransformer.class
    public class CollapseUnaryTransformer implements TreeTransformer {
           ^
    lib/ConstituencyParse.java:58: warning: [unchecked] unchecked call to PTBTokenizer(Reader,LexedTokenFactory<T>,String) as a member of the raw type PTBTokenizer
          PTBTokenizer<Word> tokenizer = new PTBTokenizer(new StringReader(line), new WordTokenFactory(), "");
                                         ^
      where T is a type-variable:
        T extends HasWord declared in class PTBTokenizer
    lib/ConstituencyParse.java:58: warning: [unchecked] unchecked conversion
          PTBTokenizer<Word> tokenizer = new PTBTokenizer(new StringReader(line), new WordTokenFactory(), "");
                                         ^
      required: PTBTokenizer<Word>
      found:    PTBTokenizer
    lib/DependencyParse.java:57: warning: [unchecked] unchecked call to PTBTokenizer(Reader,LexedTokenFactory<T>,String) as a member of the raw type PTBTokenizer
            PTBTokenizer<Word> tokenizer = new PTBTokenizer(
                                           ^
      where T is a type-variable:
        T extends HasWord declared in class PTBTokenizer
    lib/DependencyParse.java:57: warning: [unchecked] unchecked conversion
            PTBTokenizer<Word> tokenizer = new PTBTokenizer(
                                           ^
      required: PTBTokenizer<Word>
      found:    PTBTokenizer
    1 error
    4 warnings
    opened by 121eddie 3
  • Nodes' hidden representations?

    Hello, not an issue, but what's the easiest way to extract the learned hidden embeddings for each node in a ChildSum tree? New to PyTorch, so forgive my ignorance.

    Thanks!
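
    A minimal sketch, assuming the model's forward pass has already run so that every node's (c, h) pair is stored on tree.state (see the snippet quoted in the batch-size question below):

    def collect_hidden_states(tree, states=None):
        # Recursively gather the hidden vector h of every node; tree.state
        # holds the (c, h) pair computed by node_forward during the forward pass.
        if states is None:
            states = []
        for idx in range(tree.num_children):
            collect_hidden_states(tree.children[idx], states)
        c, h = tree.state
        states.append(h)
        return states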

    opened by chriswtanner 0
  • Batch support for TreeLSTM

    The existing implementation doesn't support forward/backward with a batch of trees as input, which makes training and inference slow. This pull request adds batch support for TreeLSTM and reproduces exactly the same results as without batching.

    To run with batch:

    - python main.py --use_batch
    
    opened by jinfengr 0
  • Does current TreeLSTM support batch size?

    It seems batch sizes are still not supported in the code? In the forward function of ChildSumTreeLSTM, it seems that only a single tree is processed per forward pass.


    def forward(self, tree, inputs):
        # Recurse into all children first, so their states are available below.
        for idx in range(tree.num_children):
            self.forward(tree.children[idx], inputs)

        if tree.num_children == 0:
            # Leaf node: use zero-initialized child states.
            child_c = inputs[0].detach().new(1, self.mem_dim).fill_(0.).requires_grad_()
            child_h = inputs[0].detach().new(1, self.mem_dim).fill_(0.).requires_grad_()
        else:
            # Internal node: stack the (c, h) states of all children.
            child_c, child_h = zip(*map(lambda x: x.state, tree.children))
            child_c, child_h = torch.cat(child_c, dim=0), torch.cat(child_h, dim=0)

        tree.state = self.node_forward(inputs[tree.idx], child_c, child_h)
        return tree.state
    


    opened by jinfengr 0
Owner
Riddhiman Dasgupta
Deep Learning, Science Fiction, Comic Books