PyTorch implementations of various deep NLP models from cs-224n (Stanford Univ)

Kim SungDong

Last update: Dec 24, 2022

Related tags

PyTorch Learning Resources nlp natural-language-processing deep-learning neural-network pytorch rnn deep-nlp-models stanford-univ cs-224n

Overview

DeepNLP-models-Pytorch

PyTorch implementations of various deep NLP models from cs-224n (Stanford Univ: NLP with Deep Learning)

This is not for PyTorch beginners. If this is your first time using PyTorch, I recommend these awesome tutorials.
If you're interested in DeepNLP, I strongly recommend working through this awesome lecture.
- cs-224n-slides
- cs-224n-videos

This material is not perfect, but it should help your study and research :) Please feel free to send pull requests!

Model	Links
01. Skip-gram-Naive-Softmax	[notebook / data / paper]
02. Skip-gram-Negative-Sampling	[notebook / data / paper]
03. GloVe	[notebook / data / paper]
04. Window-Classifier-for-NER	[notebook / data / paper]
05. Neural-Dependancy-Parser	[notebook / data / paper]
06. RNN-Language-Model	[notebook / data / paper]
07. Neural-Machine-Translation-with-Attention	[notebook / data / paper]
08. CNN-for-Text-Classification	[notebook / data / paper]
09. Recursive-NN-for-Sentiment-Classification	[notebook / data / paper]
10. Dynamic-Memory-Network-for-Question-Answering	[notebook / data / paper]

Requirements

Python 3.5
Pytorch 0.2+
nltk 3.2.2
gensim 2.2.0
sklearn_crfsuite

Getting started

git clone https://github.com/DSKSD/cs-224n-Pytorch.git

prepare dataset

cd script
chmod u+x prepare_dataset.sh
./prepare_dataset.sh

docker env

Ubuntu 16.04 and Python 3.5.2, with various ML/DL packages including tensorflow, sklearn, and pytorch

docker pull dsksd/deepstudy:0.2

pip3 install docker-compose
cd script
docker-compose up -d

cloud setting

not yet

References

Author

Sungdong Kim / @DSKSD

Comments

some data is not available now

` download dependency parser dataset... (clone from https://github.com/rguthrie3/DeepDependencyParsingProblemSet
mkdir: created directory '../dataset/dparser'
--2018-03-02 15:08:08-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/train.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.

--2018-03-02 15:08:09-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/vocab.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.

--2018-03-02 15:08:09-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/dev.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found. `

It seems like @rguthrie3 has deleted the repo... Could you please update a new address for us? Thanks!

opened by RasinGue 1
Not able to reproduce results for CNN-for-Text-Classification
Hey,

I am trying to reproduce this notebook, but the loss does not go down as advertised.

[0/5] mean_loss : 1.80
[1/5] mean_loss : 1.64
[2/5] mean_loss : 1.64
[3/5] mean_loss : 1.62
[4/5] mean_loss : 1.76

I am using PyTorch v0.4 on CUDA. My hypothesis is the newer version broke something.

>torch.__version__ '0.4.0a0+a3e9151'

Thanks!
opened by sksq96 1
update details to make more useful and readable
I think this repository is really useful, so I spent a lot of time on it and found a few things to improve.

This pull request mainly does the following five things:

Make the code conform to PEP 8. Parts of the code already followed PEP 8 and parts did not. To keep the tutorial consistent and readable for more people, I made all ten notebooks conform to PEP 8.

Add a random seed with random.seed(1024) to make the code repeatable. For a tutorial, repeatability is very important: I ran the notebook and got a different result every time. Repeatable runs make the material easier to understand.
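A minimal sketch of the seeding idea, using only Python's standard library. The `sample_batch` helper is hypothetical and only illustrates the effect. Note that `random.seed` covers Python's `random` module only; PyTorch keeps its own generator, which is seeded separately with `torch.manual_seed`.

```python
import random


def sample_batch(seed):
    # Hypothetical helper: re-seeding puts the generator in the same
    # state, so the shuffle below yields the same order on every call.
    random.seed(seed)
    data = list(range(10))
    random.shuffle(data)
    return data[:4]


first = sample_batch(1024)
second = sample_batch(1024)
print(first == second)
```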

Add GPU device selection with gpus = [0]; torch.cuda.set_device(gpus[0]). Without an explicit choice, the code runs on the default CUDA device whenever torch.cuda.is_available() is true. If that GPU is already fully occupied, the code raises errors.

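Use dict.get(key) instead of if key in dict.keys() when building the word vocabulary. dict.get is an O(1) average-case hash lookup. key in dict.keys() is O(n) where keys() returns a list (Python 2); in Python 3 the keys view is also hashed, so the difference there is mostly one of idiom. The word vocabulary is usually huge, so this matters on the list-based path.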
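The vocabulary-building pattern from this item can be sketched as follows. This is an illustration, not the notebooks' code: `build_vocab` and the special tokens `<PAD>` and `<UNK>` are assumptions chosen for the example.

```python
def build_vocab(tokens, specials=("<PAD>", "<UNK>")):
    # Hypothetical helper: assign each new word the next free index.
    word2index = {}
    for word in list(specials) + list(tokens):
        # dict.get returns None for unseen words, so a single hash
        # lookup decides whether the word still needs an index.
        if word2index.get(word) is None:
            word2index[word] = len(word2index)
    return word2index


vocab = build_vocab("the cat saw the dog".split())
print(vocab)
```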

Use (3, 4, 5) in place of [3, 4, 5] in notebook 08: CNN-for-Text-Classification -> CNNClassifier. Using a mutable list as a function's default parameter is dangerous and may cause unexpected errors. For safety I replaced it with a tuple, which requires no changes to other lines.
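The pitfall behind the last item can be shown with a small sketch. Both functions are hypothetical and unrelated to CNNClassifier itself. The problem only appears when the function mutates its default, because Python evaluates a default value once, at definition time.

```python
def append_shared(item, bucket=[]):
    # The same list object is reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=()):
    # A tuple default cannot be mutated; a new list is built per call.
    return list(bucket) + [item]


append_shared(3)
shared = append_shared(4)  # still carries the 3 from the earlier call

append_fresh(3)
fresh = append_fresh(4)    # unaffected by the earlier call
print(shared, fresh)
```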
opened by oneTaken 0
about the negative example loss in the Skip-gram-Negative-Sampling algorithm

I have learned a lot from this elegant project. Thanks a lot! Based on the equation in the Skip-gram-Negative-Sampling algorithm below,

I think the negative example loss calculated by

negative_score = torch.sum(neg_embeds.bmm(center_embeds.transpose(1, 2)).squeeze(2), 1).view(negs.size(0), -1)  # BxK -> Bx1
loss = self.logsigmoid(positive_score) + self.logsigmoid(negative_score)

should maybe be changed to

negative_score = neg_embeds.bmm(center_embeds.transpose(1, 2))
loss = self.logsigmoid(positive_score) + torch.sum(self.logsigmoid(negative_score), 1)

since, according to the equation, negative_score first goes through a logsigmoid operation and is then summed.
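To see why the order of the two operations matters, here is a small numeric sketch in plain Python. It is not the notebook's code: the scores are made-up numbers, and `log_sigmoid` is a hand-written stand-in for the log-sigmoid function.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


# Made-up scores for K = 3 negative samples of one center word.
scores = [0.5, -1.0, 2.0]

sum_then_log = log_sigmoid(sum(scores))             # sum first, then log-sigmoid
log_then_sum = sum(log_sigmoid(s) for s in scores)  # log-sigmoid each, then sum

print(sum_then_log, log_then_sum)
```

The two values differ because log-sigmoid is not linear, so summing before or after it gives different losses.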

opened by xiaopengguo 0
1. fixing q-type leak 2. tweaks to run with PyTorch 1.x

Hi, thank you for the great repository!

However, I found that for QA type classification, the code includes the sub-type in the training/testing data. Needless to say, it's a perfect predictor. So I am fixing this, and also making a couple of tweaks to make it compatible with the latest Python.
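A rough sketch of how such a leak can be avoided. It assumes each line of the question-type data has the form `COARSE:fine question text`; that format and the `parse_trec_line` helper are assumptions for illustration, not taken from the pull request.

```python
def parse_trec_line(line):
    # Assumed format: "<coarse label>:<sub-type> <question text>".
    label, rest = line.strip().split(":", 1)
    # Split the sub-type off so it never reaches the model input.
    subtype, _, question = rest.partition(" ")
    return label, subtype, question


label, subtype, question = parse_trec_line("DESC:manner How did serfdom develop ?\n")
print(label, "|", subtype, "|", question)
```

Only `question` would then be tokenized as input, with `label` as the target.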

I couldn't get rid of the annoying GenSim warning, but maybe you'll have more luck if you re-run the notebook.

Predictably, the accuracy drops after removing the leak. However, I verified that with ALL the data it goes up again, at least when you randomly sample 10% of the training set as test data. The numbers may be different if you use the official test set. However, your data download script does not download all the data, so I kept this part as is.

BTW, it takes only a few seconds to train on a MacBook pro (and a couple of minutes on all the data). Not long at all!

opened by searchivarius 0
about padding sequence

Hi, in file 08.CNN-for-Text-Classification.ipynb, where do you pad the input? Is it in [110], line 7: x_p.append(torch.cat([x[i], Variable(LongTensor([word2index['']] * (max_x - x[i].size(1)))).view(1, -1)], 1))? Thanks!
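For readers unfamiliar with the idea, padding a batch to a common length can be sketched in plain Python as below. `pad_batch` and the pad index `0` are hypothetical; the notebook works on tensors rather than lists.

```python
def pad_batch(sequences, pad_index):
    # Extend every sequence with pad_index up to the longest length.
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_index] * (max_len - len(seq)) for seq in sequences]


batch = pad_batch([[5, 6, 7], [8], [9, 10]], pad_index=0)
print(batch)
```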

opened by ShellingFord221 0
about pretrained embeddings

Hi, I have a small question about file 08.CNN-for-Text-Classification.ipynb, [96], line 4: pretrained.append(model[word2index[key]]). word2index[key] returns the key's index, which is then used to look up a pretrained embedding in GoogleNews-vectors-negative300.bin. But the indices in this bin file should differ from the indices generated from the TREC dataset, i.e. model[key's index] may not be this key's (word's) embedding. Thanks!
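One way to sidestep the index mismatch raised here is to look vectors up by the word itself and use the dataset index only as the row position. The sketch below uses a plain dict as a stand-in for the pretrained vectors; it does not use the gensim API, and the zero vector for missing words is just one possible choice.

```python
def build_embedding_rows(word2index, pretrained, dim):
    # Row i holds the vector for the word whose dataset index is i.
    rows = [[0.0] * dim for _ in range(len(word2index))]
    for word, index in word2index.items():
        vector = pretrained.get(word)  # look up by word, not by index
        if vector is not None:
            rows[index] = list(vector)
    return rows


# Toy data: the pretrained "model" stores words in a different order.
word2index = {"<PAD>": 0, "cat": 1, "dog": 2}
pretrained = {"dog": [0.1, 0.2], "cat": [0.3, 0.4]}
rows = build_embedding_rows(word2index, pretrained, dim=2)
print(rows)
```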

opened by ShellingFord221 1
08. CNN-for-Text-Classification LogSoftmax and Cross-entropy

Hello,

Thank you very much for sharing such good material. It has been a great help while I study this topic.

I am writing this issue to ask about the apparent duplication of logsoftmax and cross-entropy in the 08. CNN example.

The CNNClassifier is described as returning the model output after applying log_softmax.

Later, the model output is stored in a variable called pred and passed as input to loss_function (Cross-Entropy). As far as I know, PyTorch's Cross-Entropy function takes the raw scores, before softmax, as its input.

So I was wondering whether the example code ends up applying softmax twice, which is why I am asking.

Thank you.
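A small plain-Python sketch of what happens in this situation. It is not the notebook's code: `log_softmax` and `cross_entropy` are hand-written stand-ins, and the logits are made-up numbers. PyTorch's cross-entropy loss applies log-softmax internally, so feeding it log-softmax outputs applies the operation twice. Log-softmax outputs already satisfy sum(exp(x)) = 1, so the second application leaves them unchanged and the loss value is the same.

```python
import math


def log_softmax(xs):
    # Subtract the log-sum-exp, shifted by the max for stability.
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def cross_entropy(scores, target):
    # Mirrors "log-softmax followed by negative log-likelihood".
    return -log_softmax(scores)[target]


logits = [2.0, -1.0, 0.5]  # made-up raw scores for three classes
once = cross_entropy(logits, 0)
twice = cross_entropy(log_softmax(logits), 0)
print(once, twice)
```

Under these assumptions the duplicated log-softmax is redundant rather than harmful.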

opened by DonghyungKo 2
How to save the model for Neural Machine Translation?

I want to save the model for Neural Machine Translation (https://nbviewer.jupyter.org/github/DSKSD/DeepNLP-models-Pytorch/blob/master/notebooks/07.Neural-Machine-Translation-with-Attention.ipynb). Can you help me?
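A common way to do this in PyTorch is to save each module's `state_dict` with `torch.save` and restore it with `load_state_dict`. The sketch below assumes a recent PyTorch and uses a hypothetical `TinyModel` as a stand-in; for the notebook, the same calls would presumably be applied to its encoder and decoder modules.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    # Hypothetical stand-in for the notebook's encoder or decoder.
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)


model = TinyModel()
path = os.path.join(tempfile.mkdtemp(), "model.pt")

# Save only the learned parameters, not the Python class itself.
torch.save(model.state_dict(), path)

# Later: rebuild the same architecture, then load the parameters.
restored = TinyModel()
restored.load_state_dict(torch.load(path))
restored.eval()  # switch to inference mode before translating
```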

opened by wannaphong 1

Owner

Kim SungDong

Naver AI LAB Researcher Interested in NLP / Representation Learning / Reinforcement Learning

GitHub

Open source guides/codes for mastering deep learning to deploying deep learning in production in PyTorch, Python, C++ and more.

Deep Learning Materials by Deep Learning Wizard Start Learning Now Please head to www.deeplearningwizard.com to start learning! It is mobile/tablet fr

572 Dec 28, 2022

Deep Learning (with PyTorch)

Deep Learning (with PyTorch) This notebook repository now has a companion website, where all the course material can be found in video and textual for

6.2k Jan 2, 2023

PyTorch Tutorial for Deep Learning Researchers

This repository provides tutorial code for deep learning researchers to learn PyTorch. In the tutorial, most of the models were implemented with less

25.4k Jan 5, 2023

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 200 universities.

D2L.ai: Interactive Deep Learning Book with Multi-Framework Code, Math, and Discussions Book website | STAT 157 Course at UC Berkeley | Latest version

16k Jan 3, 2023

An IPython Notebook tutorial on deep learning for natural language processing, including structure prediction.

Table of Contents: Introduction to Torch's Tensor Library Computation Graphs and Automatic Differentiation Deep Learning Building Blocks: Affine maps,

1.8k Jan 4, 2023

PyTorch tutorials.

PyTorch Tutorials All the tutorials are now presented as sphinx style documentation at: https://pytorch.org/tutorials Contributing We use sphinx-galle

6.6k Jan 2, 2023

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

PyTorch Examples WARNING: if you fork this repo, github actions will run daily on it. To disable this, go to /examples/settings/actions and Disable Ac

19.4k Jan 1, 2023

C++ Implementation of PyTorch Tutorials for Everyone

C++ Implementation of PyTorch Tutorials for Everyone OS (Compiler)\LibTorch 1.9.0 macOS (clang 10.0, 11.0, 12.0) Linux (gcc 8, 9, 10, 11) Windows (msv

1.5k Jan 4, 2023

Simple examples to introduce PyTorch

This repository introduces the fundamental concepts of PyTorch through self-contained examples. At its core, PyTorch provides two main features: An n-

4.4k Jan 7, 2023

Minimal tutorials for PyTorch

Minimal tutorials for PyTorch adapted from Alec Radford's Theano tutorials. Tensor multiplication Linear Regression Logistic Regression Neural Network

321 Oct 25, 2022

PyTorch Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

pytorch-fcn PyTorch implementation of Fully Convolutional Networks. Requirements pytorch >= 0.2.0 torchvision >= 0.1.8 fcn >= 6.1.5 Pillow scipy tqdm

1.6k Jan 4, 2023

Simple PyTorch Tutorials Zero to ALL!

PyTorchZeroToAll Quick 3~4 day lecture materials for HKUST students. Video Lectures: (RNN TBA) Youtube Bilibili Slides Lecture Slides @GoogleDrive If

3.7k Dec 30, 2022

PyTorch tutorials and best practices.

Effective PyTorch Table of Contents Part I: PyTorch Fundamentals PyTorch basics Encapsulate your model with Modules Broadcasting the good and the ugly

1.5k Jan 4, 2023

A scalable template for PyTorch projects, with examples in Image Segmentation, Object classification, GANs and Reinforcement Learning.

PyTorch Project Template is being sponsored by the following tool; please help to support us by taking a look and signing up to a free trial PyTorch P

740 Dec 23, 2022


Some example scripts on pytorch

Example of network fine-tuning in pytorch for the kaggle competition Dogs vs. Cats Redux: Kernels Edition

ConvNet training using pytorch

simple generative adversarial network (GAN) using PyTorch

Torch Containers simplified in PyTorch