minnn
by Graham Neubig, Zhisong Zhang, and Divyansh Kaushik
This is an exercise in developing a minimalist neural network toolkit for NLP, part of Carnegie Mellon University's CS11-747: Neural Networks for NLP.
The most important files it contains are the following:
- minnn.py: This is the file you will need to implement. It provides a very minimalist version of a dynamic neural network toolkit (like PyTorch or DyNet). Some code is provided, but important functionality is not included.
- classifier.py: training code for a Deep Averaging Network for text classification using minnn. Feel free to make any modifications to improve the model, but the original version of classifier.py must also run with your minnn.py implementation (an illustrative sketch of the Deep Averaging Network idea appears after this list).
- setup.py: this is blank, but if your classifier implementation needs to do some sort of data downloading (e.g. of pre-trained word embeddings), you can implement it here. It will be run before your implementation of classifier.py.
- data/: Two datasets, one from the Stanford Sentiment Treebank with tree info removed and another from IMDb reviews.
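To make the Deep Averaging Network idea concrete, here is a minimal sketch of its forward pass in plain numpy: average the word embeddings of a sentence, then apply a hidden layer and a linear output layer. The function name, shapes, and parameters below are hypothetical illustrations only; the actual classifier.py builds this computation through the minnn API rather than raw numpy calls.

```python
import numpy as np

def dan_forward(word_ids, emb, W_h, b_h, W_out, b_out):
    """Deep Averaging Network sketch: average embeddings, hidden layer, output scores."""
    vecs = emb[word_ids]                  # (sentence_len, emb_dim) embedding lookup
    avg = vecs.mean(axis=0)               # (emb_dim,) average over the sentence
    hidden = np.tanh(W_h @ avg + b_h)     # (hidden_dim,) nonlinear hidden layer
    scores = W_out @ hidden + b_out       # (num_classes,) unnormalized label scores
    return scores

# Toy usage with random parameters (illustrative sizes only)
rng = np.random.default_rng(0)
vocab, emb_dim, hid, n_cls = 100, 16, 8, 5
emb = rng.normal(size=(vocab, emb_dim))
scores = dan_forward(
    np.array([3, 17, 42]), emb,
    rng.normal(size=(hid, emb_dim)), np.zeros(hid),
    rng.normal(size=(n_cls, hid)), np.zeros(n_cls),
)
print(scores.argmax())  # predicted class index
```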
Assignment Details
Important Notes:
- There is a detailed description of the code structure in structure.md, including a description of which parts you will need to implement.
- The only allowed external library is numpy (or cupy); no other external libraries are allowed.
- We will run your code with the following commands, so make sure your best results are reproducible with them (replace ANDREWID with your Andrew ID):
mkdir -p ANDREWID
python classifier.py --train=data/sst-train.txt --dev=data/sst-dev.txt --test=data/sst-test.txt --dev_out=ANDREWID/sst-dev-output.txt --test_out=ANDREWID/sst-test-output.txt
python classifier.py --train=data/cfimdb-train.txt --dev=data/cfimdb-dev.txt --test=data/cfimdb-test.txt --dev_out=ANDREWID/cfimdb-dev-output.txt --test_out=ANDREWID/cfimdb-test-output.txt
- Reference accuracies: with our implementation and the default hyper-parameters, the mean (std) accuracies over 10 different random seeds are dev=0.4045 (0.0070) and test=0.4069 (0.0105) on sst, and dev=0.8792 (0.0084) on cfimdb. If you implement things exactly our way, use the default random seed, and use the same environment (Python 3.8 with numpy 1.18 or 1.19), you may get dev=0.4114 and test=0.4253 on sst, and dev=0.8857 on cfimdb.
The submission file should be a zip file with the following structure (assuming your Andrew ID is ANDREWID):
- ANDREWID/
- ANDREWID/minnn.py  # completed minnn.py
- ANDREWID/classifier.py  # completed classifier.py with any of your modifications
- ANDREWID/sst-dev-output.txt  # output of the dev set for SST data
- ANDREWID/sst-test-output.txt  # output of the test set for SST data
- ANDREWID/cfimdb-dev-output.txt  # output of the dev set for CFIMDB data
- ANDREWID/cfimdb-test-output.txt  # output of the test set for CFIMDB data
- ANDREWID/report.pdf  # (optional) report; here you can describe anything particularly new or interesting that you did
Grading information:
- A+: Submissions that implement something new and achieve particularly large accuracy improvements (e.g. 2% over the baseline on SST)
- A: You additionally implement something else on top of the missing pieces; some examples include:
- Implementing another optimizer such as Adam (an illustrative sketch appears after this list)
- Incorporating pre-trained word embeddings, such as those from fastText (a loading sketch also appears after this list)
- Changing the model architecture significantly
- A-: You implement all the missing pieces and the original classifier.py code achieves comparable accuracy to our reference implementation (about 41% on SST)
- B+: All missing pieces are implemented, but accuracy is not comparable to the reference.
- B or below: Some parts of the missing pieces are not implemented.
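For the optimizer suggestion above, a minimal Adam update in plain numpy might look like the following. This is only an illustrative sketch: the `params`/`grads`/`state` structure and the hyper-parameter defaults are assumptions, and a real implementation would live inside your minnn.py optimizer abstraction rather than as a standalone function.

```python
import numpy as np

def adam_update(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step over a list of parameter arrays (illustrative only).

    `state` holds the moment estimates and step count; pass the same dict
    back in on every call so the statistics accumulate across steps.
    """
    state.setdefault("t", 0)
    state.setdefault("m", [np.zeros_like(p) for p in params])
    state.setdefault("v", [np.zeros_like(p) for p in params])
    state["t"] += 1
    t = state["t"]
    for p, g, m, v in zip(params, grads, state["m"], state["v"]):
        m[...] = beta1 * m + (1 - beta1) * g          # biased first-moment estimate
        v[...] = beta2 * v + (1 - beta2) * g * g      # biased second-moment estimate
        m_hat = m / (1 - beta1 ** t)                  # bias correction
        v_hat = v / (1 - beta2 ** t)
        p -= lr * m_hat / (np.sqrt(v_hat) + eps)      # in-place parameter update
    return state
```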
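Similarly, if you experiment with pre-trained embeddings, one common approach is to read a word-vector text file (one word followed by its vector per line, as in the fastText .vec format) and copy the vectors for in-vocabulary words into your embedding matrix. The file path and the `word2id` mapping below are assumptions about your own setup, not part of the provided code.

```python
import numpy as np

def load_pretrained(path, word2id, emb_dim):
    """Build an embedding matrix, overwriting rows for words found in the file.

    Assumes a text format of `word v1 v2 ... vn` per line (e.g. fastText .vec);
    words missing from the file keep their random initialization.
    """
    emb = np.random.normal(scale=0.1, size=(len(word2id), emb_dim))
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != emb_dim + 1:   # skip the header or malformed lines
                continue
            word = parts[0]
            if word in word2id:
                emb[word2id[word]] = np.asarray(parts[1:], dtype=np.float32)
    return emb
```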
References
Stanford Sentiment Treebank: https://www.aclweb.org/anthology/D13-1170.pdf
IMDb Reviews: https://openreview.net/pdf?id=Sklgs0NFvr