An assignment on creating a minimalist neural network toolkit for CS11-747

Overview

minnn

by Graham Neubig, Zhisong Zhang, and Divyansh Kaushik

This is an exercise in developing a minimalist neural network toolkit for NLP, part of Carnegie Mellon University's CS11-747: Neural Networks for NLP.

The most important files it contains are the following:

  1. minnn.py: This is what you'll need to implement. It implements a very minimalist version of a dynamic neural network toolkit (like PyTorch or Dynet). Some code is provided, but important functionality is not included.
  2. classifier.py: training code for a Deep Averaging Network for text classification using minnn. You can feel free to make any modifications to make it a better model, but the original version of classifier.py must also run with your minnn.py implementation.
  3. setup.py: this is blank, but if your classifier implementation needs to do some sort of data downloading (e.g. of pre-trained word embeddings) you can implement this here. It will be run before running your implementation of classifier.py.
  4. data/: Two datasets, one from the Stanford Sentiment Treebank with tree info removed and another from IMDb reviews.

Assignment Details

Important Notes:

  • There is a detailed description of the code structure in structure.md, including a description of which parts you will need to implement.
  • The only allowed external library is numpy or cupy, no other external libraries are allowed.
  • We will run your code with the following commands, so make sure that whatever your best results are are reproducible using these commands (where you replace ANDREWID with your andrew ID):
    • mkdir -p ANDREWID
    • python classifier.py --train=data/sst-train.txt --dev=data/sst-dev.txt --test=data/sst-test.txt --dev_out=ANDREWID/sst-dev-output.txt --test_out=ANDREWID/sst-test-output.txt
    • python classifier.py --train=data/cfimdb-train.txt --dev=data/cfimdb-dev.txt --test=data/cfimdb-test.txt --dev_out=ANDREWID/cfimdb-dev-output.txt --test_out=ANDREWID/cfimdb-test-output.txt
  • Reference accuracies: with our implementation and the default hyper-parameters, the mean(std) of accuracies with 10 different random seeds on sst is dev=0.4045(0.0070), test=0.4069(0.0105), and on cfimdb dev=0.8792(0.0084). If you implement things exactly in our way and use the default random seed and use the same environment (python 3.8 + numpy 1.18 or 1.19), you may get the accuracies of dev=0.4114, test=0.4253, and on cfimdb dev=0.8857.

The submission file should be a zip file with the following structure (assuming the andrew id is ANDREWID):

  • ANDREWID/
  • ANDREWID/minnn.py # completed minnn.py
  • ANDREWID/classifier.py.py # completed classifier.py with any of your modifications
  • ANDREWID/sst-dev-output.txt # output of the dev set for SST data
  • ANDREWID/sst-test-output.txt # output of the test set for SST data
  • ANDREWID/cfimdb-dev-output.txt # output of the dev set for CFIMDB data
  • ANDREWID/cfimdb-test-output.txt # output of the test set for CFIMDB data
  • ANDREWID/report.pdf # (optional), report. here you can describe anything particularly new or interesting that you did

Grading information:

  • A+: Submissions that implement something new and achieve particularly large accuracy improvements (e.g. 2% over the baseline on SST)
  • A: You additionally implement something else on top of the missing pieces, some examples include:
    • Implementing another optimizer such as Adam
    • Incorporating pre-trained word embeddings, such as those from fasttext
    • Changing the model architecture significantly
  • A-: You implement all the missing pieces and the original classifier.py code achieves comparable accuracy to our reference implementation (about 41% on SST)
  • B+: All missing pieces are implemented, but accuracy is not comparable to the reference.
  • B or below: Some parts of the missing pieces are not implemented.

References

Stanford Sentiment Treebank: https://www.aclweb.org/anthology/D13-1170.pdf

IMDb Reviews: https://openreview.net/pdf?id=Sklgs0NFvr

You might also like...
Unsupervised text tokenizer for Neural Network-based text generation.

SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr

Unsupervised text tokenizer for Neural Network-based text generation.

SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

UIS-RNN Overview This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm. UIS-RNN solves the problem of s

In this repository, I have developed an end to end Automatic speech recognition project. I have developed the neural network model for automatic speech recognition with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
neural network based speaker embedder

Content What is deepaudio-speaker? Installation Get Started Model Architecture How to contribute to deepaudio-speaker? Acknowledge What is deepaudio-s

Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018)
Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018)

Neural Network Models for Joint POS Tagging and Dependency Parsing Implementations of joint models for POS tagging and dependency parsing, as describe

Comments
  • unittests: move test_minnn.py to using unittests instead of manually written tests

    unittests: move test_minnn.py to using unittests instead of manually written tests

    This PR moves the minnn repo to using Python's inbuilt unittests module to do testing, along with some complementary changes. In particular, this PR creates a TestCase under which all 9 of the old tests appear. It also removes the use of assert statements, instead moving to unittest's self.assert* calls for individual values, and np.testing.assert_is_allclose calls for approximate array equality. This creates both better error reporting and the ability to run all tests and collect errors before the script fails.

    This refactor no longer allows specifying the script path of minnn.py: any arguments to unittest are interpreted as paths/modules to test. I cleared this with @neubig in office hours; if it's helpful, I can add an option to specify the module for dynamic import with an environment variable.

    This module can either be run as before from the console (python -m test_minnn, or python test_minnn.py) or with the unittest runners found in many IDEs (tested in PyCharm).

    Example screenshot of running from console: image

    opened by gsireesh 0
Owner
Graham Neubig
Graham Neubig
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Espresso Espresso is an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning libra

Yiming Wang 919 Jan 3, 2023
Yet Another Neural Machine Translation Toolkit

YANMTT YANMTT is short for Yet Another Neural Machine Translation Toolkit. For a backstory how I ended up creating this toolkit scroll to the bottom o

Raj Dabre 121 Jan 5, 2023
ChatterBot is a machine learning, conversational dialog engine for creating chat bots

ChatterBot ChatterBot is a machine-learning based conversational dialog engine build in Python which makes it possible to generate responses based on

Gunther Cox 12.8k Jan 3, 2023
ChatterBot is a machine learning, conversational dialog engine for creating chat bots

ChatterBot ChatterBot is a machine-learning based conversational dialog engine build in Python which makes it possible to generate responses based on

Gunther Cox 10.8k Feb 18, 2021
Creating a Feed of MISP Events from ThreatFox (by abuse.ch)

ThreatFox2Misp Creating a Feed of MISP Events from ThreatFox (by abuse.ch) What will it do? This will fetch IOCs from ThreatFox by Abuse.ch, convert t

null 17 Nov 22, 2022
Creating an LSTM model to generate music

Music-Generation Creating an LSTM model to generate music music-generator Used to create basic sin wave sounds music-ai Contains the functions to conv

Jerin Joseph 2 Dec 2, 2021
This repository details the steps in creating a Part of Speech tagger using Trigram Hidden Markov Models and the Viterbi Algorithm without using external libraries.

POS-Tagger This repository details the creation of a Part-of-Speech tagger using Trigram Hidden Markov Models to predict word tags in a word sequence.

Raihan Ahmed 1 Dec 9, 2021
Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.

Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.

null 2 Jan 20, 2022
Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

epub2audiobook Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech Input examples qual a pasta do seu

null 7 Aug 25, 2022
Creating a chess engine using GPT-3

GPT3Chess Creating a chess engine using GPT-3 Code for my article : https://towardsdatascience.com/gpt-3-play-chess-d123a96096a9 My game (white) vs GP

null 19 Dec 17, 2022