Deep Learning for Natural Language Processing - Lectures 2021
This repository contains slides for the course "20-00-0947: Deep Learning for Natural Language Processing" (Technical University of Darmstadt, Summer term 2021).
This online course is taught by Ivan Habernal and Mohsen Mesgar.
The slides are available as PDF as well as LaTeX source code (we've used Beamer because typesetting mathematics in PowerPoint or similar tools is painful).
The content is licensed under Creative Commons CC BY-SA 4.0, which means that you can re-use, adapt, modify, or publish it further, provided you keep the license and give proper credit.
Accompanying video lectures are linked on YouTube.
Lecture 1
- Topics: Kick-off (challenges in NLP, Deep Learning in NLP, Terminology, History of DL, Perceptron)
- Slides as PDF (as in video), Slides as PDF (updated)
- YouTube video
- Mandatory reading
- Section 4 from Goldberg, Y. (2016). A Primer on Neural Network Models for Natural Language Processing. Journal of Artificial Intelligence Research, 57, 345–420. https://doi.org/10.1613/jair.4992
Lecture 2
- Topics: Machine learning basics, Cross-validation, Evaluation, Loss functions
- Slides as PDF (as in video), Slides as PDF (updated)
- YouTube video
- Mandatory reading
- Chapter 8 from Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge University Press. https://mml-book.com
Lecture 3
- Topics: Training as optimization and (neural) language models
- Slides as PDF (as in video), Slides as PDF (updated)
- YouTube video
- Mandatory reading
- Section 4.7 and Section 6 (except Section 6.2) from Goldberg, Y. (2016). A Primer on Neural Network Models for Natural Language Processing. Journal of Artificial Intelligence Research, 57, 345–420. https://doi.org/10.1613/jair.4992
Lecture 4
- Topics: Text Representations (I)
- Slides as PDF
- YouTube video
- Code in PyTorch
- Mandatory reading
- Section 5 from Goldberg, Y. (2016). A Primer on Neural Network Models for Natural Language Processing. Journal of Artificial Intelligence Research, 57, 345–420. https://doi.org/10.1613/jair.4992
Lecture 5
- Topics: Bilingual and Syntax-Based Word Embeddings
- Slides as PDF
- YouTube video
- Mandatory reading
- Upadhyay, S., Faruqui, M., Dyer, C., & Roth, D. (2016). Cross-lingual Models of Word Embeddings: An Empirical Comparison. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1661–1670. https://doi.org/10.18653/v1/P16-1157
Lecture 6
- Topics: Convolutional Neural Networks
- Slides as PDF
- YouTube video
- Mandatory reading
- Madasu, A., & Anvesh Rao, V. (2019). Sequential Learning of Convolutional Features for Effective Text Classification. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 5657–5666. https://doi.org/10.18653/v1/D19-1567
Lecture 7
- Topics: Recurrent Neural Networks
- Slides as PDF (as in video), Slides as PDF (updated)
- YouTube video
- Mandatory reading
- Reimers, N., & Gurevych, I. (2017). Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 338–348. https://www.aclweb.org/anthology/D17-1035/
Lecture 8
- Topics: Encoder-Decoder Models
- Slides as PDF
- YouTube video
- Mandatory reading
- Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1724–1734. https://www.aclweb.org/anthology/D14-1179/
- Bahdanau, D., Cho, K., & Bengio, Y. (2016). Neural Machine Translation by Jointly Learning to Align and Translate. arXiv preprint arXiv:1409.0473. https://arxiv.org/pdf/1409.0473.pdf
- Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems 27 (NIPS 2014). https://papers.nips.cc/paper/2014/file/a14ac55a4f27472c5d894ec1c3c743d2-Paper.pdf
Lecture 9
- Topics: Transformer architectures and BERT
- Slides as PDF
- YouTube video
- Mandatory reading
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186. https://doi.org/10.18653/v1/N19-1423
Lecture 10
- Topics: GPT-2 and GPT-3 (a.k.a. "Disentangling hype from reality")
- Self-study with a focus on critical thinking; no video lecture
- Mandatory reading
- GPT-2 paper: Radford et al. (2019). Language Models are Unsupervised Multitask Learners. See the PDF linked from the OpenAI blog post
- GPT-3 paper: Brown et al. (2020). Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165. https://arxiv.org/abs/2005.14165
- "Attention Is All You Need" (Vaswani et al., 2017) is all you need for understanding the Transformer architecture (see also Lecture 9 on BERT)
- Helpful video: Yannic Kilcher: "GPT-3: Language Models are Few-Shot Learners (Paper Explained)" on YouTube
Lecture 11
- Guest lecture: Nils Reimers (Hugging Face)
- Topics: Sentence BERT
- Slides as PDF
- YouTube video part 1, YouTube video part 2, YouTube video part 3
- Mandatory reading
- Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3980–3990. https://doi.org/10.18653/v1/D19-1410
Compiling slides to PDF
If you run a Linux distribution (e.g., Ubuntu 20.04 or newer), all required packages are provided as part of TeX Live. Install the following packages:
$ sudo apt-get install texlive-latex-recommended texlive-pictures texlive-latex-extra \
texlive-fonts-extra texlive-bibtex-extra texlive-humanities texlive-science \
texlive-luatex biber wget unzip -y
Install the Fira Sans fonts required by the Beamer template locally (this uses the wget and unzip tools installed above):
$ wget https://github.com/mozilla/Fira/archive/refs/tags/4.106.zip -O 4.106.zip \
&& unzip -o 4.106.zip && mkdir -p ~/.fonts/FiraSans && cp Fira-4.106/otf/Fira* \
~/.fonts/FiraSans/ && rm -rf Fira-4.106 && rm 4.106.zip && fc-cache -f -v && mktexlsr
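As a quick sanity check, you can ask fontconfig whether the fonts are now visible (fc-list ships with fontconfig, the same package that provides the fc-cache tool used above):
$ fc-list | grep -i "fira sans"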
Compile each lecture's slides with lualatex, running from inside the corresponding lecture directory (e.g., latex/lecture01):
$ lualatex dl4nlp2021-lecture*.tex && biber dl4nlp2021-lecture*.bcf && \
lualatex dl4nlp2021-lecture*.tex && lualatex dl4nlp2021-lecture*.tex
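To build all lectures in one go, a small shell loop such as the following should work; it assumes each lecture's sources live in their own latex/lectureNN directory inside the repository, as in the Docker example below:
$ for d in latex/lecture*/; do \
    (cd "$d" && lualatex dl4nlp2021-lecture*.tex && biber dl4nlp2021-lecture*.bcf && \
    lualatex dl4nlp2021-lecture*.tex && lualatex dl4nlp2021-lecture*.tex); \
  done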
Compiling slides using Docker
If you don't run a Linux system or don't want to mess up your LaTeX packages, I've tested compiling the slides in a Docker container.
Install Docker (https://docs.docker.com/engine/install/).
Create a folder into which you clone this repository (for example, $ mkdir -p /tmp/slides).
Run Docker with Ubuntu 20.04 interactively; mount your slides directory under /mnt in the Docker container:
$ docker run -it --rm --mount type=bind,source=/tmp/slides,target=/mnt \
ubuntu:20.04 /bin/bash
Once the container is running, update the system and install the packages and fonts as above:
# apt-get update && apt-get dist-upgrade -y && apt-get install texlive-latex-recommended \
texlive-pictures texlive-latex-extra texlive-fonts-extra texlive-bibtex-extra \
texlive-humanities texlive-science texlive-luatex biber wget unzip -y
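Note: in a fresh container, the TeX Live installation above may stop at an interactive tzdata configuration prompt; you can simply answer it, or disable prompts for the session before installing (a standard Debian/Ubuntu mechanism):
# export DEBIAN_FRONTEND=noninteractive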
Install the fonts:
# wget https://github.com/mozilla/Fira/archive/refs/tags/4.106.zip -O 4.106.zip \
&& unzip -o 4.106.zip && mkdir -p ~/.fonts/FiraSans && cp Fira-4.106/otf/Fira* \
~/.fonts/FiraSans/ && rm -rf Fira-4.106 && rm 4.106.zip && fc-cache -f -v && mktexlsr
And compile:
# cd /mnt/dl4nlp/latex/lecture01
# lualatex dl4nlp2021-lecture*.tex && biber dl4nlp2021-lecture*.bcf && \
lualatex dl4nlp2021-lecture*.tex && lualatex dl4nlp2021-lecture*.tex
which generates the PDF in your local folder (e.g., /tmp/slides).
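If you'd rather not type the container steps interactively, the whole workflow can be collapsed into a single non-interactive run. This is a sketch under the same assumptions as above (repository cloned into /tmp/slides, sources under dl4nlp/latex/lecture01):
$ docker run --rm -e DEBIAN_FRONTEND=noninteractive \
    --mount type=bind,source=/tmp/slides,target=/mnt ubuntu:20.04 /bin/bash -c '
  apt-get update && apt-get install -y texlive-latex-recommended texlive-pictures \
    texlive-latex-extra texlive-fonts-extra texlive-bibtex-extra texlive-humanities \
    texlive-science texlive-luatex biber wget unzip &&
  wget https://github.com/mozilla/Fira/archive/refs/tags/4.106.zip -O 4.106.zip &&
  unzip -o 4.106.zip && mkdir -p ~/.fonts/FiraSans && cp Fira-4.106/otf/Fira* ~/.fonts/FiraSans/ &&
  fc-cache -f && mktexlsr &&
  cd /mnt/dl4nlp/latex/lecture01 &&
  lualatex dl4nlp2021-lecture*.tex && biber dl4nlp2021-lecture*.bcf &&
  lualatex dl4nlp2021-lecture*.tex && lualatex dl4nlp2021-lecture*.tex'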