Efficient Deep Learning Systems course

Max Ryabinin

Last update: Dec 29, 2022

Efficient Deep Learning Systems

This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Science of HSE University and Yandex School of Data Analysis.

Syllabus

Week 1: Introduction
- Lecture: Course overview and organizational details. Core concepts of the GPU architecture and CUDA API.
- Seminar: CUDA operations in PyTorch. Introduction to benchmarking.
Week 2: Basics of distributed ML
- Lecture: Introduction to distributed training. Process-based communication. Parameter Server architecture.
- Seminar: Multiprocessing basics. Parallel GloVe training.
Week 3: Data-parallel training and All-Reduce
- Lecture: Data-parallel training of neural networks. All-Reduce and its efficient implementations.
- Seminar: Introduction to PyTorch Distributed. Data-parallel training primitives.
Week 4: Memory-efficient and model-parallel training
Week 5: Profiling DL code, training-time optimizations
Week 6: Basics of Python application deployment
Week 7: Software for serving neural networks
Week 8: Optimizing models for faster inference
Week 9: Experiment tracking, model and data versioning
Week 10: Testing, debugging and monitoring of models
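
As a rough illustration of the Week 3 topic, the sketch below simulates a ring-based sum All-Reduce in plain Python. It is not taken from the course materials: the function name `ring_all_reduce` and the list-of-lists representation of workers are assumptions made for this example, and a real setup would more likely rely on a communication library such as PyTorch Distributed. Each simulated worker first accumulates partial sums chunk by chunk (reduce-scatter), then the finished chunks are passed around the ring (all-gather).

```python
def ring_all_reduce(worker_data):
    """Simulate a sum All-Reduce over a ring of workers (hypothetical helper).

    worker_data: list of equal-length lists of numbers, one list per worker.
    Returns a new list of lists in which every worker holds the elementwise sum.
    """
    n = len(worker_data)
    length = len(worker_data[0])
    # Copy the inputs so the caller's data is left untouched.
    bufs = [list(v) for v in worker_data]
    # Split the index range into n contiguous chunks (some may be empty).
    bounds = [(length * k) // n for k in range(n + 1)]
    chunks = [range(bounds[k], bounds[k + 1]) for k in range(n)]

    # Phase 1: reduce-scatter. After n - 1 steps, worker i holds the
    # fully reduced chunk with index (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing messages first to mimic simultaneous sends.
        msgs = []
        for i in range(n):
            c = (i - step) % n
            msgs.append((c, [bufs[i][j] for j in chunks[c]]))
        for i in range(n):
            dst = (i + 1) % n
            c, payload = msgs[i]
            for j, val in zip(chunks[c], payload):
                bufs[dst][j] += val

    # Phase 2: all-gather. Each finished chunk travels around the ring
    # and overwrites the stale partial sums on the other workers.
    for step in range(n - 1):
        msgs = []
        for i in range(n):
            c = (i + 1 - step) % n
            msgs.append((c, [bufs[i][j] for j in chunks[c]]))
        for i in range(n):
            dst = (i + 1) % n
            c, payload = msgs[i]
            for j, val in zip(chunks[c], payload):
                bufs[dst][j] = val
    return bufs


# Usage: three simulated workers, each with a four-element "gradient".
grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
print(ring_all_reduce(grads)[0])  # [111, 222, 333, 444]
```

In this scheme each worker sends n - 1 chunks per phase, so the amount of data sent per worker stays close to twice the vector size regardless of the number of workers, which is one reason ring-style implementations are commonly discussed alongside data-parallel training.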

Grading

There will be a total of 4 home assignments (some of them spread over several weeks). The final grade is a weighted sum of per-assignment grades. Please refer to the course page of your institution for details.

Staff
