Distilling-the-knowledge-in-neural-network
Trains a student network using knowledge obtained from a larger, pre-trained teacher network.
This is an implementation of the paper "Distilling the Knowledge in a Neural Network" arXiv preprint arXiv:1503.02531v1 (2015).
Running distill.py first trains a CNN for 20k steps and then uses that network's predictions as soft targets for a student network consisting of a single fully connected hidden layer. The student network trained in this way achieves a test accuracy of 96.55%.
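The core idea is the distillation loss from the paper: the student is trained on a weighted combination of cross-entropy against the teacher's temperature-softened outputs and ordinary cross-entropy against the true labels. Below is a minimal sketch of that loss, written in PyTorch for illustration; the temperature and alpha values are assumptions and may not match what distill.py actually uses.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """Soft-target / hard-label loss from Hinton et al. (2015).

    temperature and alpha are illustrative values, not necessarily
    the ones used in distill.py.
    """
    # Softened teacher probabilities and student log-probabilities.
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    soft_student = F.log_softmax(student_logits / temperature, dim=1)

    # KL divergence between the softened distributions; the T^2 factor
    # keeps its gradient magnitude comparable to the hard-label term,
    # as recommended in the paper.
    soft_loss = F.kl_div(soft_student, soft_targets,
                         reduction="batchmean") * (temperature ** 2)

    # Ordinary cross-entropy on the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```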
When the same student network is trained directly, without any knowledge from the teacher, it achieves a test accuracy of only 94.08%. This can be reproduced by running student.py.
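For reference, a student with a single fully connected hidden layer might look like the sketch below; the hidden width of 800 and input size of 784 (flattened 28x28 images) are assumptions and may differ from the architecture defined in student.py.

```python
import torch.nn as nn

class StudentNet(nn.Module):
    """Single-hidden-layer fully connected student (layer sizes assumed)."""

    def __init__(self, in_dim=784, hidden_dim=800, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),  # raw logits
        )

    def forward(self, x):
        return self.net(x)
```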
Thus, using the knowledge from another network yields an improvement in test accuracy of roughly 2.5 percentage points.