DataFree

Overview

A benchmark of data-free knowledge distillation from the IJCAI-2021 paper "Contrastive Model Inversion for Data-Free Knowledge Distillation".

Authors: Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, Mingli Song

Figure: inverted samples produced by CMI (this work), DeepInv, ZSKT, and DFQ.

Results

1. CIFAR-10

Each column is a teacher → student pair; entries are top-1 accuracy (%).

| Method | resnet-34 → resnet-18 | vgg-11 → resnet-18 | wrn-40-2 → wrn-16-1 | wrn-40-2 → wrn-40-1 | wrn-40-2 → wrn-16-2 |
|---|---|---|---|---|---|
| T. Scratch | 95.70 | 92.25 | 94.87 | 94.87 | 94.87 |
| S. Scratch | 95.20 | 95.20 | 91.12 | 93.94 | 93.95 |
| DAFL | 92.22 | 81.10 | 65.71 | 81.33 | 81.55 |
| ZSKT | 93.32 | 89.46 | 83.74 | 86.07 | 89.66 |
| DeepInv | 93.26 | 90.36 | 83.04 | 86.85 | 89.72 |
| DFQ | 94.61 | 90.84 | 86.14 | 91.69 | 92.01 |
| CMI | 94.84 | 91.13 | 90.01 | 92.78 | 92.52 |

2. CIFAR-100

| Method | resnet-34 → resnet-18 | vgg-11 → resnet-18 | wrn-40-2 → wrn-16-1 | wrn-40-2 → wrn-40-1 | wrn-40-2 → wrn-16-2 |
|---|---|---|---|---|---|
| T. Scratch | 78.05 | 71.32 | 75.83 | 75.83 | 75.83 |
| S. Scratch | 77.10 | 77.01 | 65.31 | 72.19 | 73.56 |
| DAFL | 74.47 | 57.29 | 22.50 | 34.66 | 40.00 |
| ZSKT | 67.74 | 34.72 | 30.15 | 29.73 | 28.44 |
| DeepInv | 61.32 | 54.13 | 53.77 | 61.33 | 61.34 |
| DFQ | 77.01 | 68.32 | 54.77 | 62.92 | 59.01 |
| CMI | 77.04 | 70.56 | 57.91 | 68.88 | 68.75 |

Quick Start

1. Visualize the inverted samples

Results will be saved as checkpoints/datafree-cmi/synthetic-cmi_for_vis.png

bash scripts/cmi/cmi_cifar10_for_vis.sh

2. Reproduce our results

Note: This repo was refactored from our experimental code and is still under development. I'm still struggling to find appropriate hyper-parameters for every method (°ー°〃). So far we only provide the hyper-parameters to reproduce the CIFAR-10 results for wrn-40-2 => wrn-16-1. You may need to tune them for other models and datasets. More resources will be uploaded in future updates.

To reproduce our results, please download the pre-trained teacher models from Dropbox-Models (266 MB) and extract them to checkpoints/pretrained. A pre-inverted dataset with ~50k samples is also available for the wrn-40-2 teacher on CIFAR-10; you can download it from Dropbox-Data (133 MB) and extract it to run/cmi-preinverted-wrn402/.

  • Non-adversarial CMI: you can train a student model directly on the pre-inverted data. It should reach ~87.38% accuracy on CIFAR-10, as reported in Figure 3.

    bash scripts/cmi/nonadv_cmi_cifar10_wrn402_wrn161.sh
    
  • Adversarial CMI: alternatively, you can run adversarial distillation on top of the pre-inverted data, where ~10k (256x40) new samples are generated to improve the student. It should reach ~90.01% accuracy on CIFAR-10, as reported in Table 1. (A sketch of the adversarial objective follows this list.)

    bash scripts/cmi/adv_cmi_cifar10_wrn402_wrn161.sh
    
  • Scratch CMI: you can also run the CMI algorithm without any pre-inverted data, but the student may overfit to early samples due to the limited amount of data. It should reach ~88.82% accuracy on CIFAR-10, slightly worse than our reported result (90.01%).

    bash scripts/cmi/scratch_cmi_cifar10_wrn402_wrn161.sh
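
For intuition, one generator update of the adversarial stage looks roughly like the following. This is a minimal sketch of data-free adversarial distillation, not the repo's exact code; CMI additionally applies BN, one-hot, and contrastive-regularization terms, and the argument names here are assumptions.

    import torch
    import torch.nn.functional as F

    def generator_step(generator, teacher, student, g_optimizer,
                       z_dim=256, batch_size=256, T=20.0):
        """One adversarial step: craft samples on which teacher and student
        disagree the most (sketch; hyper-parameter names are assumptions)."""
        z = torch.randn(batch_size, z_dim)
        x = generator(z)
        t_logits = teacher(x)   # no torch.no_grad(): gradients must flow back through x
        s_logits = student(x)
        # The generator maximizes the student-teacher divergence (hence the
        # minus sign); the student later minimizes it on the generated data.
        loss_g = -F.kl_div(F.log_softmax(s_logits / T, dim=1),
                           F.softmax(t_logits / T, dim=1),
                           reduction="batchmean")
        g_optimizer.zero_grad()
        loss_g.backward()
        g_optimizer.step()
        return x.detach()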
    

3. Scratch training

python train_scratch.py --model wrn40_2 --dataset cifar10 --batch-size 256 --lr 0.1 --epoch 200 --gpu 0

4. Vanilla KD

# KD with original training data (beta>0 to use hard targets)
python vanilla_kd.py --teacher wrn40_2 --student wrn16_1 --dataset cifar10 --transfer_set cifar10 --beta 0.1 --batch-size 128 --lr 0.1 --epoch 200 --gpu 0 

# KD with unlabeled data
python vanilla_kd.py --teacher wrn40_2 --student wrn16_1 --dataset cifar10 --transfer_set cifar100 --beta 0 --batch-size 128 --lr 0.1 --epoch 200 --gpu 0 

# KD with unlabeled data from a specified folder
python vanilla_kd.py --teacher wrn40_2 --student wrn16_1 --dataset cifar10 --transfer_set run/cmi --beta 0 --batch-size 128 --lr 0.1 --epoch 200 --gpu 0 
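
The --beta flag mixes hard and soft targets in the usual KD fashion. A minimal sketch of this weighting follows (our paraphrase of standard KD, not necessarily the exact loss in vanilla_kd.py; T is the distillation temperature):

    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, targets, T=20.0, beta=0.1):
        """Soft targets (KL at temperature T) plus beta-weighted hard targets.
        Sketch of standard KD, not necessarily the exact loss in vanilla_kd.py."""
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)  # T^2 keeps soft-target gradients comparable across temperatures
        if beta > 0:  # beta=0 for unlabeled transfer sets (no ground-truth labels)
            return soft + beta * F.cross_entropy(student_logits, targets)
        return soft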

5. Data-free KD

bash scripts/xxx/xxx.sh # e.g. scripts/zskt/zskt_cifar10_wrn402_wrn161.sh

Hyper-parameters used by different methods:

| Method | adv | bn | oh | balance | act | cr | GAN | Example |
|---|---|---|---|---|---|---|---|---|
| DAFL | - | - | ✓ | ✓ | ✓ | - | ✓ | scripts/dafl_cifar10.sh |
| ZSKT | ✓ | - | - | - | - | - | ✓ | scripts/zskt_cifar10.sh |
| DeepInv | ✓ | ✓ | ✓ | - | - | - | - | scripts/deepinv_cifar10.sh |
| DFQ | ✓ | ✓ | ✓ | ✓ | - | - | ✓ | scripts/dfq_cifar10.sh |
| CMI | ✓ | ✓ | ✓ | - | - | ✓ | ✓ | scripts/cmi_cifar10_scratch.sh |

Here adv is the adversarial distillation loss, bn the BatchNorm-statistics loss, oh the one-hot classification loss, balance the class-balance (information-entropy) loss, act the activation loss, cr the contrastive regularization introduced by CMI, and GAN indicates generator-based synthesis.

6. Use your models/datasets

You can register your own models and datasets in registry.py by modifying NORMALIZE_DICT, MODEL_DICT and get_dataset, and then run the commands above to train them. As DAFL requires intermediate features from the penultimate layer, your model should accept a return_features=True argument and return a (logits, features) tuple for DAFL, as sketched below.
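
A registrable model with this feature interface might look like the following (a minimal sketch; MyNet and 'mydataset' are hypothetical, and the exact layout of the dicts in registry.py may differ):

    import torch.nn as nn

    class MyNet(nn.Module):
        """Hypothetical model illustrating the return_features contract."""
        def __init__(self, num_classes=10):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.fc = nn.Linear(32, num_classes)

        def forward(self, x, return_features=False):
            feat = self.backbone(x)   # penultimate-layer features
            logits = self.fc(feat)
            if return_features:
                return logits, feat   # (logits, features) tuple, as DAFL expects
            return logits

    # In registry.py (assumed layout, matching the built-in entries):
    # NORMALIZE_DICT['mydataset'] = dict(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
    # MODEL_DICT['mynet'] = MyNet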

7. Implement your algorithms

Your algorithm should inherit datafree.synthesis.BaseSynthesizer and implement two interfaces: 1) BaseSynthesizer.synthesize, which takes several optimization steps to craft new samples and returns an image dict for visualization; 2) BaseSynthesizer.sample, which fetches a batch of training data for KD. A skeleton is sketched below.
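
A skeleton might look like this (a sketch against the two interfaces above; the real base-class constructor signature may differ):

    import torch
    from datafree.synthesis import BaseSynthesizer

    class MySynthesizer(BaseSynthesizer):
        """Sketch of a custom synthesizer; constructor arguments are assumptions."""
        def __init__(self, teacher, student, generator, z_dim=256,
                     synthesis_batch_size=256, sample_batch_size=128):
            super().__init__(teacher, student)
            self.generator = generator
            self.z_dim = z_dim
            self.synthesis_batch_size = synthesis_batch_size
            self.sample_batch_size = sample_batch_size
            self.data_pool = []

        def synthesize(self):
            # Take several optimization steps to craft new samples
            # (inversion losses such as bn/oh/adv/cr would go here).
            z = torch.randn(self.synthesis_batch_size, self.z_dim)
            x = self.generator(z)
            self.data_pool.append(x.detach())
            return {'synthetic': x.detach()}  # image dict for visualization

        def sample(self):
            # Fetch a batch of training data for KD from the pool.
            pool = torch.cat(self.data_pool, dim=0)
            idx = torch.randint(len(pool), (self.sample_batch_size,))
            return pool[idx]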

Citation

If you find this work useful for your research, please cite our paper:

@misc{fang2021contrastive,
      title={Contrastive Model Inversion for Data-Free Knowledge Distillation}, 
      author={Gongfan Fang and Jie Song and Xinchao Wang and Chengchao Shen and Xingen Wang and Mingli Song},
      year={2021},
      eprint={2105.08584},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

Comments
  • About TV loss

    Hi~ Thank you for this great work.

    My question is about the TV loss. Could you explain why you take the mean when computing the TV loss? The paper does not mention the 'mean'. Thank you.

    https://github.com/zju-vipa/CMI/blob/9e79fa9e2328205f26dbdb226878f3a28f3bf4cc/datafree/criterions.py#L48
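
    For reference, a mean-reduced total-variation penalty looks roughly like this (a sketch, not the linked code verbatim; the mean keeps the penalty's scale independent of image resolution and batch size, unlike a sum reduction):

        import torch

        def tv_loss(img: torch.Tensor) -> torch.Tensor:
            """Total variation with mean reduction (sketch, not the linked code).
            img: (N, C, H, W). Mean reduction makes the value independent of
            image size and batch size, so the loss weight transfers across
            resolutions."""
            dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
            dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
            return dh + dw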

    opened by mountains-high 4
  • The accuracy of code reproduction

    I tried to reproduce your impressive work. I simply ran 'bash scripts/cmi/adv_cmi_cifar10_wrn402_wrn161.sh' but got lower accuracy than the paper:

    Use GPU: 0 for training
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: method: cmi
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: adv: 0.5
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: bn: 1.0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: oh: 0.5
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: act: 0.0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: balance: 0.0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: save_dir: run/adv_cmi_cifar10
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: cr: 0.8
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: cr_T: 0.1
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: cmi_init: run/cmi-preinverted-wrn402
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: data_root: data
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: teacher: wrn40_2
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: student: wrn16_1
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: dataset: cifar10
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: lr: 0.1
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: lr_decay_milestones: 25,30,35
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: lr_g: 0.001
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: T: 20.0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: epochs: 40
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: g_steps: 200
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: kd_steps: 2000
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: ep_steps: 2000
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: resume: 
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: evaluate_only: False
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: batch_size: 128
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: synthesis_batch_size: 256
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: gpu: 0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: world_size: -1
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: rank: -1
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: dist_url: tcp://224.66.41.62:23456
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: dist_backend: nccl
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: multiprocessing_distributed: False
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: fp16: False
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: seed: None
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: log_tag: -adv_cmi_cifar10
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: workers: 4
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: start_epoch: 0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: momentum: 0.9
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: weight_decay: 0.0001
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: print_freq: 0
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: pretrained: False
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: distributed: False
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: ngpus_per_node: 1
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: autocast: <function dummy_ctx at 0x7f61b2802560>
    [08/27 20:15:02 cifar10-wrn40_2-wrn16_1]: logger: <Logger cifar10-wrn40_2-wrn16_1 (DEBUG)>
    Files already downloaded and verified
    Files already downloaded and verified
    CMI dims: 2704
    [08/27 20:20:43 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=0 Acc@1=23.6500 Acc@5=76.6300 Loss=4.2118 Lr=0.1000
    [08/27 20:24:23 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=1 Acc@1=31.9100 Acc@5=81.9000 Loss=3.3887 Lr=0.0998
    [08/27 20:27:23 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=2 Acc@1=36.4000 Acc@5=85.6800 Loss=3.0814 Lr=0.0994
    [08/27 20:30:02 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=3 Acc@1=42.9800 Acc@5=88.9000 Loss=2.7402 Lr=0.0986
    [08/27 20:32:29 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=4 Acc@1=45.4600 Acc@5=91.3300 Loss=2.6151 Lr=0.0976
    [08/27 20:34:49 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=5 Acc@1=51.9300 Acc@5=93.0500 Loss=2.3906 Lr=0.0962
    [08/27 20:37:04 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=6 Acc@1=52.5100 Acc@5=91.0300 Loss=2.4513 Lr=0.0946
    [08/27 20:39:13 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=7 Acc@1=56.1100 Acc@5=94.8800 Loss=2.0773 Lr=0.0926
    [08/27 20:41:20 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=8 Acc@1=58.4500 Acc@5=95.4500 Loss=2.1580 Lr=0.0905
    [08/27 20:43:24 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=9 Acc@1=64.0700 Acc@5=96.5000 Loss=1.6752 Lr=0.0880
    [08/27 20:45:27 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=10 Acc@1=61.5900 Acc@5=96.7800 Loss=1.9675 Lr=0.0854
    [08/27 20:47:27 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=11 Acc@1=65.8900 Acc@5=96.1100 Loss=1.6139 Lr=0.0825
    [08/27 20:49:27 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=12 Acc@1=66.5500 Acc@5=96.0800 Loss=1.6001 Lr=0.0794
    [08/27 20:51:26 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=13 Acc@1=69.8900 Acc@5=97.6900 Loss=1.3662 Lr=0.0761
    [08/27 20:53:24 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=14 Acc@1=66.6300 Acc@5=97.1500 Loss=1.5617 Lr=0.0727
    [08/27 20:55:21 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=15 Acc@1=72.4200 Acc@5=98.0800 Loss=1.1806 Lr=0.0691
    [08/27 20:57:19 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=16 Acc@1=72.5100 Acc@5=98.1300 Loss=1.2018 Lr=0.0655
    [08/27 20:59:15 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=17 Acc@1=68.6100 Acc@5=98.2000 Loss=1.5399 Lr=0.0617
    [08/27 21:01:11 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=18 Acc@1=74.9700 Acc@5=98.4700 Loss=1.0825 Lr=0.0578
    [08/27 21:03:06 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=19 Acc@1=70.5000 Acc@5=97.9700 Loss=1.4037 Lr=0.0539
    [08/27 21:05:01 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=20 Acc@1=74.3100 Acc@5=98.4500 Loss=1.1729 Lr=0.0500
    [08/27 21:06:55 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=21 Acc@1=76.1000 Acc@5=98.3600 Loss=1.0403 Lr=0.0461
    [08/27 21:08:49 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=22 Acc@1=76.4900 Acc@5=98.1800 Loss=1.0029 Lr=0.0422
    [08/27 21:10:42 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=23 Acc@1=77.2500 Acc@5=98.5500 Loss=1.0258 Lr=0.0383
    [08/27 21:12:34 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=24 Acc@1=76.9900 Acc@5=98.7200 Loss=0.9811 Lr=0.0345
    [08/27 21:14:27 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=25 Acc@1=77.7800 Acc@5=98.5100 Loss=0.9825 Lr=0.0309
    [08/27 21:16:19 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=26 Acc@1=77.1600 Acc@5=98.5800 Loss=1.0215 Lr=0.0273
    [08/27 21:18:09 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=27 Acc@1=78.0100 Acc@5=98.4900 Loss=0.9878 Lr=0.0239
    [08/27 21:20:00 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=28 Acc@1=77.0600 Acc@5=98.7700 Loss=1.0611 Lr=0.0206
    [08/27 21:21:52 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=29 Acc@1=78.3700 Acc@5=98.7500 Loss=1.0098 Lr=0.0175
    [08/27 21:23:42 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=30 Acc@1=78.0200 Acc@5=98.8700 Loss=1.0001 Lr=0.0146
    [08/27 21:25:32 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=31 Acc@1=79.6000 Acc@5=99.0600 Loss=0.8993 Lr=0.0120
    [08/27 21:27:23 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=32 Acc@1=80.0800 Acc@5=99.1700 Loss=0.8516 Lr=0.0095
    [08/27 21:29:13 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=33 Acc@1=80.1500 Acc@5=99.1100 Loss=0.8723 Lr=0.0074
    [08/27 21:31:03 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=34 Acc@1=81.4100 Acc@5=99.0800 Loss=0.8326 Lr=0.0054
    [08/27 21:32:53 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=35 Acc@1=81.4900 Acc@5=99.1500 Loss=0.8322 Lr=0.0038
    [08/27 21:34:43 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=36 Acc@1=81.2300 Acc@5=99.1200 Loss=0.8257 Lr=0.0024
    [08/27 21:36:32 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=37 Acc@1=81.7300 Acc@5=99.2600 Loss=0.8082 Lr=0.0014
    [08/27 21:38:21 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=38 Acc@1=81.6900 Acc@5=99.2500 Loss=0.7939 Lr=0.0006
    [08/27 21:40:10 cifar10-wrn40_2-wrn16_1]: [Eval] Epoch=39 Acc@1=81.8800 Acc@5=99.2300 Loss=0.7991 Lr=0.0002
    [08/27 21:40:10 cifar10-wrn40_2-wrn16_1]: Best: 81.8800
    
    opened by Sharpiless 4
  • Failed to reproduce.

    I re-ran the code from pre-training onward and got the following results:

    | Model | Method | Dataset | top1-accuracy |
    |:----|:----|:----|:----|
    | ResNet34* | Teacher | Cifar-10 | 93.94 (95.70) |
    | wrn_40_2 | Teacher | Cifar-10 | 92.01 (94.87) |

    For the DFQ algorithm, the results are as follows:

    | Model | Data-Free Method | Student-Loss | Generative-Loss | Dataset | top1-accuracy |
    |:----|:----|:----|:----|:----|:----|
    | ResNet34-ResNet18 | DFQ (Baseline) | KL | adv+bn+oh | Cifar-10 | 88.89 (94.61) |
    | ResNet34-ResNet18 | DFQ (Baseline) | KL | adv+bn+oh | Cifar-100 | 1.89 (77.01) |

    For the above results, I reproduced DFQ based on your code, but the results are poor. I would be grateful if you could kindly give me your advice.

    opened by Sharpiless 3
  • About running environment

    Hi there~

    Thank you for the fine work. Could you tell me the exact torch version, please? I looked at the requirements.txt file, but versions aren't specified, and I'm running into issues because of it.

    Thank you

    opened by mountains-high 1
  • Some questions about the result of DFQ

    I ran the code following the DFQ scripts (teacher: wrn_40_2, student: wrn_16_1, dataset: CIFAR-10), but the result is surprisingly good: the top-1 test accuracy is more than 92%, which is even better than the CMI result in the paper. I'm wondering whether this is a coincidence or something is wrong in the code?

    Thx!

    opened by ExcitingYi 0
Owner

ZJU-VIPA: Laboratory of Visual Intelligence and Pattern Analysis
Code for the IJCAI 2021 paper "Structure Guided Lane Detection"

SGNet Project for the IJCAI 2021 paper "Structure Guided Lane Detection" Abstract Recently, lane detection has made great progress with the rapid deve

Jinming Su 27 Dec 8, 2022
PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021.

PAML PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021. (Continuously updating ) Int

null 15 Nov 18, 2022
Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

Box_Discretization_Network This repository is built on the pytorch [maskrcnn_benchmark]. The method is the foundation of our ReCTs-competition method

Yuliang Liu 266 Nov 24, 2022
Official PyTorch implementation of "RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on" (IJCAI-ECAI 2022)

RMGN-VITON RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on In IJCAI-ECAI 2022(short oral). [Paper] [Supplementary Material] Abstra

null 27 Dec 1, 2022
Disentangled Face Attribute Editing via Instance-Aware Latent Space Search, accepted by IJCAI 2021.

Instance-Aware Latent-Space Search This is a PyTorch implementation of the following paper: Disentangled Face Attribute Editing via Instance-Aware Lat

null 67 Dec 21, 2022
The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation This repository is the official implementation of CVPR 2021 paper:

null 9 Nov 14, 2022
[IJCAI'21] Deep Automatic Natural Image Matting

Deep Automatic Natural Image Matting [IJCAI-21] This is the official repository of the paper Deep Automatic Natural Image Matting. Introduction | Netw

Jizhizi_Li 316 Jan 6, 2023
A PyTorch implementation of "Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning", IJCAI-21

MERIT A PyTorch implementation of our IJCAI-21 paper Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning. Depen

Graph Analysis & Deep Learning Laboratory, GRAND 32 Jan 2, 2023
DTCN IJCAI - Sequential prediction learning framework and algorithm

DTCN This is the implementation of our paper "Sequential Prediction of Social Me

Bobby 2 Jan 24, 2022
MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Multi-Graph Fusion Networks for Urban Region Embedding (IJCAI-22) This is the implementation of Multi-Graph Fusion Networks for Urban Region Embedding

null 202 Nov 18, 2022
Official PyTorch implementation of SyntaSpeech (IJCAI 2022)

SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech | | | | 中文文档 This repository is the official PyTorch implementation of our IJCAI-2022

Zhenhui YE 116 Nov 24, 2022
[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

MosaicKD Code for NeurIPS-21 paper "Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data" 1. Motivation Natural images share common l

ZJU-VIPA 37 Nov 10, 2022
Code implementation of Data Efficient Stagewise Knowledge Distillation paper.

Data Efficient Stagewise Knowledge Distillation Table of Contents Data Efficient Stagewise Knowledge Distillation Table of Contents Requirements Image

IvLabs 112 Dec 2, 2022
Official implementation for (Refine Myself by Teaching Myself : Feature Refinement via Self-Knowledge Distillation, CVPR-2021)

FRSKD Official implementation for Refine Myself by Teaching Myself : Feature Refinement via Self-Knowledge Distillation (CVPR-2021) Requirements Pytho

null 75 Dec 28, 2022
Official implementation for (Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching, AAAI-2021)

Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching Official pytorch implementation of "Show, Attend and Distill: Kn

Clova AI Research 80 Dec 16, 2022
PyTorch implementation of paper A Fast Knowledge Distillation Framework for Visual Recognition.

FKD: A Fast Knowledge Distillation Framework for Visual Recognition Official PyTorch implementation of paper A Fast Knowledge Distillation Framework f

Zhiqiang Shen 129 Dec 24, 2022
Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

DongGeun-Yoon 19 Jun 7, 2022
Paper Title: Heterogeneous Knowledge Distillation for Simultaneous Infrared-Visible Image Fusion and Super-Resolution

HKDnet Paper Title: "Heterogeneous Knowledge Distillation for Simultaneous Infrared-Visible Image Fusion and Super-Resolution" Email: 18186470991@163.

wasteland 11 Nov 12, 2022
Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation'

OD-Rec Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation' Paper, saved teacher models and Andro

Xin Xia 11 Nov 22, 2022