CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)
Paper Link: (arXiv) (ACMMM version)
Overview
We propose Continual Representation using Distillation (CoReD), a method that employs the concepts of Continual Learning (CL), Representation Learning (RL), and Knowledge Distillation (KD).
Comparison Baselines
- Transfer Learning (TL) : The first baseline is transfer learning, where we fine-tune the model to learn the new task.
- Transferable GAN-generated Images Detection Framework (TG) : The second baseline is a KD-based GAN-image detection framework using L2-SP and self-training.
- Distillation Loss (DL) : The third baseline is part of our ablation study, where we use only the distillation-loss component of our CoReD loss function to perform incremental learning.
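For illustration, a KD-style objective of the kind the DL baseline isolates can be sketched as a weighted sum of cross-entropy on the new task and a temperature-scaled soft-label term against a frozen teacher. This is a generic sketch, not the exact CoReD loss; the function name, temperature `T`, and weighting follow the common Hinton-style formulation, with `alpha` playing the role of the `-a` flag below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, T=2.0):
    """Generic KD loss sketch (not the exact CoReD objective):
    alpha weights the soft-label term, (1 - alpha) the hard-label term."""
    # Hard-label cross-entropy on the current (target) task
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label KL term keeping the student close to the frozen teacher,
    # with the usual T^2 scaling to keep gradient magnitudes comparable
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * kd + (1.0 - alpha) * ce
```

When the student and teacher agree exactly, the KD term vanishes and the loss reduces to `(1 - alpha)` times the plain cross-entropy.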
Requirements and Installation
We recommend installing the dependencies listed in the requirements.txt provided in this repository.
```
python==3.8.0
torchvision==0.9.1
torch==1.8.1
sklearn
numpy
opencv_python
```

```shell
pip install -r requirements.txt
```
- Train & Evaluation
- Full Usages
-m Model name: ['CoReD','KD','TG','FT']
-te Enable test mode (True/False)
-s Name(s) of 'Source' dataset(s); one or multiple names (e.g. DeepFake / DeepFake_Face2Face / DeepFake_Face2Face_FaceSwap)
-t Name of 'Target' dataset; a single name only (e.g. DeepFake / Face2Face / FaceSwap / NeuralTextures); used for training only
-folder1 Sub-folder name under the save path where the model is saved
-folder2 Name of a folder created inside folder1 (optional)
-d Dataset root path; must contain the Source & Target folder names
-w Full path to a '.pth' weight file, or the folder containing it
-lr Learning rate (for training)
-a Alpha of the KD loss
-nc Number of classes
-ns Number of stores
-me Number of epochs (for training)
-nb Batch size
-ng GPU device(s); can be set as e.g. 0,1,2 for multi-GPU (default=0)
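For reference, the flags above could be wired up with `argparse` roughly as follows. This is a sketch only: the option names mirror the list above, but the defaults and the actual parser in `main.py` may differ.

```python
import argparse

def build_parser():
    # Illustrative parser mirroring the documented flags; defaults are
    # placeholders, not necessarily those used by main.py.
    p = argparse.ArgumentParser(description="CoReD training / evaluation")
    p.add_argument("-m", "--model", default="CoReD",
                   choices=["CoReD", "KD", "TG", "FT"], help="method name")
    p.add_argument("-te", "--test", action="store_true", help="test mode")
    p.add_argument("-s", "--source", help="source dataset name(s), joined by '_'")
    p.add_argument("-t", "--target", help="single target dataset name (train only)")
    p.add_argument("-d", "--data", help="root folder containing source/target datasets")
    p.add_argument("-w", "--weights", help="path to a .pth file or its folder")
    p.add_argument("-lr", type=float, default=1e-4, help="learning rate")
    p.add_argument("-a", "--alpha", type=float, default=0.5, help="alpha of KD loss")
    p.add_argument("-nc", type=int, default=2, help="number of classes")
    p.add_argument("-ns", type=int, default=1, help="number of stores")
    p.add_argument("-me", type=int, default=20, help="number of epochs")
    p.add_argument("-nb", type=int, default=64, help="batch size")
    p.add_argument("-ng", default="0", help="GPU device id(s), e.g. 0,1,2")
    return p
```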
- Train
To train and evaluate the model(s) in the paper, run this command:
- Task1 : First, train a single pre-trained model for Task 1.
python main.py -s={Source Name} -d={folder_path} -w={weights}
python main.py -s=DeepFake -d=./mydrive/dataset/ #Example
- Task2 - 4
python main.py -s={Source Name} -t={Target Name} -d={folder_path} -w={weights}
python main.py -s=Face2Face_DeepFake -t=FaceSwap -d=./mydrive/dataset/ -w=./weights #Example
- Note that if you start training with -s=Face2Face_DeepFake -t=FaceSwap -d=./mydrive/dataset -w=./weights, the data path "./mydrive/dataset" must contain the folders 'Face2Face', 'DeepFake', and 'FaceSwap', and each of these must contain 'train' and 'val' folders, which in turn hold 'real' & 'fake' folders.
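The expected layout, `<root>/<dataset>/{train,val}/{real,fake}`, can be sanity-checked before training with a few lines of Python. This helper is illustrative and not part of the repository:

```python
from pathlib import Path

def check_dataset_layout(root, datasets):
    """Return the list of missing folders for the layout
    <root>/<dataset>/{train,val}/{real,fake} described above.
    (Hypothetical helper, not part of main.py.)"""
    missing = []
    for name in datasets:
        for split in ("train", "val"):
            for label in ("real", "fake"):
                d = Path(root) / name / split / label
                if not d.is_dir():
                    missing.append(str(d))
    return missing
```

An empty return value means every source and target dataset folder is in place.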
- Evaluation
After training the model, you can evaluate it on a dataset.
- Eval
python main.py -d={dataset_path} -w={weights} --test
python main.py -d=./mydrive/dataset/DeepFake/testset -w=./weights/bestmodel.pth --test #Example
- Result
- AUC scores (%) of various methods on compared datasets.
- Task1 (GAN datasets and FaceForensics++ datasets)
- Task2 - 4
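The AUC metric reported above can be computed with scikit-learn (listed in the requirements). A minimal sketch, assuming per-sample fake-class scores and binary real/fake labels:

```python
from sklearn.metrics import roc_auc_score

def auc_percent(y_true, y_scores):
    # ROC-AUC expressed as a percentage, matching the tables above.
    # y_true: binary labels (1 = fake), y_scores: predicted fake probabilities.
    return 100.0 * roc_auc_score(y_true, y_scores)
```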
Citation
If you find our work useful for your research, please consider citing the following paper :)
```bibtex
@misc{kim2021cored,
  title={CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation},
  author={Minha Kim and Shahroz Tariq and Simon S. Woo},
  year={2021},
  eprint={2107.02408},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
- Contact
If you have any questions, please contact us at kimminha/[email protected]
- License
The code is released under the MIT license. Copyright (c) 2021