Benchmarks for semi-supervised domain generalization.

Overview

Semi-Supervised Domain Generalization

This code is the official implementation of the following paper: Semi-Supervised Domain Generalization with Stochastic StyleMatch. The paper addresses a practical and yet under-studied setting for domain generalization: one needs to use limited labeled data along with abundant unlabeled data gathered from multiple distinct domains to learn a generalizable model. This setting greatly challenges existing domain generalization methods, which are not designed to deal with unlabeled data and are thus less scalable in practice. Our approach, StyleMatch, extends the pseudo-labeling-based FixMatch—a state-of-the-art semi-supervised learning framework—in two crucial ways: 1) a stochastic classifier is designed to reduce overfitting and 2) the two-view consistency learning paradigm in FixMatch is upgraded to a multi-view version with style augmentation as the third complementary view. Two benchmarks are constructed for evaluation. Please see the paper at https://arxiv.org/abs/2106.00592 for more details.
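The multi-view consistency idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the repository's training code: the function names are hypothetical, the 0.95 confidence threshold is the default from the FixMatch paper and may differ from the configs here, and the stochastic classifier is omitted. The weakly augmented view produces a pseudo-label, and every other view (for example the strongly augmented and the style-augmented one) is trained to predict it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of a single prediction against a hard label."""
    return -math.log(softmax(logits)[target])


def multi_view_pseudo_label_loss(weak_logits, other_views_logits, threshold=0.95):
    """Pseudo-label from the weak view, applied to every other view.

    Returns (loss, pseudo_label, kept). The loss is 0.0 when the weak
    prediction is not confident enough (threshold is an assumed default).
    """
    probs = softmax(weak_logits)
    confidence = max(probs)
    pseudo_label = probs.index(confidence)
    if confidence < threshold:
        return 0.0, pseudo_label, False
    loss = sum(cross_entropy(view, pseudo_label) for view in other_views_logits)
    return loss, pseudo_label, True


# Hypothetical logits for one unlabeled image with three classes.
weak = [6.0, 0.5, 0.1]
strong = [2.0, 1.0, 0.5]
style = [1.5, 1.2, 0.3]
loss, label, kept = multi_view_pseudo_label_loss(weak, [strong, style])
print(f"pseudo-label={label} kept={kept} loss={loss:.4f}")
```

With a single entry in other_views_logits this reduces to the two-view FixMatch case; adding the style-augmented view is what makes it multi-view.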

How to set up the environment

This code is built on top of Dassl.pytorch. Please follow the instructions at https://github.com/KaiyangZhou/Dassl.pytorch to install the dassl environment and to prepare the datasets, PACS and OfficeHome. The five random labeled-unlabeled splits can be downloaded at the following links: pacs, officehome. Extract the splits into the corresponding dataset folders. Assuming the datasets are placed under the directory $DATA, the structure should look like this:

$DATA/
    pacs/
        images/
        splits/
        splits_ssdg/
    office_home_dg/
        art/
        clipart/
        product/
        real_world/
        splits_ssdg/
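Before launching training, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repository: the function name missing_dirs is hypothetical, and reading DATA from the environment is an assumption of this example.

```python
import os
from pathlib import Path

# Expected sub-directories, copied from the tree above.
EXPECTED_LAYOUT = {
    "pacs": ["images", "splits", "splits_ssdg"],
    "office_home_dg": ["art", "clipart", "product", "real_world", "splits_ssdg"],
}


def missing_dirs(data_root):
    """Return the expected directories (relative to data_root) that do not exist."""
    root = Path(data_root)
    missing = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        for subdir in subdirs:
            if not (root / dataset / subdir).is_dir():
                missing.append(f"{dataset}/{subdir}")
    return missing


if __name__ == "__main__":
    # DATA is read from the environment here; that is an assumption of this sketch.
    data_root = os.environ.get("DATA", ".")
    for path in missing_dirs(data_root):
        print(f"missing: {path}")
```

An empty result means every directory listed in the tree was found; it does not check the contents of those directories.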

The style augmentation is based on AdaIN and the implementation is based on this code https://github.com/naoto0804/pytorch-AdaIN. Please download the weights of the decoder and the VGG from https://github.com/naoto0804/pytorch-AdaIN and put them under a new folder ssdg-benchmark/weights.
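For readers unfamiliar with AdaIN (adaptive instance normalization): it re-normalizes each channel of a content feature map so that its mean and standard deviation match those of a style feature map. The sketch below illustrates only that statistic-matching step on plain Python lists. It is a simplified illustration, not the code from the linked repository: the function names are hypothetical, and the population variance and eps value are assumptions. In the full pipeline the features come from a VGG encoder and a decoder maps the result back to an image, which is why both sets of weights are needed.

```python
import math


def mean_std(values, eps=1e-5):
    """Mean and standard deviation of one channel (eps avoids division by zero)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)


def adain(content, style):
    """Align per-channel statistics of content features to those of style features.

    content, style: lists of channels, each channel a flat list of activations.
    """
    output = []
    for content_channel, style_channel in zip(content, style):
        c_mean, c_std = mean_std(content_channel)
        s_mean, s_std = mean_std(style_channel)
        output.append([s_std * (v - c_mean) / c_std + s_mean for v in content_channel])
    return output


# One channel of hypothetical activations for a content and a style image.
content_features = [[1.0, 2.0, 3.0, 4.0]]
style_features = [[10.0, 20.0, 30.0, 40.0]]
stylized = adain(content_features, style_features)
print(mean_std(stylized[0]))
```

After the transform, the channel keeps the spatial pattern of the content features while its mean and spread follow the style features.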

How to run StyleMatch

The script is provided in ssdg-benchmark/scripts/StyleMatch/run_ssdg.sh. You need to update the DATA variable that points to the directory where you put the datasets. There are three input arguments: DATASET, NLAB (total number of labels), and CFG. See the tables below regarding how to set the values for these variables.

Dataset           NLAB
ssdg_pacs         210 or 105
ssdg_officehome   1950 or 975

CFG   Description
v1    FixMatch + stochastic classifier + T_style
v2    FixMatch + stochastic classifier + T_style-only (i.e. no T_strong)
v3    FixMatch + stochastic classifier
v4    FixMatch

v1 refers to StyleMatch, which is our final model. See the config files in configs/trainers/StyleMatch for the detailed settings.
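The NLAB values in the table are consistent with a simple product: labels per class, times the number of classes, times the number of source domains. The sketch below works through that arithmetic. It assumes 7 classes for PACS and 65 for OfficeHome, three source domains per run (one of the four domains is held out as the target), and an even spread of labels over classes and domains; the function name is hypothetical.

```python
def total_labels(labels_per_class, num_classes, num_source_domains=3):
    """Total number of labeled images across all source domains."""
    return labels_per_class * num_classes * num_source_domains


# Assumed class counts: PACS has 7 classes, OfficeHome has 65.
for dataset, num_classes in [("ssdg_pacs", 7), ("ssdg_officehome", 65)]:
    for labels_per_class in (10, 5):
        nlab = total_labels(labels_per_class, num_classes)
        print(f"{dataset}: {labels_per_class} labels per class -> NLAB={nlab}")
```

So the larger NLAB value for each dataset corresponds to the 10-labels-per-class setting and the smaller one to the 5-labels-per-class setting.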

Here is an example. To run StyleMatch on PACS under the 10-labels-per-class setting (i.e. 210 labels in total), run the following commands in your terminal:

conda activate dassl
cd ssdg-benchmark/scripts/StyleMatch
bash run_ssdg.sh ssdg_pacs 210 v1

In this case, the code will run StyleMatch in four different setups (one per target domain), each five times (five random seeds). If you have multiple GPUs, you can modify the script to run a single experiment at a time instead of all of them, so that the experiments can be distributed across GPUs.

At the end of training, the output directory will look like this:

output/
    ssdg_pacs/
        nlab_210/
            StyleMatch/
                resnet18/
                    v1/ # contains results on four target domains
                        art_painting/ # contains five folders: seed1-5
                        cartoon/
                        photo/
                        sketch/
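The 4 x 5 grid of runs behind this tree can be enumerated explicitly. The sketch below is illustrative only and is not taken from run_ssdg.sh: the function name is hypothetical, and the seed folder names seed1 to seed5 follow the comment in the tree above. For each target domain, the remaining three domains act as sources.

```python
from pathlib import PurePosixPath

DOMAINS = ["art_painting", "cartoon", "photo", "sketch"]
SEEDS = range(1, 6)  # five random seeds


def expected_runs(root="output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1"):
    """List (target, sources, output_dir) for every target-domain/seed combination."""
    runs = []
    for target in DOMAINS:
        sources = [domain for domain in DOMAINS if domain != target]
        for seed in SEEDS:
            output_dir = PurePosixPath(root) / target / f"seed{seed}"
            runs.append((target, sources, str(output_dir)))
    return runs


for target, sources, output_dir in expected_runs():
    print(f"target={target} sources={sources} -> {output_dir}")
```

A listing like this is also a convenient starting point for splitting the 20 runs across several GPUs.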

To show the results, run:

python parse_test_res.py output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1 --multi-exp
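To make the aggregation step concrete, here is a minimal sketch of collecting one accuracy per seed and reporting the mean and standard deviation. It is not the implementation of parse_test_res.py: the regular expression is the one quoted from the parsing code in an issue on this repository, while the function names, the choice of the last matching line, and the use of the population standard deviation are assumptions of this example.

```python
import re
import statistics

# Pattern quoted from the parsing code; it expects lines such as "* accuracy: 77.0%".
ACCURACY_RE = re.compile(r"\* accuracy: ([\.\deE+-]+)%")


def final_accuracy(log_text):
    """Return the last accuracy reported in a log, or None if there is none."""
    matches = ACCURACY_RE.findall(log_text)
    return float(matches[-1]) if matches else None


def summarize(log_texts):
    """Mean and population standard deviation over logs that report an accuracy."""
    accuracies = [acc for acc in map(final_accuracy, log_texts) if acc is not None]
    if not accuracies:
        raise ValueError("no accuracy line found in any log")
    return statistics.mean(accuracies), statistics.pstdev(accuracies)


# Hypothetical log contents for three seeds; the third run never reported a result.
logs = [
    "epoch 1\n* accuracy: 70.0%\nepoch 2\n* accuracy: 77.0%",
    "* accuracy: 79.0%",
    "crashed",
]
mean_acc, std_acc = summarize(logs)
print(f"accuracy: {mean_acc:.2f}% +- {std_acc:.2f}%")
```

A log without any line matching the pattern contributes nothing, so if no log in a folder matches, there is nothing to aggregate.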

Citation

If you use this code in your research, please cite our paper:

@article{zhou2021stylematch,
    title={Semi-Supervised Domain Generalization with Stochastic StyleMatch},
    author={Zhou, Kaiyang and Loy, Chen Change and Liu, Ziwei},
    journal={arXiv preprint arXiv:2106.00592},
    year={2021}
}
Comments
  • Office_home_Dataset

    Hi, it's me again. Sorry to disturb you. I recently ran your code on the office_home dataset and ran into problems. I downloaded the office_home dataset through your link, but it could not be used and I could not fix it. First, the names of the files do not match, and second, there is no file named "train" in office_home_dg/art/. Waiting for your help!

    opened by knight-fzq 12
  • Value of Regex and log.txt not matching.

    Looking through the code, I found this

    metric1 = {
        'name': 'accuracy',
        'regex': re.compile(r'\* accuracy: ([\.\deE+-]+)%')
    }

    metric2 = {
        'name': 'error',
        'regex': re.compile(r'\* error: ([\.\deE+-]+)%')
    }
    

    Turns out that inside parse_function, the regex isn't matched with what is in the log.txt. This makes it not populate the output variable of the code shown below.

    for metric in metrics:
        match = metric['regex'].search(line)
        if match and good_to_go:
            print("good_to_go")
            if 'file' not in output:
                output['file'] = fpath
            num = float(match.group(1))
            name = metric['name']
            output[name] = num
    

    What am I doing wrong?

    opened by brainie 11
  • Questions about the validation set

    Hi, did you use the validation set in the model training? In the PACS and Office-Home datasets, the validation set is much larger than the training set under the semi-supervised DG setting, so I think it is unreasonable to use such large validation sets to select hyperparameters or the final models, because this is inconsistent with reality. I see that you keep validation sets in the data splits. What are they used for?

    opened by GA-17a 5
  • Can not reproduce the number reported in the paper.

    Hi, I ran fixmatch about five times for PACS with 210 labels. When the target domain is A, I only got 77.07% accuracy, which is 1% lower than the result in the paper.

    opened by qianlanwyd 3
  • AssertionError: Nothing found in output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting

    Good morning. I was trying to explore semi-supervised domain generalization. I followed the README file to the last line, but it ended with this output:

    Parsing files in output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting
    Traceback (most recent call last):
      File "parse_test_res.py", line 188, in <module>
        main(args, end_signal)
      File "parse_test_res.py", line 147, in main
        end_signal=end_signal
      File "parse_test_res.py", line 97, in parse_function
        assert len(outputs) > 0, f'Nothing found in {directory}'
    AssertionError: Nothing found in output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting

    What can I do?

    opened by brainie 2
  • Dataset number does not match the original paper

    Hi! I ran your code and the total number of PACS images is 9990 (210+6926+806+2048), but the paper (https://arxiv.org/pdf/1710.03077) says PACS has 9991 images.

    ***** Dataset statistics *****
    Dataset: SSDGPACS
    Source domains: ['cartoon', 'photo', 'sketch']
    Target domains: ['art_painting']
    #classes: 7
    #train_x: 210
    #train_u: 6,926
    #val: 806
    #test: 2,048

    opened by kevinbro96 2
  • Key error: "img0"

    Dear author, when I follow your directions and run the command bash run_ssdg.sh ssdg_pacs 210 v1, I get the error Key error: "img0". I ran print(batch_x) and found that the dict has no key called "img0", only "img" and "img2". A screenshot was attached: Screenshot from 2021-08-06 13-51-38

    opened by knight-fzq 2
  • Not able to reproduce numbers for PACS dataset.

    I tried running your code and was not able to reproduce the numbers for the PACS dataset. The results I obtained for 5 samples per class and the numbers reported in the paper are as follows.

    Domain         Obtained   Reported
    Art_Painting   77.18      78.54
    Cartoon        73.74      74.44
    Photo          89.35      89.25
    Sketch         76.5       79.06
    Avg            79.1925    80.32

    I can understand a difference of about 1 percentage point in the first three domains, but

    1. The number reported for Sketch is almost 3 percentage points higher than what I obtained.
    2. The number quoted for 5 samples per class is greater than the number reported in the paper for 10 samples per class. I am a bit confused about this.

    Can you please clarify this?

    opened by Griffintaur 5
  • The stop-gradient operation

    Hi, How about the experimental results if there is no stop-gradient operation on the weakly augmented branch? Since there is no stop-gradient operation in the original FixMatch paper, what is the role of stop-gradient in SSDG?

    opened by GA-17a 0
Owner
Kaiyang
Researcher in computer vision and machine learning :)