Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP

Daniil Pakhomov

Last update: Dec 19, 2022

Related tags

Deep Learning segmentation_in_style

Overview

Abstract: We introduce a method that automatically segments images into semantically meaningful regions without human supervision. The derived regions are consistent across different images and coincide with human-defined semantic classes on some datasets. In cases where semantic regions might be hard for humans to define and label consistently, our method is still able to find meaningful and consistent semantic classes. In our work, we use a pretrained StyleGAN2 generative model: clustering in the feature space of the generative model allows us to discover semantic classes. Once the classes are discovered, a synthetic dataset of generated images and corresponding segmentation masks can be created. A segmentation model is then trained on the synthetic dataset and is able to generalize to real images. Additionally, by using CLIP we can use prompts defined in natural language to discover some desired semantic classes. We test our method on publicly available datasets and show state-of-the-art results.

This repository contains the official PyTorch implementation of the following paper:

Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP
Daniil Pakhomov, Sanchit Hira, Narayani Wagle, Kemar E. Green, Nassir Navab
https://arxiv.org/abs/2107.12518
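
The clustering step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the repository's code: random arrays stand in for StyleGAN2 feature maps, the function names `kmeans`, `assign`, and `segment_feature_maps` are hypothetical, and the farthest-first initialization is a choice made here only to keep the example deterministic.

```python
import numpy as np


def kmeans(points, k, n_iters=10):
    """Lloyd's k-means on an (N, C) array; returns (centroids, labels)."""
    # Farthest-first initialization (an assumption of this sketch): start from
    # the first point, then repeatedly add the point farthest from all
    # centroids chosen so far.
    centroids = [points[0]]
    for _ in range(1, k):
        dist_to_nearest = np.min(
            [((points - c) ** 2).sum(axis=1) for c in centroids], axis=0
        )
        centroids.append(points[dist_to_nearest.argmax()])
    centroids = np.stack(centroids).astype(float)

    for _ in range(n_iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    return centroids, assign(points, centroids)


def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean) for every point."""
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def segment_feature_maps(features, k):
    """Cluster per-pixel feature vectors of shape (N, C, H, W) into k classes.

    Returns the shared centroids (k, C) and one label mask per image (N, H, W).
    """
    n, c, h, w = features.shape
    # Treat every spatial location of every image as one C-dimensional point.
    pixels = features.transpose(0, 2, 3, 1).reshape(-1, c)
    centroids, labels = kmeans(pixels, k)
    return centroids, labels.reshape(n, h, w)


# Stand-in for generator features (assumption): 4 "images", 8 channels,
# 16x16 resolution, where the right half has a clearly different feature mean.
rng = np.random.default_rng(0)
features = rng.normal(0.0, 0.1, size=(4, 8, 16, 16))
features[:, :, :, 8:] += 10.0

centroids, masks = segment_feature_maps(features, k=2)
print(masks.shape)  # (4, 16, 16)
```

Because the centroids are fit on pixels pooled from all images, a given label refers to the same cluster in every mask, which is what makes the regions comparable across images. Under the same assumptions, new feature maps could be labeled by calling `assign` against the stored centroids, giving image and mask pairs of the kind the abstract describes for the synthetic dataset.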

You might also like...

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

32 Sep 21, 2022

Only a Matter of Style: Age Transformation Using a Style-Based Regression Model

Only a Matter of Style: Age Transformation Using a Style-Based Regression Model The task of age transformation illustrates the change of an individual

444 Dec 30, 2022

Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

129 Dec 11, 2022

PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in clustering (CVPR2021)

Comments

Cluster Classification

Thanks for the good work.

I read the paper.

In the paper, there is a paragraph on Cluster Classification.

But I can't find it in the repo.

Where can I find it?

opened by jjeamin 2

How to train the stylegan model?

Hi, this is great work, but I have a question. For example, in "kmeans_clustering_search_human.ipynb" you use the trained model "output_path = 'human_ada.pth'". When you run K-means, why does it directly cluster the two eyes or the two ears together? When you trained the StyleGAN model, did you need class labels?

opened by zackzhao 2

Cluster classification code

Can you please share the code to classify clusters (same issue as #7)? Also, do you have any intuitive justification for why the technique works despite removing the downsampling layers?

Thank you

opened by aliasvishnu 1

oct_stylegandada_train_256.ipnyb Execution Issue

I get this error in the second-to-last cell. The specified file cannot be found in the given folder. I tried to execute the notebook in Google Colab.

output path /content/stylegan2-ada/results/00000-oct-mirror-auto1-noaug-resumecustom/network-snapshot-000915.pkl

FileNotFoundError                         Traceback (most recent call last)
in ()
     40 tflib.init_tf()
     41
---> 42 with open(args.path, "rb") as f:
     43     generator, discriminator, g_ema = pickle.load(f)
     44 Gs = g_ema

FileNotFoundError: [Errno 2] No such file or directory: '/content/stylegan2-ada/results/00000-oct-mirror-auto1-noaug-resumecustom/network-snapshot-000915.pkl'

opened by gusreic 2
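
The traceback above only says that no file exists at the path passed to `open`. One way to check what a training run actually wrote is sketched below. The helper `find_latest_snapshot` is hypothetical, and the directory layout (run folders that contain `network-snapshot-*.pkl` files) is assumed from the path in the error message; the demo uses a throwaway directory rather than a real results folder.

```python
import tempfile
from pathlib import Path


def find_latest_snapshot(results_dir):
    """Return the last network-snapshot-*.pkl under results_dir, or None.

    Assumes zero-padded run and snapshot names, so that sorting the paths
    lexicographically orders them by run and then by snapshot number.
    """
    root = Path(results_dir)
    if not root.is_dir():
        return None
    candidates = sorted(root.glob("*/network-snapshot-*.pkl"))
    return candidates[-1] if candidates else None


# Demo on a temporary directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "00000-example-run"
    run.mkdir()
    for step in ("000000", "000040"):
        (run / f"network-snapshot-{step}.pkl").touch()

    requested = run / "network-snapshot-000915.pkl"
    if not requested.is_file():
        latest = find_latest_snapshot(tmp)
        print("missing:", requested.name)
        print("latest available:", latest.name)
```

If the helper returns a different snapshot name or `None`, the path given to the conversion cell would need to point at a file that exists, or the training cell may not have produced the expected snapshot.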

You might also like...

Official PyTorch implementation of Retrieve in Style: Unsupervised Facial Feature Transfer and Retrieval.

Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals.

(CVPR2021) DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

Official implementation of "DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation"

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

Owner

Daniil Pakhomov

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

Transfer style api - An API to use with Transfer Style App, where you can use two images and transfer the style

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

Streamlit Tutorial (ex: stock price dashboard, cartoon-stylegan, vqgan-clip, stylemixing, styleclip, sefa)

CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

An architecture that makes any doodle realistic, in any specified style, using VQGAN, CLIP and some basic embedding arithmetics.

Fast Neural Style for Image Style Transform by Pytorch

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

Implementation of StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation in PyTorch

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018