Toward a Visual Concept Vocabulary for GAN Latent Space, ICCV 2021

Sarah Schwettmann

Last update: Dec 23, 2022


Overview

Toward a Visual Concept Vocabulary for GAN Latent Space
Code and data from the ICCV 2021 paper

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba
Paper | Website | arXiv

This repository contains code for finding layer-selective directions, distilling them, and loading the vocabulary of visual concepts in BigGAN used in the original paper.

Notice: This repository is under active development! Expect instability until at least October 25th, 2021.

Installation

The code has been tested with Python 3.8 on macOS and Ubuntu 20.04. It may work in other environments, but we make no guarantees.

To run the code yourself, start by cloning the repository:

git clone https://github.com/schwettmann/visual-vocab
cd visual-vocab

(Optional) We recommend creating a conda environment or virtual environment rather than installing the dependencies globally. For example, to create and activate a new virtual environment, run:

python3 -m venv env
source env/bin/activate

Finally, install the Python dependencies using pip:

pip3 install -r requirements.txt
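Since the code is only tested on Python 3.8, it can help to confirm which interpreter and environment you are actually running in before debugging anything else. The following is a minimal sketch, not part of the repository: the helper names (matches_tested_version, in_isolated_environment, describe_environment) are hypothetical, and the conda check assumes that an activated conda environment sets the CONDA_PREFIX variable.

```python
import os
import sys

# From the README: the code was tested on Python 3.8.
TESTED_VERSION = (3, 8)


def matches_tested_version(version_info=sys.version_info) -> bool:
    """Return True if major.minor equals the version the README was tested on."""
    return tuple(version_info[:2]) == TESTED_VERSION


def in_isolated_environment() -> bool:
    """Return True inside a venv-style environment or an activated conda env.

    Assumption: conda exports CONDA_PREFIX when an environment is active.
    """
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_conda


def describe_environment() -> str:
    """Summarize the interpreter version and whether it looks isolated."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    tested = "tested" if matches_tested_version() else "untested"
    isolation = "isolated" if in_isolated_environment() else "global"
    return f"Python {version} ({tested} version, {isolation} environment)"


print(describe_environment())
```

An "untested" or "global" result does not mean the code will fail; it only flags that the setup differs from the one described above.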

Usage

Notice: This section is under construction and will be updated as functionality gets added.

To download any of the annotated directions from the paper, use datasets.load from the datasets submodule. It downloads and parses the annotated directions. Example usage:

from visualvocab import datasets

# Download layer-selective directions and annotations used for distilling single-word directions:
dataset = datasets.load('lsd_all')

# Download distilled directions for all BigGAN-Places365 categories:
dataset = datasets.load('distilled_all')

# Download distilled directions for a specific BigGAN-Places365 category:
dataset = datasets.load('distilled_cottage')

See the datasets module for the full list of available annotated directions.
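The examples above suggest that per-category keys follow a distilled_<category> pattern. The sketch below builds such keys and loads several categories in one call. It is illustrative only: distilled_key and load_categories are hypothetical helpers that do not exist in the repository, and the key pattern is inferred from 'distilled_all' and 'distilled_cottage' rather than documented, so check the datasets module before relying on it for other categories.

```python
from typing import Any, Callable, Dict, Iterable


def distilled_key(category: str) -> str:
    """Build a dataset key of the form 'distilled_<category>'.

    Assumption: the pattern is inferred from the README examples
    ('distilled_all', 'distilled_cottage') and may not hold for every name.
    """
    name = category.strip()
    if not name:
        raise ValueError("category must be a non-empty string")
    return f"distilled_{name}"


def load_categories(
    categories: Iterable[str],
    loader: Callable[[str], Any],
) -> Dict[str, Any]:
    """Call loader once per category and collect the results by category.

    The loader is passed in (for example datasets.load) so this helper
    makes no assumption about what the loaded object looks like.
    """
    return {category: loader(distilled_key(category)) for category in categories}
```

With the package importable, this could be called as load_categories(['cottage'], datasets.load), which would be equivalent to calling datasets.load('distilled_cottage') and storing the result under the key 'cottage'.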

Citation

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba. Toward a Visual Concept Vocabulary for GAN Latent Space, Proceedings of the International Conference on Computer Vision (ICCV), 2021.

Bibtex

@InProceedings{Schwettmann_2021_ICCV,
    author    = {Schwettmann, Sarah and Hernandez, Evan and Bau, David and Klein, Samuel and Andreas, Jacob and Torralba, Antonio},
    title     = {Toward a Visual Concept Vocabulary for GAN Latent Space},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {6804-6812}
}
