PyTorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

T2I_CL

This is the official PyTorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning.

Requirements

  • Linux

  • Python ≥ 3.6

  • PyTorch ≥ 1.4.0
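The requirements above can be checked before training with a short script. This is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it ignores local suffixes such as `+cu101`).

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.4.0+cu101' into a tuple like (1, 4, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '1.4' compares equal to '1.4.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Collect human-readable problems with the current environment."""
    problems = []
    if not sys.platform.startswith("linux"):
        problems.append("platform is %s, but the README lists Linux" % sys.platform)
    if sys.version_info < (3, 6):
        problems.append("Python %d.%d is older than 3.6" % sys.version_info[:2])
    try:
        import torch  # imported lazily so the script still runs without it
    except ImportError:
        problems.append("PyTorch cannot be imported from " + sys.executable)
    else:
        if not meets_minimum(torch.__version__, "1.4.0"):
            problems.append("PyTorch %s is older than 1.4.0" % torch.__version__)
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks compatible"]:
        print(line)
```

Running it with the same interpreter that will run the training scripts matters, since a different interpreter may see a different set of installed packages.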

Prepare Data

Download the preprocessed datasets from the AttnGAN repository.

Alternatively, they can be downloaded from the DM-GAN repository.

Training

  • Pretrain DAMSM+CL:

    • For bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
    • For coco dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 0
  • Train AttnGAN+CL:

    • For bird dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 0
    • For coco dataset: python main.py --cfg cfg/coco_attn2.yml --gpu 0
  • Train DM-GAN+CL:

    • For bird dataset: python main.py --cfg cfg/bird_DMGAN.yml --gpu 0
    • For coco dataset: python main.py --cfg cfg/coco_DMGAN.yml --gpu 0
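The training commands above follow a regular pattern, so they can be generated rather than typed. The sketch below only assembles the argument lists; the script names and config paths are copied from the list above, while the function names and the short model keys (`damsm`, `attngan`, `dmgan`) are hypothetical. It assumes that DAMSM+CL pretraining runs before GAN training, as the order of the list suggests, and that each command is run from the matching code directory.

```python
import shlex

# Script and config path for each (model, dataset) pair, copied from the README.
TRAIN_CONFIGS = {
    ("damsm", "bird"): ("pretrain_DAMSM.py", "cfg/DAMSM/bird.yml"),
    ("damsm", "coco"): ("pretrain_DAMSM.py", "cfg/DAMSM/coco.yml"),
    ("attngan", "bird"): ("main.py", "cfg/bird_attn2.yml"),
    ("attngan", "coco"): ("main.py", "cfg/coco_attn2.yml"),
    ("dmgan", "bird"): ("main.py", "cfg/bird_DMGAN.yml"),
    ("dmgan", "coco"): ("main.py", "cfg/coco_DMGAN.yml"),
}


def training_command(model, dataset, gpu=0):
    """Return the argument list for one training step."""
    script, cfg = TRAIN_CONFIGS[(model, dataset)]
    return ["python", script, "--cfg", cfg, "--gpu", str(gpu)]


def full_pipeline(model, dataset, gpu=0):
    """Return the pretraining command followed by the GAN training command."""
    return [
        training_command("damsm", dataset, gpu),
        training_command(model, dataset, gpu),
    ]


if __name__ == "__main__":
    for cmd in full_pipeline("dmgan", "bird"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` once the working directory is set to the right code folder.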

Pretrained Models

Evaluation

  • Sample images and compute the R-precision:

    • python main.py --cfg cfg/eval_bird.yml --gpu 0
    • python main.py --cfg cfg/eval_coco.yml --gpu 0
  • Inception score:

    • python inception_score_bird.py --image_folder fake_images_bird
    • python inception_score_coco.py fake_images_coco
  • FID:

    • python fid_score.py --gpu 0 --batch-size 50 --path1 real_images_bird --path2 fake_images_bird
    • python fid_score.py --gpu 0 --batch-size 50 --path1 real_images_coco --path2 fake_images_coco
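To make the R-precision metric concrete, here is a minimal pure-Python sketch of how it is commonly described for AttnGAN-style evaluation: each generated image is compared by cosine similarity against its own caption plus a set of randomly chosen mismatched captions, and the score is the fraction of images whose own caption ranks first. The function names, the default of 100 candidates, and the use of plain lists as embeddings are assumptions for illustration; the repository's own evaluation code may differ.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def r_precision(image_embs, caption_embs, num_candidates=100, seed=0):
    """Fraction of images whose matching caption ranks first among the candidates.

    image_embs[i] is assumed to have been generated from caption_embs[i].
    """
    n = len(image_embs)
    if n == 0 or n != len(caption_embs):
        raise ValueError("need the same non-zero number of images and captions")
    rng = random.Random(seed)
    num_negatives = min(num_candidates, n) - 1
    hits = 0
    for i, img in enumerate(image_embs):
        others = [j for j in range(n) if j != i]
        # The true caption goes first, so it wins exact ties in max().
        candidates = [i] + rng.sample(others, num_negatives)
        best = max(candidates, key=lambda j: cosine(img, caption_embs[j]))
        if best == i:
            hits += 1
    return hits / n


if __name__ == "__main__":
    images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(r_precision(images, images))
```

In practice the embeddings would come from the pretrained image and text encoders rather than hand-written vectors.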

Citation

If you find this work useful in your research, please consider citing:

@article{ye2021improving,
  title={Improving Text-to-Image Synthesis Using Contrastive Learning},
  author={Ye, Hui and Yang, Xiulong and Takac, Martin and Sunderraman, Rajshekhar and Ji, Shihao},
  journal={arXiv preprint arXiv:2107.02423},
  year={2021}
}

Acknowledgements

Our work is based on the following works:

Comments
  • Throwing import error

    I'm very sorry for posting such a simple bug, but for the life of me I can't figure out why it's throwing this error. I have tried reinstalling numpy and have all the required packages. Numpy even works when running other Python files, but not here for some reason.

    Error:

    Traceback (most recent call last):
      File "pretrain_DAMSM.py", line 3, in <module>
        from miscc.utils import mkdir_p
      File "/books-nn/T2I_CL/DM-GAN+CL/code/miscc/utils.py", line 4, in <module>
        from torch.nn import init
    ModuleNotFoundError: No module named 'torch'
    
    opened by Stelath 5
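For import errors like the one reported above, a quick first step is to confirm which interpreter is running the script and which modules it can see. This is a generic diagnostic sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the module list is just an example.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The interpreter path shows which environment is actually in use.
    print("interpreter:", sys.executable)
    missing = missing_modules(["torch", "numpy"])
    if missing:
        print("not installed for this interpreter:", ", ".join(missing))
    else:
        print("all listed modules are importable")
```

If the interpreter path is not the environment where the packages were installed, running the training script with that environment's `python` is usually the fix.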
  • assert dataset Assertion Error

    Hi. Is there something wrong with my data import? I changed the path in the config file, but there seems to be a problem reading the text file.

    The original AttnGAN instructions say the data should be placed under data/birds. My dataset directory looks like the listing below. Is this correct?

    # Using cd and ls to explore the dataset:

    !cd /content/T2I_CL/AttnGAN+CL/data/birds
    !ls

    attributes image_class_labels.txt parts bounding_boxes.txt images README classes.txt images.txt train_test_split.txt

    Is it correct?

    opened by sumorday 4
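The directory listing in the question above shows only the raw dataset files. The log output quoted in other issues on this page shows the loader reading `train/filenames.pickle`, `test/filenames.pickle`, and `captions.pickle` from the data directory, so a missing metadata file is one plausible cause of a dataset assertion. The sketch below checks for those three files; that they are the complete set required is an assumption, and the helper name `missing_metadata` is hypothetical.

```python
import os

# Metadata files that the quoted log output shows being loaded from the data directory.
EXPECTED_FILES = [
    os.path.join("train", "filenames.pickle"),
    os.path.join("test", "filenames.pickle"),
    "captions.pickle",
]


def missing_metadata(data_dir):
    """Return the expected metadata files that are absent from data_dir."""
    return [
        rel
        for rel in EXPECTED_FILES
        if not os.path.isfile(os.path.join(data_dir, rel))
    ]


if __name__ == "__main__":
    data_dir = "data/birds"  # adjust to the DATA_DIR value in the config file
    missing = missing_metadata(data_dir)
    if missing:
        print("missing from %s: %s" % (data_dir, ", ".join(missing)))
    else:
        print("all expected metadata files are present")
```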
  • Weight initialization problem

    Hi, I noticed that the weight initialization code differs from the one in AttnGAN. Can you tell me the reason for this? https://github.com/huiyegit/T2I_CL/blob/6f749b869ac76bc6423bc319adc8f6c7c386f17b/AttnGAN%2BCL/code/miscc/utils.py#L290-L295

    opened by YIRuriZhongtian 4
  • CUDA not executing during runtime

    For some reason CUDA fails to execute at runtime. I have the correct version of PyTorch and an NVIDIA driver installed on the system.

    Thrown as a result of running the command: python pretrain_DAMSM.py --cfg cfg/DAMSM/book.yml --gpu 0

             'B_DCGAN': False,
             'CONDITION_DIM': 100,
             'DF_DIM': 64,
             'GF_DIM': 128,
             'R_NUM': 2,
             'Z_DIM': 100},
     'GPU_ID': 0,
     'RNN_TYPE': 'LSTM',
     'TEXT': {'CAPTIONS_PER_IMAGE': 1, 'EMBEDDING_DIM': 256, 'WORDS_NUM': 18},
     'TRAIN': {'BATCH_SIZE': 48,
               'B_NET_D': True,
               'DISCRIMINATOR_LR': 0.0002,
               'ENCODER_LR': 0.002,
               'FLAG': True,
               'GENERATOR_LR': 0.0002,
               'MAX_EPOCH': 600,
               'NET_E': '',
               'NET_G': '',
               'RNN_GRAD_CLIP': 0.25,
               'SMOOTH': {'GAMMA1': 4.0,
                          'GAMMA2': 5.0,
                          'GAMMA3': 10.0,
                          'LAMBDA': 1.0},
               'SNAPSHOT_INTERVAL': 50},
     'TREE': {'BASE_SIZE': 299, 'BRANCH_NUM': 1},
     'WORKERS': 1}
    /opt/conda/envs/dm_gan/lib/python3.6/site-packages/torchvision/transforms/transforms.py:220: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
      "please use transforms.Resize instead.")
    Load filenames from: ../data/books/train/filenames.pickle (4625)                                                  
    Load filenames from: ../data/books/test/filenames.pickle (1622)                                                   
    Load from:  ../data/books/captions.pickle
    31146 1
    Load filenames from: ../data/books/train/filenames.pickle (4625)                                                  
    Load filenames from: ../data/books/test/filenames.pickle (1622)                                                   
    Load from:  ../data/books/captions.pickle
    /opt/conda/envs/dm_gan/lib/python3.6/site-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
      "num_layers={}".format(dropout, num_layers))
    Load pretrained model from  https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth                  
    Traceback (most recent call last):
      File "pretrain_DAMSM.py", line 350, in <module>
        dataset.ixtoword, image_dir, criterion)
      File "pretrain_DAMSM.py", line 87, in train
        words_features, sent_code = cnn_model(imgs[-1])
      File "/opt/conda/envs/dm_gan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__     
        result = self.forward(*input, **kwargs)
      File "/books-nn/T2I_CL/DM-GAN+CL/code/model.py", line 208, in forward                                           
        x = self.Conv2d_1a_3x3(x)
      File "/opt/conda/envs/dm_gan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__     
        result = self.forward(*input, **kwargs)
      File "/opt/conda/envs/dm_gan/lib/python3.6/site-packages/torchvision/models/inception.py", line 433, in forward 
        x = self.bn(x)
      File "/opt/conda/envs/dm_gan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__     
        result = self.forward(*input, **kwargs)
      File "/opt/conda/envs/dm_gan/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 107, in forward   
        exponential_average_factor, self.eps)
      File "/opt/conda/envs/dm_gan/lib/python3.6/site-packages/torch/nn/functional.py", line 1670, in batch_norm      
        training, momentum, eps, torch.backends.cudnn.enabled                                                         
    RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
    
    opened by Stelath 2
  • pretrained discriminator request

    Hello @huiyegit. First of all, thank you for providing this code! I am trying to use transfer learning (as in http://arxiv.org/pdf/1805.01677v2) to see if I can optimize the model for my test domain. As far as I understand, I would also need a pretrained discriminator to perform this task. Unfortunately, due to hardware limitations, I am unable to perform the training myself with reasonable effort. Would it be possible for you to provide the discriminator models that go with the DM-GAN-CL COCO generator (200 epochs) and the AttnGAN-CL COCO generator (80 epochs)?

    opened by TimStefany 2
  • file not found

    Hi there, I cannot find the file inception_score_coco.py when I want to evaluate the inception score of the generated images. Is something missing?

    opened by MaxyLee 2
  • How to calculate R-precision?

    @huiyegit

    Is there another file I should use to calculate R-precision?

    The README says: Sampling and get the R-precision: python main.py --cfg cfg/eval_bird.yml --gpu 0

    but I don't see any function inside main.py that calculates R-precision.

    opened by priyankaupadhyay090 1
  • RuntimeError: Input, output and indices must be on the current device

    Hey, I am trying to run AttnGAN+CL main.py for sampling (python main.py --cfg cfg/eval_bird.yml --gpu 0) and getting an error.

    python main.py
    Using config:
    {'B_VALIDATION': True, 'CONFIG_NAME': 'attn2', 'CUDA': False, 'DATASET_NAME': 'birds', 'DATA_DIR': 'data/birds', 'GAN': {'B_ATTENTION': True, 'B_DCGAN': False, 'CONDITION_DIM': 100, 'DF_DIM': 64, 'GF_DIM': 32, 'R_NUM': 2, 'Z_DIM': 100}, 'GPU_ID': 0, 'RNN_TYPE': 'LSTM', 'TEXT': {'CAPTIONS_PER_IMAGE': 10, 'EMBEDDING_DIM': 256, 'WORDS_NUM': 25}, 'TRAIN': {'BATCH_SIZE': 10, 'B_NET_D': False, 'DISCRIMINATOR_LR': 0.0002, 'ENCODER_LR': 0.0002, 'FLAG': False, 'GENERATOR_LR': 0.0002, 'MAX_EPOCH': 600, 'NET_E': 'DAMSMencoders/bird/text_encoder200.pth', 'NET_G': 'models/netG_epoch_600.pth', 'RNN_GRAD_CLIP': 0.25, 'SMOOTH': {'GAMMA1': 5.0, 'GAMMA2': 5.0, 'GAMMA3': 10.0, 'LAMBDA': 1.0}, 'SNAPSHOT_INTERVAL': 2000}, 'TREE': {'BASE_SIZE': 64, 'BRANCH_NUM': 3}, 'WORKERS': 1}
    seed now is : 100
    Total filenames: 11788 001.Black_footed_Albatross/Black_Footed_Albatross_0046_18.jpg
    load images pickles
    Load filenames from: data/birds/train/filenames.pickle (8855)
    loading train images
    load images pickles
    Load filenames from: data/birds/test/filenames.pickle (2933)
    loading test images
    Load from: data/birds/captions.pickle
    captions file loaded for test
    5450 10
    generating images for the whole valid dataset
    self encoder
    /opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py:61: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
      "num_layers={}".format(dropout, num_layers))
    calling text encoder as RNN encoder
    Load text encoder from: DAMSMencoders/bird/text_encoder200.pth
    /opt/conda/lib/python3.6/site-packages/torchvision/models/inception.py:77: FutureWarning: The default weight initialization of inception_v3 will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.
      ' due to scipy/scipy#11299), please set init_weights=True.', FutureWarning)
    Load pretrained model from https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth
    calling image encoder
    Load image encoder from: DAMSMencoders/bird/image_encoder200.pth
    /netscratch/pupadhyay/project/T2I_CL/AttnGAN+CL/trainer.py:465: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
      noise = Variable(torch.FloatTensor(batch_size, nz), volatile=True)
    Load G from: models/netG_epoch_600.pth
    cnt: 10
    word_emb and sent_emb starts
    calling RNN encoder forward loop
    embedding value
    Traceback (most recent call last):
      File "main.py", line 193, in <module>
        algo.sampling(split_dir) # sampling() defined in trainer.py file
      File "/netscratch/pupadhyay/project/T2I_CL/AttnGAN+CL/trainer.py", line 504, in sampling
        words_embs, sent_emb = text_encoder(captions, cap_lens, hidden)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/netscratch/pupadhyay/project/T2I_CL/AttnGAN+CL/model.py", line 139, in forward
        emb = self.encoder(captions)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 126, in forward
        self.norm_type, self.scale_grad_by_freq, self.sparse)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/functional.py", line 1855, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
    RuntimeError: Input, output and indices must be on the current device.

    The error comes from model.py, line 139, in the forward() function of RNN_ENCODER: emb = self.encoder(captions)

    I have set WORKERS = 1 and GPU_ID = 0 in cfg/bird.yml, so that if multi-GPU use was causing the error, it should be solved by using only one GPU.

    This is the srun container command I used: srun --container-image=/netscratch/enroot/dlcc_pytorch_20.10.sqsh --mem=64000M --cpus-per-task=16 --gres=gpu:1 --pty /bin/bash

    Has anyone faced this issue?

    opened by priyankaupadhyay090 1
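One detail worth checking in the log above is that the printed config contains `'CUDA': False` together with `'GPU_ID': 0`. If the code moves captions and models to the GPU based on the CUDA flag (an assumption here, not confirmed by this page), that combination could leave some tensors on the CPU and others on the GPU, which matches the reported error. The sketch below parses a printed config and flags that combination; the function names are hypothetical.

```python
import ast


def parse_printed_config(text):
    """Extract the first balanced {...} block from log text and parse it as a dict."""
    start = text.index("{")
    depth = 0
    for pos in range(start, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                # literal_eval only accepts Python literals, so this is safe on log text.
                return ast.literal_eval(text[start:pos + 1])
    raise ValueError("unbalanced braces in config text")


def device_warnings(cfg):
    """Return notes about device settings that look inconsistent."""
    notes = []
    if cfg.get("CUDA") is False and cfg.get("GPU_ID", -1) >= 0:
        notes.append(
            "CUDA is False although GPU_ID is %d; some tensors may stay on the CPU"
            % cfg["GPU_ID"]
        )
    return notes


if __name__ == "__main__":
    log = "Using config: {'CUDA': False, 'GPU_ID': 0, 'WORKERS': 1} seed now is : 100"
    for note in device_warnings(parse_printed_config(log)):
        print(note)
```

The brace matching assumes that no string value in the config contains a brace, which holds for the configs shown on this page.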
  • Cog version

    "😵 Uh oh! This model can't be run on Replicate because it was built with a version of Cog that is no longer supported." https://replicate.com/huiyegit/t2i_cl

    opened by Jakeukalane 0
  • Issue of FID score

    @huiyegit @shihaoji I calculated the FID score, but I am getting a very high value: 35.190974412567414

    Am I doing anything wrong? Please let me know if there is any hyperparameter I need to change to calculate the FID.

    opened by priyankaupadhyay090 5
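For reference when interpreting FID values such as the one above, the standard definition of the Fréchet Inception Distance compares Gaussians fitted to Inception features of the real and generated images. The formula below is the general definition; that fid_score.py in this repository follows it exactly is an assumption. Here $\mu_r, \Sigma_r$ are the mean and covariance of the real-image features and $\mu_g, \Sigma_g$ those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate closer feature distributions. The result depends on which real image set and how many samples are used, so scores are only comparable when computed under the same protocol.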
  • RuntimeError: Input, output and indices must be on the current device

    @huiyegit and @shihaoji, thank you for the nice work. I am using the AttnGAN+CL code and trying to generate samples with the sampling command from the README: python main.py --cfg cfg/eval_bird.yml --gpu 0

    Before running main.py, I set WORKERS = 1 and GPU_ID = 0.

    I got an error:

    python main.py --cfg cfg/eval_bird.yml --gpu 0
    Using config:
    {'B_VALIDATION': True, 'CONFIG_NAME': 'attn2', 'CUDA': False, 'DATASET_NAME': 'birds', 'DATA_DIR': 'data/birds', 'GAN': {'B_ATTENTION': True, 'B_DCGAN': False, 'CONDITION_DIM': 100, 'DF_DIM': 64, 'GF_DIM': 32, 'R_NUM': 2, 'Z_DIM': 100}, 'GPU_ID': 0, 'RNN_TYPE': 'LSTM', 'TEXT': {'CAPTIONS_PER_IMAGE': 10, 'EMBEDDING_DIM': 256, 'WORDS_NUM': 25}, 'TRAIN': {'BATCH_SIZE': 10, 'B_NET_D': False, 'DISCRIMINATOR_LR': 0.0002, 'ENCODER_LR': 0.0002, 'FLAG': False, 'GENERATOR_LR': 0.0002, 'MAX_EPOCH': 600, 'NET_E': 'DAMSMencoders/bird/text_encoder200.pth', 'NET_G': 'models/netG_epoch_600.pth', 'RNN_GRAD_CLIP': 0.25, 'SMOOTH': {'GAMMA1': 5.0, 'GAMMA2': 5.0, 'GAMMA3': 10.0, 'LAMBDA': 1.0}, 'SNAPSHOT_INTERVAL': 2000}, 'TREE': {'BASE_SIZE': 64, 'BRANCH_NUM': 3}, 'WORKERS': 1}
    seed now is : 100
    Total filenames: 11788 001.Black_footed_Albatross/Black_Footed_Albatross_0046_18.jpg
    load images pickles
    Load filenames from: data/birds/train/filenames.pickle (8855)
    loading train images
    load images pickles
    Load filenames from: data/birds/test/filenames.pickle (2933)
    loading test images
    Load from: data/birds/captions.pickle
    captions file loaded for test
    5450 10
    generating images for the whole valid dataset
    self encoder
    /opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py:61: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
      "num_layers={}".format(dropout, num_layers))
    calling text encoder as RNN encoder
    Load text encoder from: DAMSMencoders/bird/text_encoder200.pth
    /opt/conda/lib/python3.6/site-packages/torchvision/models/inception.py:77: FutureWarning: The default weight initialization of inception_v3 will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.
      ' due to scipy/scipy#11299), please set init_weights=True.', FutureWarning)
    Load pretrained model from https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth
    calling image encoder
    Load image encoder from: DAMSMencoders/bird/image_encoder200.pth
    /netscratch/pupadhyay/project/T2I_CL/AttnGAN+CL/trainer.py:465: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
      noise = Variable(torch.FloatTensor(batch_size, nz), volatile=True)
    Load G from: models/netG_epoch_600.pth
    cnt: 10
    word_emb and sent_emb starts
    calling RNN encoder forward loop
    embedding value
    Traceback (most recent call last):
      File "main.py", line 193, in <module>
        algo.sampling(split_dir) # sampling() defined in trainer.py file
      File "/netscratch/pupadhyay/project/T2I_CL/AttnGAN+CL/trainer.py", line 504, in sampling
        words_embs, sent_emb = text_encoder(captions, cap_lens, hidden)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/netscratch/pupadhyay/project/T2I_CL/AttnGAN+CL/model.py", line 139, in forward
        emb = self.encoder(captions)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 126, in forward
        self.norm_type, self.scale_grad_by_freq, self.sparse)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/functional.py", line 1855, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
    RuntimeError: Input, output and indices must be on the current device

    I also requested only one GPU for the container to rule out multi-GPU use as the cause, but the error remains the same.

    srun --container-image=/netscratch/enroot/dlcc_pytorch_20.10.sqsh --container-workdir=pwd -p V100-16GB,V100-32GB,A100,RTX6000,RTX3090,RTXA6000 --mem=64000M --cpus-per-task=16 --gres=gpu:1 --time=08:00:00 --pty /bin/bash

    Is there any way to solve this?

    opened by priyankaupadhyay090 4
  • issue of IS score (from stu.sdu.edu.kz)

    I am sorry, my email could not be delivered to your address, so I am posting my reply here. Please try this source code to calculate the IS score: https://github.com/MinfengZhu/DM-GAN/tree/master/eval/IS For the bird dataset, there is a pre-trained Inception-V3 model: https://github.com/hanzhanggit/StackGAN-inception-model For the COCO dataset, the Inception-V3 model should be downloaded automatically.

    opened by huiyegit 3
  • Add Docker environment & web demo

    Hey @huiyegit! 👋

    This pull request makes it possible to run your model inside a Docker environment, which makes it easier for other people to run it. We're using an open source tool called Cog to make this process easier.

    This also means we can make a web page where other people can try out your model! View it here: https://replicate.ai/huiyegit/t2i_cl

    That page also has instructions on how to use the Docker image, which is on our registry at r8.im/huiyegit/t2i_cl.

    In case you're wondering who the heck I am, I'm from Replicate, where we're trying to make machine learning reproducible. We got frustrated that we couldn't run all the really interesting ML work being done. So, we're going round implementing models we like. :)

    opened by bfirsh 0