Text to image synthesis using thought vectors

Overview


Chat: https://gitter.im/text-to-image/Lobby

This is an experimental TensorFlow implementation of synthesizing images from captions using Skip Thought Vectors. The images are synthesized using the GAN-CLS algorithm from the paper Generative Adversarial Text-to-Image Synthesis. The implementation is built on top of the excellent DCGAN in Tensorflow. The model architecture is shown below; the blue bars represent the Skip Thought Vectors for the captions.

Model architecture

Image source: Generative Adversarial Text-to-Image Synthesis paper
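As a concrete sketch of the conditioning step: each caption's skip thought vector is compressed by a fully connected layer down to the text feature dimension and concatenated with the noise vector before entering the generator. A minimal NumPy sketch, assuming the default dimensions listed under Usage below (the helper name reduce_text is hypothetical):

    import numpy as np

    batch_size, z_dim, t_dim, caption_vector_length = 64, 100, 256, 1024

    def reduce_text(caption_vectors, t_dim):
        # fully connected layer + leaky ReLU compressing the skip thought
        # vectors (1024-d here) down to the t_dim-dimensional text feature
        w = 0.02 * np.random.randn(caption_vectors.shape[1], t_dim)
        h = np.dot(caption_vectors, w)
        return np.maximum(h, 0.2 * h)

    z = np.random.uniform(-1, 1, (batch_size, z_dim))              # noise
    captions = np.random.randn(batch_size, caption_vector_length)  # skip thought vectors
    z_concat = np.concatenate([z, reduce_text(captions, t_dim)], axis=1)
    print(z_concat.shape)  # (64, 356); the generator upsamples this to 64 x 64 images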

Requirements

Datasets

  • All the steps below for downloading the datasets and models can be performed automatically by running python download_datasets.py. Several gigabytes of files will be downloaded and extracted.
  • The model is currently trained on the flowers dataset. Download the images from this link and save them in Data/flowers/jpg. Also download the captions from this link. Extract the archive, copy the text_c10 folder and paste it in Data/flowers.
  • Download the pretrained models and vocabulary for skip thought vectors as per the instructions given here. Save the downloaded files in Data/skipthoughts.
  • Create the empty directories Data, Data/samples, Data/val_samples and Data/Models. They are used for sampling the generated images and saving the trained models (a short setup script follows this list).
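A few lines of Python, run from the repository root, create all the directories mentioned in this list (this is only a convenience; mkdir works just as well):

    import os

    # Directories used for the dataset, sampled images, and model checkpoints.
    for d in ['Data', 'Data/flowers/jpg', 'Data/skipthoughts',
              'Data/samples', 'Data/val_samples', 'Data/Models']:
        if not os.path.isdir(d):
            os.makedirs(d)  # works on both Python 2 and 3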

Usage

  • Data Processing: Extract the skip thought vectors for the flowers dataset using the command below (what it does is sketched right after):
python data_loader.py --data_set="flowers"
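Under the hood this loads the pretrained skip thought model and encodes every caption into a vector, roughly as below. The load_model/encode API is the one from the skip thought vectors codebase (it is also visible in the tracebacks under Comments); the output file name here is illustrative:

    import h5py
    import skipthoughts  # ships with the skip thought vectors codebase

    captions = ['the flower shown has yellow anther red pistil and bright red petals']
    model = skipthoughts.load_model()               # reads the Data/skipthoughts models
    vectors = skipthoughts.encode(model, captions)  # one vector per caption

    # Save the caption vectors for training (file name is illustrative).
    with h5py.File('Data/flowers_tv.hdf5', 'w') as f:
        f.create_dataset('vectors', data=vectors)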
  • Training

    • Basic usage: python train.py --data_set="flowers" (the GAN-CLS objective optimized during training is sketched after the option list below)
    • Options
      • z_dim: Noise Dimension. Default is 100.
      • t_dim: Text feature dimension. Default is 256.
      • batch_size: Batch Size. Default is 64.
      • image_size: Image dimension. Default is 64.
      • gf_dim: Number of convolutional filters in the first layer of the generator. Default is 64.
      • df_dim: Number of convolutional filters in the first layer of the discriminator. Default is 64.
      • gfc_dim: Dimension of the generator units for the fully connected layer. Default is 1024.
      • caption_vector_length: Length of the caption vector. Default is 1024.
      • data_dir: Data Directory. Default is Data/.
      • learning_rate: Learning Rate. Default is 0.0002.
      • beta1: Momentum term for the Adam optimizer. Default is 0.5.
      • epochs: Max number of epochs. Default is 600.
      • resume_model: Resume training from a pretrained model path.
      • data_set: Data Set to train on. Default is flowers.
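The GAN-CLS objective mentioned above, sketched in NumPy. Per the discussion under Comments, this implementation feeds the discriminator three pairs: (real image, right text), (wrong image, right text) and (generated image, right text). The discriminator scores below are random stand-ins, purely for illustration:

    import numpy as np

    def bce(p, label):
        # binary cross-entropy of sigmoid scores p against a constant 0/1 label
        p = np.clip(p, 1e-7, 1.0 - 1e-7)
        return -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))

    # Stand-in discriminator scores for one batch (batch_size = 64):
    d_real  = np.random.uniform(size=64)  # D(real image, right text)
    d_wrong = np.random.uniform(size=64)  # D(wrong image, right text)
    d_fake  = np.random.uniform(size=64)  # D(G(z, text), right text)

    d_loss = bce(d_real, 1) + bce(d_wrong, 0) + bce(d_fake, 0)
    g_loss = bce(d_fake, 1)  # the generator tries to make fakes score as real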
  • Generating Images from Captions

    • Write the captions in a text file and save it as Data/sample_captions.txt (an example appears below). Generate the skip thought vectors for these captions using:
    python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"
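    For example, Data/sample_captions.txt holds plain-text captions, one per line; the captions from the samples section below work directly:

        the flower shown has yellow anther red pistil and bright red petals
        this flower has petals that are yellow, white and purple and has dark lines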
    
    • Generate the Images for the thought vectors using:
    python generate_images.py --model_path=<path to the trained model> --n_images=8
    

    n_images specifies the number of images to be generated per caption. The generated images are saved in Data/val_samples/. Run python generate_images.py --help for more options.

Sample Images Generated

The following are the captions from which the generative model produced the sample images:

  • the flower shown has yellow anther red pistil and bright red petals
  • this flower has petals that are yellow, white and purple and has dark lines
  • the petals on this flower are white with a yellow center
  • this flower has a lot of small round pink petals.
  • this flower is orange in color, and has petals that are ruffled and rounded.
  • the flower has yellow petals and the center of it is brown

Implementation Details

  • Only the uni-skip vectors from the skip thought vectors are used. I have not tried training the model with combine-skip vectors.
  • The model was trained for around 200 epochs on a GPU. This took roughly 2-3 days.
  • The images generated are 64 x 64 in dimension.
  • While processing the batches before training, the images are flipped horizontally with a probability of 0.5 (sketched after this list).
  • The train-val split is 0.75.
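The horizontal-flip augmentation from this list, as a minimal NumPy sketch (the function name is illustrative):

    import numpy as np

    def maybe_flip(batch, p=0.5):
        # batch: (batch_size, height, width, channels); each image is
        # mirrored left-right independently with probability p
        out = batch.copy()
        for i in range(out.shape[0]):
            if np.random.rand() < p:
                out[i] = out[i][:, ::-1, :]
        return out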

Pre-trained Models

  • Download the pretrained model from here and save it in Data/Models. Use this path for generating the images.

TODO

  • Train the model on the MS-COCO dataset and generate more generic images.
  • Try different embedding options for captions (other than skip thought vectors). Also try training the caption embedding RNN jointly with the GAN-CLS model.

References

  • Generative Adversarial Text-to-Image Synthesis, Reed et al. (https://arxiv.org/abs/1605.05396)
  • Skip-Thought Vectors, Kiros et al. (https://arxiv.org/abs/1506.06726)
  • DCGAN in Tensorflow, on which this implementation is built

Alternate Implementations

License

MIT

Comments
  • training error for MS-COCO


    When I tried to run training on the MS-COCO dataset,

    python data_loader.py --data_set='MS-COCO' --data_dir='MSCOCO-data'
    

    I downloaded the images and captions from the MS-COCO dataset (http://mscoco.org/dataset/#download). The contents of the MSCOCO-data/ folder are as follows:

    MSCOCO-data/
    |-- annotations
    |   |-- captions_train2014.json
    |   |-- captions_val2014.json
    |   |-- instances_train2014.json
    |   |-- instances_val2014.json
    |   |-- person_keypoints_train2014.json
    |   |-- person_keypoints_val2014.json
    |-- meta_train.pkl
    |-- train2014
    |   |-- COCO_train2014_000000000009.jpg
    |   |-- COCO_train2014_000000000025.jpg

    I got the below error while running the training code:

    Traceback (most recent call last):
      File "data_loader.py", line 111, in <module>
        main()
      File "data_loader.py", line 108, in main
        save_caption_vectors_ms_coco(args.data_dir, args.split, args.batch_size)
      File "data_loader.py", line 39, in save_caption_vectors_ms_coco
        h5f_tv_batch = h5py.File( join(data_dir, 'tvs/'+split + '_tvs_' + str(batch_no)), 'w')
      File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 272, in __init__
        fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
      File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 98, in make_fid
        fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2684)
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2642)
      File "h5py/h5f.pyx", line 96, in h5py.h5f.create (/tmp/pip-4rPeHA-build/h5py/h5f.c:2097)
    IOError: Unable to create file (Unable to open file: name = '/home/nitish/mscoco-data/tvs/train_tvs_0', errno = 2, error message = 'no such file or directory', flags = 13, o_flags = 242)
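    The IOError at the bottom of the trace says h5py cannot create the file because the tvs/ directory under the data directory does not exist (errno 2). One likely fix, assuming data_loader.py expects that directory to already be present:

        import os

        # data_loader.py writes per-batch caption vectors into <data_dir>/tvs/
        tvs_dir = os.path.join('MSCOCO-data', 'tvs')
        if not os.path.isdir(tvs_dir):
            os.makedirs(tvs_dir)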
    
    opened by nitish11 13
  • Same error while generating as well as training

    Traceback (most recent call last):
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1628, in _create_c_op
        c_op = c_api.TF_FinishOperation(op_desc)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "train.py", line 238, in <module>
        main()
      File "train.py", line 76, in main
        input_tensors, variables, loss, outputs, checks = gan.build_model()
      File "/home/bsatya/Study/ML/text-to-image/model.py", line 39, in build_model
        fake_image = self.generator(t_z, t_real_caption)
      File "/home/bsatya/Study/ML/text-to-image/model.py", line 139, in generator
        z_concat = tf.concat(1, [t_z, reduced_text_embedding])
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1121, in concat
        dtype=dtypes.int32).get_shape().assert_is_compatible_with(
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1050, in convert_to_tensor
        as_ref=False)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 971, in _autopacking_conversion_function
        return _autopacking_helper(v, dtype, name or "packed")
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 923, in _autopacking_helper
        return gen_array_ops.pack(elems_as_tensors, name=scope)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 4875, in pack
        "Pack", values=values, axis=axis, name=name)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
        op_def=op_def)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
        return func(*args, **kwargs)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
        op_def=op_def)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1792, in __init__
        control_input_ops)
      File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1631, in _create_c_op
        raise ValueError(str(e))
    ValueError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].
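    This trace is the usual symptom of running the repository's TF 0.x-style call tf.concat(concat_dim, values) under TensorFlow 1.0 or later, where the argument order became tf.concat(values, axis); the old integer axis gets packed together with the tensors, producing the [64,100]/[64,256] mismatch. A hedged sketch of the change at model.py line 139 (line number from the trace):

        import tensorflow.compat.v1 as tf
        tf.disable_v2_behavior()

        t_z = tf.placeholder(tf.float32, [64, 100])                     # noise
        reduced_text_embedding = tf.placeholder(tf.float32, [64, 256])  # text feature

        # TF 0.x (as in model.py): z_concat = tf.concat(1, [t_z, reduced_text_embedding])
        # TensorFlow >= 1.0 argument order:
        z_concat = tf.concat([t_z, reduced_text_embedding], axis=1)     # shape (64, 356)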

    opened by BhargavSatya 3
  • Add download helper for data files


    There are quite a few directories to create and files to download and extract, and this script makes it easier to get everything set up. For convenience I also created and included a 6.5 MB tar.bz2 of the flower captions.

    opened by neilsh 3
  • Input of Discriminator


    Hi, here's another question. Here the discriminator's input is a wrong image with the right text. However, according to the original paper (page 5), they use a real image with wrong text, which is different from your implementation.

    opened by chingyaoc 2
  • Evaluation Metrics


    Hi, first of all, thanks for your awesome work! I'm wondering whether there are any evaluation metrics for this kind of generative model, since I want to compare the performance of Skip Thought Vectors against other caption embedding options.

    opened by chingyaoc 2
  • Trying to generate images using pre-trained model


    What I did

    • Downloaded the pre-trained model
    • Created a file (j.caption) with a sample caption
    • Ran: python generate_thought_vectors.py --caption_file=j.caption
    • Got the following error, any ideas?
    ['pink flower with green leaves']
    Loading model parameters...
    Traceback (most recent call last):
      File "generate_thought_vectors.py", line 32, in <module>
        main()
      File "generate_thought_vectors.py", line 23, in main
        model = skipthoughts.load_model()
      File "/Users/jikkujose/Projects/outside_projects/text-to-image/skipthoughts.py", line 38, in load_model
        with open('%s.pkl'%path_to_umodel, 'rb') as f:
    

    Specs

    • Mac OSX 10.11.6
    • Python 2.7.11
    opened by jikkujose 2
  • Data Loader


    ['image_06288.jpg', 'image_08158.jpg', 'image_05755.jpg', 'image_02589.jpg', 'image_07136.jpg', 'image_00386.jpg', 'image_03853.jpg', 'image_02558.jpg', 'image_02394.jpg', 'image_03728.jpg', 'image_00459.jpg', 'image_02971.jpg', 'image_05165.jpg', 'image_00200.jpg', 'image_05604.jpg', 'image_00165.jpg', 'image_06698.jpg', 'image_07523.jpg', 'image_05897.jpg', 'image_00893.jpg', 'image_07566.jpg', 'image_02521.jpg', 'image_02177.jpg', 'image_08017.jpg', 'image_04124.jpg', 'image_01274.jpg', 'image_01322.jpg', 'image_05418.jpg', 'image_03020.jpg', 'image_04845.jpg', 'image_02937.jpg', 'image_04535.jpg', 'image_05900.jpg', 'image_06085.jpg', 'image_01547.jpg', 'image_01584.jpg', 'image_03256.jpg', 'image_04241.jpg', 'image_07573.jpg', 'image_07429.jpg', 'image_05473.jpg', 'image_03866.jpg', 'image_02641.jpg', 'image_02421.jpg', 'image_03829.jpg', 'image_00172.jpg', 'image_04244.jpg', 'image_01564.jpg', 'image_00103.jpg', 'image_00394.jpg', 'image_06692.jpg', 'image_07503.jpg', 'image_02566.jpg', 'image_07964.jpg', 'image_07431.jpg', 'image_04724.jpg', 'image_02230.jpg', 'image_02434.jpg', 'image_05386.jpg', 'image_05502.jpg', 'image_00485.jpg', 'image_04411.jpg', 'image_01350.jpg', 'image_03127.jpg', 'image_06379.jpg', 'image_05504.jpg', 'image_04690.jpg', 'image_06777.jpg', 'image_04227.jpg', 'image_02020.jpg', 'image_03077.jpg', 'image_07699.jpg', 'image_05176.jpg', 'image_03054.jpg', 'image_03833.jpg', 'image_04077.jpg', 'image_04581.jpg', 'image_01178.jpg', 'image_05925.jpg', 'image_04951.jpg', 'image_01707.jpg', 'image_00215.jpg', 'image_00497.jpg', 'image_01570.jpg', 'image_04317.jpg', 'image_04728.jpg', 'image_00960.jpg', 'image_00775.jpg', 'image_05898.jpg', 'image_00606.jpg', 'image_01223.jpg', 'image_03507.jpg', 'image_00898.jpg', 'image_07854.jpg', 'image_05882.jpg', 'image_06700.jpg', 'image_04863.jpg', 'image_01872.jpg', 'image_06113.jpg', 'image_05639.jpg']
    8189 8189
    Loading model parameters...
    Compiling encoders...
    Loading tables...
    Traceback (most recent call last):
      File "data_loader.py", line 110, in <module>
        main()
      File "data_loader.py", line 105, in main
        save_caption_vectors_flowers(args.data_dir)
      File "data_loader.py", line 76, in save_caption_vectors_flowers
        model = skipthoughts.load_model()
      File "/home/sarah/text-to-image/skipthoughts.py", line 60, in load_model
        utable, btable = load_tables()
      File "/home/sarah/text-to-image/skipthoughts.py", line 80, in load_tables
        utable = numpy.load(path_to_tables + 'utable.npy',encoding='latin1')
      File "/home/sarah/.local/lib/python3.7/site-packages/numpy/lib/npyio.py", line 453, in load
        pickle_kwargs=pickle_kwargs)
      File "/home/sarah/.local/lib/python3.7/site-packages/numpy/lib/format.py", line 739, in read_array
        raise ValueError("Object arrays cannot be loaded when "
    ValueError: Object arrays cannot be loaded when allow_pickle=False

    How can I solve it? (Python 3.7, NumPy 1.18.4, TensorFlow 2.2.1, Gast 0.3.3)
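    NumPy 1.16.3 and later default numpy.load to allow_pickle=False, and the skip thought tables are pickled object arrays, which matches the trace above. A possible workaround is to pass allow_pickle=True at the load_tables call site in skipthoughts.py (the path below assumes the Data/skipthoughts layout from this README):

        import numpy

        # skipthoughts.py, load_tables(): pass allow_pickle=True explicitly.
        path_to_tables = 'Data/skipthoughts/'  # assumed layout from this README
        utable = numpy.load(path_to_tables + 'utable.npy',
                            encoding='latin1', allow_pickle=True)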

    opened by shiningstar93 1
  • unable to open pickle of skipthoughts


    XXX@XXX:~/codes/python_codes/text-to-image$ python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"
    ['the flower shown has yellow anther red pistil and bright red petals']
    Loading model parameters...
    Compiling encoders...
    Loading tables...
    Traceback (most recent call last):
      File "generate_thought_vectors.py", line 33, in <module>
        main()
      File "generate_thought_vectors.py", line 23, in main
        model = skipthoughts.load_model()
      File "/home/tushar/codes/python_codes/text-to-image/skipthoughts.py", line 60, in load_model
        utable, btable = load_tables()
      File "/home/tushar/codes/python_codes/text-to-image/skipthoughts.py", line 80, in load_tables
        utable = numpy.load(path_to_tables + 'utable.npy')
      File "/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 406, in load
        pickle_kwargs=pickle_kwargs)
      File "/usr/local/lib/python2.7/dist-packages/numpy/lib/format.py", line 637, in read_array
        array = pickle.load(fp, **pickle_kwargs)
    EOFError

    opened by Exception4U 1
  • Follow-up to feedback on download_datasets.py


    Incorporated feedback on https://github.com/paarthneekhara/text-to-image/pull/6 and made some other fixes:

    • moved flowers_text_c10.tar.bz2 to Data/
    • corrected some extraction paths
    • modified README.md to mention new download_datasets.py
    • added a requirements.txt
    opened by neilsh 1
  • Did you mean to set reuse=tf.AUTO_REUSE in VarScope?


    2021-05-26 00:09:24.773721: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
    2021-05-26 00:09:24.781327: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    WARNING:tensorflow:From C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term
    Traceback (most recent call last):
      File "generate_images.py", line 106, in <module>
        main()
      File "generate_images.py", line 64, in main
        _, _, _, _, _ = gan.build_model()
      File "C:\Users\Futura Labs\Documents\codes\speechtext_to_image\text-to-image-master\model.py", line 42, in build_model
        disc_wrong_image, disc_wrong_image_logits = self.discriminator(t_wrong_image, t_real_caption, reuse = True)
      File "C:\Users\Futura Labs\Documents\codes\speechtext_to_image\text-to-image-master\model.py", line 163, in discriminator
        h1 = ops.lrelu( self.d_bn1(ops.conv2d(h0, self.options['df_dim']*2, name = 'd_h1_conv'))) #16
      File "C:\Users\Futura Labs\Documents\codes\speechtext_to_image\text-to-image-master\Utils\ops.py", line 36, in __call__
        ema_apply_op = self.ema.apply([batch_mean, batch_var])
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\moving_averages.py", line 469, in apply
        "Variable", "VariableV2", "VarHandleOp"
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\slot_creator.py", line 197, in create_zeros_slot
        colocate_with_primary=colocate_with_primary)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\slot_creator.py", line 174, in create_slot_with_initializer
        dtype)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\slot_creator.py", line 73, in _create_slot_var
        validate_shape=validate_shape)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1593, in get_variable
        aggregation=aggregation)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1336, in get_variable
        aggregation=aggregation)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 591, in get_variable
        aggregation=aggregation)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 543, in _true_getter
        aggregation=aggregation)
      File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable
        "reuse=tf.AUTO_REUSE in VarScope?" % name)
    ValueError: Variable d_bn1/d_bn1_2/moments/Squeeze/ExponentialMovingAverage/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

    opened by thefuturalabs 0
  • data_loader.py


    ValueError: ('The following error happened while compiling the node', forall_inplace,cpu,encoder__layers}(Elemwise{Maximum}[(0, 0)].0, Elemwise{sub,no_inplace}.0, InplaceDimShuffle{0,1,x}.0, Subtensor{int64:int64:int8}.0, Subtensor{int64:int64:int8}.0, IncSubtensor{InplaceSet;:int64:}.0, encoder_U, encoder_Ux, ScalarFromTensor.0, ScalarFromTensor.0), '\n', 'numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216')

    When I run this code, this error occurs. Does anyone know how to solve it?

    opened by silverlilin 0
  • generating images error


    Traceback (most recent call last):
      File "encode_text.py", line 32, in <module>
        main()
      File "encode_text.py", line 16, in main
        model = skipthoughts.load_model()
      File "/home/anu/Downloads/TAC-GAN-master/skipthoughts.py", line 60, in load_model
        utable, btable = load_tables()
      File "/home/anu/Downloads/TAC-GAN-master/skipthoughts.py", line 80, in load_tables
        utable = numpy.load(path_to_tables + 'utable.npy', encoding='bytes')
      File "/home/anu/.local/lib/python3.6/site-packages/numpy/lib/npyio.py", line 453, in load
        pickle_kwargs=pickle_kwargs)
      File "/home/anu/.local/lib/python3.6/site-packages/numpy/lib/format.py", line 739, in read_array
        raise ValueError("Object arrays cannot be loaded when "
    ValueError: Object arrays cannot be loaded when allow_pickle=False

    How can I solve this?

    opened by anudeekshith 1
  • From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256]


    /home/naveen/anaconda3/lib/python3.7/site-packages/dask/config.py:168: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
      data = yaml.load(f.read()) or {}
    WARNING:tensorflow:From /home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
    Instructions for updating:
    If using Keras pass *_constraint arguments to layers.
    (64, 256)
    Traceback (most recent call last):
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1610, in _create_c_op
        c_op = c_api.TF_FinishOperation(op_desc)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/naveen/eclipse-workspace/text-to-image-master/train.py", line 238, in <module>
        main()
      File "/home/naveen/eclipse-workspace/text-to-image-master/train.py", line 76, in main
        input_tensors, variables, loss, outputs, checks = gan.build_model()
      File "/home/naveen/eclipse-workspace/text-to-image-master/model.py", line 39, in build_model
        fake_image = self.generator(t_z, t_real_caption)
      File "/home/naveen/eclipse-workspace/text-to-image-master/model.py", line 141, in generator
        z_concat = tf.concat(1, [t_z, reduced_text_embedding])
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
        return target(*args, **kwargs)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1427, in concat
        ops.convert_to_tensor(axis, name="concat_dim",dtype=dtypes.int32).get_shape().assert_has_rank(0)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1184, in convert_to_tensor
        return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1242, in convert_to_tensor_v2
        as_ref=False)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1296, in internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1278, in _autopacking_conversion_function
        return _autopacking_helper(v, dtype, name or "packed")
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1214, in _autopacking_helper
        return gen_array_ops.pack(elems_as_tensors, name=scope)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_array_ops.py", line 6304, in pack
        "Pack", values=values, axis=axis, name=name)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 793, in _apply_op_helper
        op_def=op_def)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3360, in create_op
        attrs, op_def, compute_device)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3429, in _create_op_internal
        op_def=op_def)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1773, in __init__
        control_input_ops)
      File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1613, in _create_c_op
        raise ValueError(str(e))
    ValueError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].

    opened by krnk111 2
  • generate_thought_vectors.py


    Write the captions in text file, and save it as Data/sample_captions.txt. Generate the skip thought vectors for these captions using: python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"

    When I want to run this command, there is no sample_captions.txt file in the repository. I want to know how this file is generated. Do you have a generated file to share? @neilsh @paarthneekhara @gitter-badger @AbhishekNarayanan

    opened by silverlilin 2
  • From merging shape 0 with other shapes. for 'h3_concat/concat_dim' (op: 'Pack') with input shapes: [8,4,4,512], [8,4,4,256].

    Traceback (most recent call last):
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1628, in _create_c_op
        c_op = c_api.TF_FinishOperation(op_desc)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 3 in both shapes must be equal, but are 512 and 256. Shapes are [8,4,4,512] and [8,4,4,256]. From merging shape 0 with other shapes. for 'h3_concat/concat_dim' (op: 'Pack') with input shapes: [8,4,4,512], [8,4,4,256].

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "generate_images.py", line 106, in <module>
        main()
      File "generate_images.py", line 64, in main
        _, _, _, _, _ = gan.build_model()
      File "C:\Users\RanjanNi\Desktop\t2i_skip\Python 3 Codes\model.py", line 39, in build_model
        disc_real_image, disc_real_image_logits = self.discriminator(t_real_image, t_real_caption)
      File "C:\Users\RanjanNi\Desktop\t2i_skip\Python 3 Codes\model.py", line 171, in discriminator
        h3_concat = tf.concat( 3, [h3, tiled_embeddings], name='h3_concat')
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1121, in concat
        dtype=dtypes.int32).get_shape().assert_is_compatible_with(
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1050, in convert_to_tensor
        as_ref=False)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1146, in internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 971, in _autopacking_conversion_function
        return _autopacking_helper(v, dtype, name or "packed")
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 923, in _autopacking_helper
        return gen_array_ops.pack(elems_as_tensors, name=scope)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 5856, in pack
        "Pack", values=values, axis=axis, name=name)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
        op_def=op_def)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
        return func(*args, **kwargs)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
        op_def=op_def)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1792, in __init__
        control_input_ops)
      File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1631, in _create_c_op
        raise ValueError(str(e))
    ValueError: Dimension 3 in both shapes must be equal, but are 512 and 256. Shapes are [8,4,4,512] and [8,4,4,256]. From merging shape 0 with other shapes. for 'h3_concat/concat_dim' (op: 'Pack') with input shapes: [8,4,4,512], [8,4,4,256].

    opened by conquistador3 4
Owner: Paarth Neekhara (PhD student, Computer Science, UCSD)