Text to image synthesis using thought vectors

Paarth Neekhara

Last update: Jan 5, 2023

Overview

Text To Image Synthesis Using Thought Vectors

This is an experimental TensorFlow implementation of synthesizing images from captions using Skip Thought Vectors. The images are synthesized using the GAN-CLS algorithm from the paper Generative Adversarial Text-to-Image Synthesis. This implementation is built on top of the excellent DCGAN in Tensorflow. The following is the model architecture. The blue bars represent the Skip Thought Vectors for the captions.

Image Source : Generative Adversarial Text-to-Image Synthesis Paper

Requirements

Python 2.7.6
Tensorflow
h5py
Theano : for skip thought vectors
scikit-learn : for skip thought vectors
NLTK : for skip thought vectors

Datasets

All the steps below for downloading the datasets and models can be performed automatically by running python download_datasets.py. Several gigabytes of files will be downloaded and extracted.
The model is currently trained on the flowers dataset. Download the images from this link and save them in Data/flowers/jpg. Also download the captions from this link. Extract the archive, copy the text_c10 folder and paste it in Data/flowers.
Download the pretrained models and vocabulary for skip thought vectors as per the instructions given here. Save the downloaded files in Data/skipthoughts.
Create three empty directories inside Data: Data/samples, Data/val_samples and Data/Models. They will be used for sampling the generated images and saving the trained models.

Usage

Data Processing : Extract the skip thought vectors for the flowers data set using :

python data_loader.py --data_set="flowers"

Training
- Basic usage: python train.py --data_set="flowers"
- Options
  - z_dim: Noise Dimension. Default is 100.
  - t_dim: Text feature dimension. Default is 256.
  - batch_size: Batch Size. Default is 64.
  - image_size: Image dimension. Default is 64.
  - gf_dim: Number of convolution filters in the first layer of the generator. Default is 64.
  - df_dim: Number of convolution filters in the first layer of the discriminator. Default is 64.
  - gfc_dim: Dimension of the generator units for the fully connected layer. Default is 1024.
  - caption_vector_length: Length of the caption vector. Default is 1024.
  - data_dir: Data Directory. Default is Data/.
  - learning_rate: Learning Rate. Default is 0.0002.
  - beta1: Momentum for adam update. Default is 0.5.
  - epochs: Max number of epochs. Default is 600.
  - resume_model: Resume training from a pretrained model path.
  - data_set: Data Set to train on. Default is flowers.
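
For reference, the options and defaults listed above could be declared with argparse roughly as follows. This is a sketch under assumptions: it assumes every option is passed as a --name flag like --data_set in the basic usage command, the argument types are inferred from the defaults, and the None default for resume_model is a guess. The actual train.py may differ.

```python
import argparse


def build_parser():
    """Declare the training options with the defaults listed in the README."""
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("--z_dim", type=int, default=100, help="Noise dimension")
    parser.add_argument("--t_dim", type=int, default=256, help="Text feature dimension")
    parser.add_argument("--batch_size", type=int, default=64, help="Batch size")
    parser.add_argument("--image_size", type=int, default=64, help="Image dimension")
    parser.add_argument("--gf_dim", type=int, default=64, help="Filters in first generator layer")
    parser.add_argument("--df_dim", type=int, default=64, help="Filters in first discriminator layer")
    parser.add_argument("--gfc_dim", type=int, default=1024, help="Generator fully connected units")
    parser.add_argument("--caption_vector_length", type=int, default=1024, help="Caption vector length")
    parser.add_argument("--data_dir", type=str, default="Data/", help="Data directory")
    parser.add_argument("--learning_rate", type=float, default=0.0002, help="Learning rate")
    parser.add_argument("--beta1", type=float, default=0.5, help="Momentum for the Adam update")
    parser.add_argument("--epochs", type=int, default=600, help="Max number of epochs")
    # Assumption: no checkpoint is resumed unless a path is given.
    parser.add_argument("--resume_model", type=str, default=None, help="Pretrained model path")
    parser.add_argument("--data_set", type=str, default="flowers", help="Data set to train on")
    return parser


# Example: override the batch size and keep every other default.
args = build_parser().parse_args(["--data_set", "flowers", "--batch_size", "32"])
```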
Generating Images from Captions
- Write the captions in text file, and save it as Data/sample_captions.txt. Generate the skip thought vectors for these captions using:
```
python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"
```
- Generate the Images for the thought vectors using:
```
python generate_images.py --model_path=<path to the trained model> --n_images=8
```
n_images specifies the number of images to be generated per caption. The generated images will be saved in Data/val_samples/. Run python generate_images.py --help for more options.

Sample Images Generated

Following are the images generated by the generative model from the captions.

Caption	Generated Images
the flower shown has yellow anther red pistil and bright red petals
this flower has petals that are yellow, white and purple and has dark lines
the petals on this flower are white with a yellow center
this flower has a lot of small round pink petals.
this flower is orange in color, and has petals that are ruffled and rounded.
the flower has yellow petals and the center of it is brown

Implementation Details

Only the uni-skip vectors from the skip thought vectors are used. I have not tried training the model with combine-skip vectors.
The model was trained for around 200 epochs on a GPU. This took roughly 2-3 days.
The images generated are 64 x 64 in dimension.
While processing the batches before training, the images are flipped horizontally with a probability of 0.5.
The train-val split is 0.75.
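
The horizontal flip augmentation mentioned above can be sketched in NumPy as follows. This is an illustration only, not the repository's code: the function name and the height x width x channels layout are assumptions.

```python
import numpy as np


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror an H x W x C image left to right with probability p."""
    # random_sample() draws from [0, 1), so p=1.0 always flips and p=0.0 never does.
    if rng.random_sample() < p:
        return image[:, ::-1, :]
    return image


rng = np.random.RandomState(0)
batch = [np.zeros((64, 64, 3), dtype=np.float32) for _ in range(4)]
augmented = [random_horizontal_flip(img, rng) for img in batch]
```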

Pre-trained Models

Download the pretrained model from here and save it in Data/Models. Use this path for generating the images.

TODO

Train the model on the MS-COCO data set, and generate more generic images.
Try different embedding options for captions (other than skip thought vectors). Also try to train the caption embedding RNN along with the GAN-CLS model.

References

Generative Adversarial Text-to-Image Synthesis Paper
Generative Adversarial Text-to-Image Synthesis Code
Skip Thought Vectors Paper
Skip Thought Vectors Code
DCGAN in Tensorflow
DCGAN in Tensorlayer

Alternate Implementations

License

MIT

Comments

training error for MS-COCO

When I tried to run training on MS-COCO dataset,

python data_loader.py --data_set='MS-COCO' --data_dir='MSCOCO-data'

I got the below error while running the training code:

Traceback (most recent call last):
  File "data_loader.py", line 111, in <module>
    main()
  File "data_loader.py", line 108, in main
    save_caption_vectors_ms_coco(args.data_dir, args.split, args.batch_size)
  File "data_loader.py", line 39, in save_caption_vectors_ms_coco
    h5f_tv_batch = h5py.File( join(data_dir, 'tvs/'+split + '_tvs_' + str(batch_no)), 'w')
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 272, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 98, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2684)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2642)
  File "h5py/h5f.pyx", line 96, in h5py.h5f.create (/tmp/pip-4rPeHA-build/h5py/h5f.c:2097)
IOError: Unable to create file (Unable to open file: name = '/home/nitish/mscoco-data/tvs/train_tvs_0', errno = 2, error message = 'no such file or directory', flags = 13, o_flags = 242)
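
The errno 2 in the last line suggests that the tvs directory under the data directory does not exist; the file-creation call does not create missing parent directories. One plausible workaround, sketched below with a hypothetical helper name and a plain text file standing in for the HDF5 file, is to create the parent directory first. With h5py the same helper would wrap the path passed to h5py.File.

```python
import os
import tempfile


def ensure_parent_dir(file_path):
    """Create the directory that will contain file_path, if it is missing."""
    parent = os.path.dirname(file_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    return file_path


# Demonstration in a temporary directory; the "tvs/train_tvs_0" layout
# is copied from the traceback above.
data_dir = tempfile.mkdtemp()
target = ensure_parent_dir(os.path.join(data_dir, "tvs", "train_tvs_0"))
with open(target, "w") as f:
    f.write("placeholder")
```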

opened by nitish11 13

Same error while generating as well as training

Traceback (most recent call last):
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1628, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 238, in <module>
    main()
  File "train.py", line 76, in main
    input_tensors, variables, loss, outputs, checks = gan.build_model()
  File "/home/bsatya/Study/ML/text-to-image/model.py", line 39, in build_model
    fake_image = self.generator(t_z, t_real_caption)
  File "/home/bsatya/Study/ML/text-to-image/model.py", line 139, in generator
    z_concat = tf.concat(1, [t_z, reduced_text_embedding])
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1121, in concat
    dtype=dtypes.int32).get_shape().assert_is_compatible_with(
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1050, in convert_to_tensor
    as_ref=False)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 971, in _autopacking_conversion_function
    return _autopacking_helper(v, dtype, name or "packed")
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 923, in _autopacking_helper
    return gen_array_ops.pack(elems_as_tensors, name=scope)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 4875, in pack
    "Pack", values=values, axis=axis, name=name)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1792, in __init__
    control_input_ops)
  File "/home/bsatya/Study/ML/p3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1631, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].
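
The failing call in model.py is tf.concat(1, [t_z, reduced_text_embedding]), which passes the axis first. Since TensorFlow 1.0 the signature is tf.concat(values, axis), so the list lands in the axis slot and TensorFlow tries to pack the two tensors into a single one. That fails because their shapes [64,100] and [64,256] differ. Swapping the arguments, as in tf.concat([t_z, reduced_text_embedding], 1), is the likely fix. The NumPy sketch below is only an analogy for the shape logic, not TensorFlow code.

```python
import numpy as np

t_z = np.zeros((64, 100))
reduced_text_embedding = np.zeros((64, 256))

# Packing (stacking) requires identical shapes, which is what the
# 'Pack' op in the traceback complains about.
try:
    np.stack([t_z, reduced_text_embedding])
    pack_succeeded = True
except ValueError:
    pack_succeeded = False

# Concatenating along axis 1 only requires the other dimensions to match.
z_concat = np.concatenate([t_z, reduced_text_embedding], axis=1)
```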

opened by BhargavSatya 3
Add download helper for data files

There are quite a few directories to create and files to download and extract, and this script makes it easier to get everything set up. For convenience I also created and included a 6.5MB tar.bz of the flower captions.

opened by neilsh 3
Input of Discriminator

Hi, here's another question. Here the discriminator's input is a wrong image with the right text. However, according to the original paper (page 5), they use a real image with wrong text, which is different from your implementation.

opened by chingyaoc 2
Evaluation Metrics

Hi, first of all, thanks for your awesome work! I'm wondering whether there are any evaluation metrics for this kind of generative model, since I want to compare the performance of Skip Thought Vectors against other embedding options for captions.

opened by chingyaoc 2

Trying to generate images using pre-trained model

What I did

Downloaded the pre-trained model
Created a file (j.caption) with a sample caption
Ran: python generate_thought_vectors.py --caption_file=j.caption
Got the following error, any ideas?

['pink flower with green leaves']
Loading model parameters...
Traceback (most recent call last):
  File "generate_thought_vectors.py", line 32, in <module>
    main()
  File "generate_thought_vectors.py", line 23, in main
    model = skipthoughts.load_model()
  File "/Users/jikkujose/Projects/outside_projects/text-to-image/skipthoughts.py", line 38, in load_model
    with open('%s.pkl'%path_to_umodel, 'rb') as f:

Specs

Mac OSX 10.11.6
Python 2.7.11

opened by jikkujose 2

Data Loader

['image_06288.jpg', 'image_08158.jpg', 'image_05755.jpg', 'image_02589.jpg', 'image_07136.jpg', 'image_00386.jpg', 'image_03853.jpg', 'image_02558.jpg', 'image_02394.jpg', 'image_03728.jpg', 'image_00459.jpg', 'image_02971.jpg', 'image_05165.jpg', 'image_00200.jpg', 'image_05604.jpg', 'image_00165.jpg', 'image_06698.jpg', 'image_07523.jpg', 'image_05897.jpg', 'image_00893.jpg', 'image_07566.jpg', 'image_02521.jpg', 'image_02177.jpg', 'image_08017.jpg', 'image_04124.jpg', 'image_01274.jpg', 'image_01322.jpg', 'image_05418.jpg', 'image_03020.jpg', 'image_04845.jpg', 'image_02937.jpg', 'image_04535.jpg', 'image_05900.jpg', 'image_06085.jpg', 'image_01547.jpg', 'image_01584.jpg', 'image_03256.jpg', 'image_04241.jpg', 'image_07573.jpg', 'image_07429.jpg', 'image_05473.jpg', 'image_03866.jpg', 'image_02641.jpg', 'image_02421.jpg', 'image_03829.jpg', 'image_00172.jpg', 'image_04244.jpg', 'image_01564.jpg', 'image_00103.jpg', 'image_00394.jpg', 'image_06692.jpg', 'image_07503.jpg', 'image_02566.jpg', 'image_07964.jpg', 'image_07431.jpg', 'image_04724.jpg', 'image_02230.jpg', 'image_02434.jpg', 'image_05386.jpg', 'image_05502.jpg', 'image_00485.jpg', 'image_04411.jpg', 'image_01350.jpg', 'image_03127.jpg', 'image_06379.jpg', 'image_05504.jpg', 'image_04690.jpg', 'image_06777.jpg', 'image_04227.jpg', 'image_02020.jpg', 'image_03077.jpg', 'image_07699.jpg', 'image_05176.jpg', 'image_03054.jpg', 'image_03833.jpg', 'image_04077.jpg', 'image_04581.jpg', 'image_01178.jpg', 'image_05925.jpg', 'image_04951.jpg', 'image_01707.jpg', 'image_00215.jpg', 'image_00497.jpg', 'image_01570.jpg', 'image_04317.jpg', 'image_04728.jpg', 'image_00960.jpg', 'image_00775.jpg', 'image_05898.jpg', 'image_00606.jpg', 'image_01223.jpg', 'image_03507.jpg', 'image_00898.jpg', 'image_07854.jpg', 'image_05882.jpg', 'image_06700.jpg', 'image_04863.jpg', 'image_01872.jpg', 'image_06113.jpg', 'image_05639.jpg'] 8189 8189 Loading model parameters... Compiling encoders... Loading tables... 
Traceback (most recent call last):
  File "data_loader.py", line 110, in <module>
    main()
  File "data_loader.py", line 105, in main
    save_caption_vectors_flowers(args.data_dir)
  File "data_loader.py", line 76, in save_caption_vectors_flowers
    model = skipthoughts.load_model()
  File "/home/sarah/text-to-image/skipthoughts.py", line 60, in load_model
    utable, btable = load_tables()
  File "/home/sarah/text-to-image/skipthoughts.py", line 80, in load_tables
    utable = numpy.load(path_to_tables + 'utable.npy',encoding='latin1')
  File "/home/sarah/.local/lib/python3.7/site-packages/numpy/lib/npyio.py", line 453, in load
    pickle_kwargs=pickle_kwargs)
  File "/home/sarah/.local/lib/python3.7/site-packages/numpy/lib/format.py", line 739, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False

How can I solve it? Environment: Python 3.7, NumPy 1.18.4, TensorFlow 2.2.1, gast 0.3.3.
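
The error message points at the cause: recent NumPy releases default to allow_pickle=False in numpy.load, and utable.npy is stored as a pickled object array. A likely workaround is to pass allow_pickle=True in the load_tables call. The sketch below demonstrates this on a small stand-in file; the helper name load_object_array and the demo file are hypothetical. Only enable pickling for files from a trusted source.

```python
import os
import tempfile

import numpy as np


def load_object_array(path, encoding="latin1"):
    """Load a .npy file that stores a pickled object array."""
    # Unpickling can execute arbitrary code, so use this only on trusted files.
    return np.load(path, allow_pickle=True, encoding=encoding)


# Stand-in for utable.npy: a small object array saved to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_table.npy")
demo = np.empty(2, dtype=object)
demo[0] = "flower"
demo[1] = [1.0, 2.0]
np.save(demo_path, demo, allow_pickle=True)

loaded = load_object_array(demo_path)
```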

opened by shiningstar93 1
unable to open pickle of skipthoughts

XXX@XXX:~/codes/python_codes/text-to-image$ python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"
['the flower shown has yellow anther red pistil and bright red petals']
Loading model parameters...
Compiling encoders...
Loading tables...
Traceback (most recent call last):
  File "generate_thought_vectors.py", line 33, in <module>
    main()
  File "generate_thought_vectors.py", line 23, in main
    model = skipthoughts.load_model()
  File "/home/tushar/codes/python_codes/text-to-image/skipthoughts.py", line 60, in load_model
    utable, btable = load_tables()
  File "/home/tushar/codes/python_codes/text-to-image/skipthoughts.py", line 80, in load_tables
    utable = numpy.load(path_to_tables + 'utable.npy')
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 406, in load
    pickle_kwargs=pickle_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/format.py", line 637, in read_array
    array = pickle.load(fp, **pickle_kwargs)
EOFError

opened by Exception4U 1
Follow-up to feedback on download_datasets.py
Incorporated feedback on https://github.com/paarthneekhara/text-to-image/pull/6 and made some other fixes:

moved flowers_text_c10.tar.bz2 to Data/

corrected some extraction paths

modified README.md to mention new download_datasets.py

added a requirements.txt
opened by neilsh 1
Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

2021-05-26 00:09:24.773721: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found 2021-05-26 00:09:24.781327: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. WARNING:tensorflow:From C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term Traceback (most recent call last): File "generate_images.py", line 106, in main() File "generate_images.py", line 64, in main _, _, _, _, _ = gan.build_model() File "C:\Users\Futura Labs\Documents\codes\speechtext_to_image\text-to-image-master\model.py", line 42, in build_model disc_wrong_image, disc_wrong_image_logits = self.discriminator(t_wrong_image, t_real_caption, reuse = True) File "C:\Users\Futura Labs\Documents\codes\speechtext_to_image\text-to-image-master\model.py", line 163, in discriminator h1 = ops.lrelu( self.d_bn1(ops.conv2d(h0, self.options['df_dim']*2, name = 'd_h1_conv'))) #16 File "C:\Users\Futura Labs\Documents\codes\speechtext_to_image\text-to-image-master\Utils\ops.py", line 36, in call ema_apply_op = self.ema.apply([batch_mean, batch_var]) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\moving_averages.py", line 469, in apply "Variable", "VariableV2", "VarHandleOp" File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\slot_creator.py", line 197, in create_zeros_slot colocate_with_primary=colocate_with_primary) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\slot_creator.py", line 174, in create_slot_with_initializer dtype) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\training\slot_creator.py", line 73, in _create_slot_var 
validate_shape=validate_shape) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1593, in get_variable aggregation=aggregation) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1336, in get_variable aggregation=aggregation) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 591, in get_variable aggregation=aggregation) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 543, in _true_getter aggregation=aggregation) File "C:\pythn\envs\abhin\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable "reuse=tf.AUTO_REUSE in VarScope?" % name) ValueError: Variable d_bn1/d_bn1_2/moments/Squeeze/ExponentialMovingAverage/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

opened by thefuturalabs 0
data_loader.py

ValueError: ('The following error happened while compiling the node', forall_inplace,cpu,encoder__layers}(Elemwise{Maximum}[(0, 0)].0, Elemwise{sub,no_inplace}.0, InplaceDimShuffle{0,1,x}.0, Subtensor{int64:int64:int8}.0, Subtensor{int64:int64:int8}.0, IncSubtensor{InplaceSet;:int64:}.0, encoder_U, encoder_Ux, ScalarFromTensor.0, ScalarFromTensor.0), '\n', 'numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216')

When I run this code, this error occurs. Does anyone know how to solve it?

opened by silverlilin 0
generating images error

Traceback (most recent call last):
  File "encode_text.py", line 32, in <module>
    main()
  File "encode_text.py", line 16, in main
    model = skipthoughts.load_model()
  File "/home/anu/Downloads/TAC-GAN-master/skipthoughts.py", line 60, in load_model
    utable, btable = load_tables()
  File "/home/anu/Downloads/TAC-GAN-master/skipthoughts.py", line 80, in load_tables
    utable = numpy.load(path_to_tables + 'utable.npy', encoding='bytes')
  File "/home/anu/.local/lib/python3.6/site-packages/numpy/lib/npyio.py", line 453, in load
    pickle_kwargs=pickle_kwargs)
  File "/home/anu/.local/lib/python3.6/site-packages/numpy/lib/format.py", line 739, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False

How can I solve this?

opened by anudeekshith 1
From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256]

/home/naveen/anaconda3/lib/python3.7/site-packages/dask/config.py:168: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  data = yaml.load(f.read()) or {}
WARNING:tensorflow:From /home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
(64, 256)
Traceback (most recent call last):
  File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1610, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/home/naveen/eclipse-workspace/text-to-image-master/train.py", line 238, in main() File "/home/naveen/eclipse-workspace/text-to-image-master/train.py", line 76, in main input_tensors, variables, loss, outputs, checks = gan.build_model() File "/home/naveen/eclipse-workspace/text-to-image-master/model.py", line 39, in build_model fake_image = self.generator(t_z, t_real_caption) File "/home/naveen/eclipse-workspace/text-to-image-master/model.py", line 141, in generator z_concat = tf.concat(1, [t_z, reduced_text_embedding]) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper return target(*args, **kwargs) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1427, in concat ops.convert_to_tensor(axis, name="concat_dim",dtype=dtypes.int32).get_shape().assert_has_rank(0) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1184, in convert_to_tensor return convert_to_tensor_v2(value, dtype, preferred_dtype, name) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1242, in convert_to_tensor_v2 as_ref=False) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1296, in internal_convert_to_tensor ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1278, in _autopacking_conversion_function return _autopacking_helper(v, dtype, name or "packed") File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1214, in _autopacking_helper return gen_array_ops.pack(elems_as_tensors, name=scope) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_array_ops.py", line 6304, in pack 
"Pack", values=values, axis=axis, name=name) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 793, in _apply_op_helper op_def=op_def) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func return func(*args, **kwargs) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3360, in create_op attrs, op_def, compute_device) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3429, in _create_op_internal op_def=op_def) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1773, in init control_input_ops) File "/home/naveen/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1613, in _create_c_op raise ValueError(str(e)) ValueError: Dimension 1 in both shapes must be equal, but are 100 and 256. Shapes are [64,100] and [64,256]. From merging shape 0 with other shapes. for 'concat/concat_dim' (op: 'Pack') with input shapes: [64,100], [64,256].

opened by krnk111 2
generate_thought_vectors.py

Write the captions in text file, and save it as Data/sample_captions.txt. Generate the skip thought vectors for these captions using: python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"

When I try to run this code, there is no sample_captions.txt file in the repository. I want to know how this file is generated. Do you have a generated file to share? @neilsh @paarthneekhara @gitter-badger @AbhishekNarayanan

opened by silverlilin 2
From merging shape 0 with other shapes. for 'h3_concat/concat_dim' (op: 'Pack') with input shapes: [8,4,4,512], [8,4,4,256].

Traceback (most recent call last):
  File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1628, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 3 in both shapes must be equal, but are 512 and 256. Shapes are [8,4,4,512] and [8,4,4,256]. From merging shape 0 with other shapes. for 'h3_concat/concat_dim' (op: 'Pack') with input shapes: [8,4,4,512], [8,4,4,256].

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "generate_images.py", line 106, in main() File "generate_images.py", line 64, in main _, _, _, _, _ = gan.build_model() File "C:\Users\RanjanNi\Desktop\t2i_skip\Python 3 Codes\model.py", line 39, in build_model disc_real_image, disc_real_image_logits = self.discriminator(t_real_image, t_real_caption) File "C:\Users\RanjanNi\Desktop\t2i_skip\Python 3 Codes\model.py", line 171, in discriminator h3_concat = tf.concat( 3, [h3, tiled_embeddings], name='h3_concat') File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1121, in concat dtype=dtypes.int32).get_shape().assert_is_compatible_with( File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1050, in convert_to_tensor as_ref=False) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1146, in internal_convert_to_tensor ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 971, in _autopacking_conversion_function return _autopacking_helper(v, dtype, name or "packed") File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\array_ops.py", line 923, in _autopacking_helper return gen_array_ops.pack(elems_as_tensors, name=scope) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 5856, in pack "Pack", values=values, axis=axis, name=name) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File 
"C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func return func(*args, **kwargs) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op op_def=op_def) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1792, in init control_input_ops) File "C:\Users\RanjanNi\AppData\Local\Continuum\anaconda3\envs\py36\lib\site-packages\tensorflow\python\framework\ops.py", line 1631, in _create_c_op raise ValueError(str(e)) ValueError: Dimension 3 in both shapes must be equal, but are 512 and 256. Shapes are [8,4,4,512] and [8,4,4,256]. From merging shape 0 with other shapes. for 'h3_concat/concat_dim' (op: 'Pack') with input shapes: [8,4,4,512], [8,4,4,256].

opened by conquistador3 4

Owner

Paarth Neekhara

PhD student, Computer Science, UCSD

