Neural Caption Generator with Attention

Overview

A TensorFlow implementation of a neural image caption generator with attention ("Show, Attend and Tell"). The model shifts its attention to the relevant part of the image while it generates each word of the caption, using conv5_3 activations of a VGG network as the image features for an attention-based LSTM decoder.

Code

  • make_flickr_dataset.py: Extracts conv5_3 layer activations of the VGG network for flickr30k images and saves them to 'data/feats.npy' (a minimal sketch of this step follows below)
  • model_tensorflow.py: Main model code (train() and test())
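
The following is a minimal sketch of the feature-extraction step using Keras' VGG16 rather than the repo's Caffe-based cnn_util.py; the helper name conv5_3_features and the use of Keras are assumptions, not the repo's actual code. VGG16's block5_conv3 layer yields a 14x14x512 map for 224x224 inputs, matching the layer_sizes=[512, 14, 14] that make_flickr_dataset.py requests.

    # Hedged sketch: VGG conv5_3 feature extraction with Keras (the repo
    # itself goes through Caffe; names here are illustrative).
    import numpy as np
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
    from tensorflow.keras.preprocessing import image
    from tensorflow.keras.models import Model

    base = VGG16(weights='imagenet')
    extractor = Model(base.input, base.get_layer('block5_conv3').output)

    def conv5_3_features(paths):
        batch = np.stack([
            preprocess_input(image.img_to_array(
                image.load_img(p, target_size=(224, 224))))
            for p in paths
        ])
        return extractor.predict(batch)   # shape: (N, 14, 14, 512)

    # feats = conv5_3_features(unique_images)
    # np.save('data/feats.npy', feats)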

Usage

  • Download the flickr30k dataset.
  • Extract VGG conv5_3 features using make_flickr_dataset.py.
  • Train: run train() in model_tensorflow.py.
  • Test: run test() in model_tensorflow.py (a minimal driver sketch follows below).
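
A minimal sketch of driving training and testing from Python; anything beyond the bare train()/test() calls the README names is an assumption:

    # Hedged driver sketch: the README only promises train() and test()
    # entry points in model_tensorflow.py.
    from model_tensorflow import train, test

    train()   # fits the captioner on the features saved in data/feats.npy
    test()    # restores a saved checkpoint and generates captions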

Comments
  • model-8

    Hi @jazzsaxmafia, I just don't know where to find 'model-8', and the issue is:

    "InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on ./model/model-8: Not found: ./model"

    (See the checkpoint sketch after this list.)

    opened by chenfsjz 3
  • what's the file model-8?

    When I run train() in model_tensorflow.py, I get a DataLossError: unable to find model-8. So what is model-8? A pre-trained TensorFlow model? How can I get it? (See the checkpoint sketch after this list.)

    opened by evercherish 2
  • IndexError: index 0 is out of bounds for axis 1 with size 0

    File "/home/hzhou/anaconda3/envs/py35/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2961, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "", line 1, in <module>
      runfile('/home/hzhou/code_More/show_attend_and_tell_p1_2/make_flickr_dataset.py', wdir='/home/hzhou/code_More/show_attend_and_tell_p1_2')
    File "/home/hzhou/zhoueheng/pycharm-2018.2.4/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
      pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
    File "/home/hzhou/zhoueheng/pycharm-2018.2.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
      exec(compile(contents+"\n", file, 'exec'), glob, loc)
    File "/home/hzhou/code_More/show_attend_and_tell_p1_2/make_flickr_dataset.py", line 30, in <module>
      feats = cnn.get_features(unique_images, layers='conv5_3', layer_sizes=[512,14,14])
    File "/home/hzhou/code_More/show_attend_and_tell_p1_2/cnn_util.py", line 72, in get_features
      caffe_in = np.zeros(np.array(image_batch.shape)[[0,3,1,2]], dtype=np.float32)
    IndexError: index 0 is out of bounds for axis 1 with size 0

    (The empty image_batch means no images were loaded; see the path check after this list.)

    opened by zhouheng2018 1
  • How to see attended images?

    I want to see the attended images while testing the model. (The model shifts its attention to the relevant part of the image while it generates each word.)

    How can I visualize the attended images at test time? (See the visualization sketch after this list.)

    opened by SinDongHwan 0
  • train out or not?

    Has anybody trained a good model with the code in this repo? I am trying to reproduce the results, but so far mine are not satisfactory.

    opened by sjksong 0
  • Why is the gradient vanishing?

    Thank you very much for sharing. I want to use the attention model for video classification, but the gradient keeps vanishing during training. Have you had a similar problem before?

    opened by pefectfeng 0
  • TypeError: int() argument must be a string, a bytes-like object or a number, not 'map'

    ERROR LOG:

    C:\Python35\python.exe C:/MainProject/show_attend_and_tell/model_tensorflow.py
    Using TensorFlow backend.
    preprocessing word counts and creating vocab based on word count threshold 30
    filtered words from 20326 to 2942
    Traceback (most recent call last):
      File "C:/MainProject/show_attend_and_tell/model_tensorflow.py", line 334, in <module>
        train()
      File "C:/MainProject/show_attend_and_tell/model_tensorflow.py", line 255, in train
        n_lstm_steps=int(maxlen)+1, # +1: after predicting w1..wN, the final '.' must also be predicted
    TypeError: int() argument must be a string, a bytes-like object or a number, not 'map'
    
    Process finished with exit code 1
    

    Can anyone please tell me how to continue from here? (See the map() fix after this list.)

    opened by AkritiRao 0
  • About state = tf.zeros([self.batch_size, self.lstm.state_size])

    In model.py, both build_model(self) and build_generator(self, maxlen) contain the line state = tf.zeros([self.batch_size, self.lstm.state_size]). However, this line cannot run (I am running TF 0.12): self.lstm.state_size is a tuple, while tf.zeros needs an int here. (See the zero_state sketch after this list.)

    opened by ecilay 0
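
On the two 'model-8' issues above: 'model-8' is not a pre-trained file shipped with the repo; it is the TensorFlow checkpoint that train() writes for epoch 8, so it exists only after training. A minimal TF 1.x sketch of the save/restore cycle, with illustrative variable names:

    # Hedged sketch (TF 1.x era API): train() saves one checkpoint per epoch,
    # so './model/model-8' only exists after at least 9 epochs of training.
    import os
    import tensorflow as tf

    weights = tf.Variable(0.0, name='dummy')          # stand-in for model weights
    saver = tf.train.Saver(max_to_keep=50)
    os.makedirs('./model', exist_ok=True)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(10):
            # ... one epoch of training would run here ...
            saver.save(sess, './model/model', global_step=epoch)  # model-0, model-1, ...
        saver.restore(sess, './model/model-8')        # the file the errors above look for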
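
On the IndexError above: image_batch has shape (0, ...) because no images were loaded, so indexing shape[[0,3,1,2]] fails on the empty axis. A hedged pre-flight check (the image directory path is an assumption):

    # Hedged sketch: verify the flickr30k images are actually where
    # make_flickr_dataset.py expects them before extracting features.
    import glob
    import os

    image_dir = './data/flickr30k-images'             # assumed location
    unique_images = glob.glob(os.path.join(image_dir, '*.jpg'))
    if not unique_images:
        raise RuntimeError('no images found under %s; an empty image batch '
                           'is what triggers the IndexError' % image_dir)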
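
On visualizing attended images: the decoder produces one attention map over the 14x14 feature grid per generated word; upsampling each map to image size and overlaying it shows where the model looked. A hedged sketch (the repo does not ship this helper, and the variable names are assumptions):

    # Hedged sketch: overlay each word's 14x14 attention map on the image.
    import numpy as np
    import matplotlib.pyplot as plt
    import skimage.transform

    def show_attended_images(image, words, alphas):
        """image: HxWx3 array; words: generated caption tokens;
        alphas: one attention vector of 196 weights per word."""
        for i, (word, alpha) in enumerate(zip(words, alphas)):
            plt.subplot(int(np.ceil(len(words) / 5.0)), 5, i + 1)
            plt.imshow(image)
            mask = skimage.transform.pyramid_expand(
                np.asarray(alpha).reshape(14, 14), upscale=16, sigma=20)
            plt.imshow(mask, alpha=0.8, cmap='gray')   # bright = attended region
            plt.title(word)
            plt.axis('off')
        plt.show()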
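
On the TypeError above: in Python 2, map() returns a list, so taking the max and calling int() on it works; in Python 3, map() returns a lazy iterator, and int() on it raises exactly this error. A minimal fix sketch (the caption data is a stand-in):

    # Hedged sketch of the Python 2 -> 3 fix for computing maxlen.
    captions = ['a dog runs .', 'a man rides a horse .']    # stand-in data

    maxlen = max(len(c.split(' ')) for c in captions)        # Python 3 safe
    # or: maxlen = max(map(lambda x: len(x.split(' ')), captions))
    n_lstm_steps = int(maxlen) + 1                           # +1 for the final '.'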
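
On the state_size issue: on newer TF 1.x releases, BasicLSTMCell's state_size is an LSTMStateTuple (c, h) rather than an int, so tf.zeros([batch_size, state_size]) no longer type-checks. The cell's own zero_state helper builds a correctly shaped initial state; a minimal sketch:

    # Hedged sketch: replace the tf.zeros initial state with zero_state.
    import tensorflow as tf

    batch_size, dim_hidden = 64, 256
    lstm = tf.nn.rnn_cell.BasicLSTMCell(dim_hidden)

    # old (breaks once state_size is a tuple):
    # state = tf.zeros([batch_size, lstm.state_size])

    state = lstm.zero_state(batch_size, tf.float32)   # LSTMStateTuple(c, h)
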
Owner
Taeksoo Kim