Unsupervised Learning of Video Representations using LSTMs

Code for the paper "Unsupervised Learning of Video Representations using LSTMs" by Nitish Srivastava, Elman Mansimov, and Ruslan Salakhutdinov (ICML 2015).

We use multilayer Long Short-Term Memory (LSTM) networks to learn representations of video sequences. The learned representation can be used for different tasks, such as reconstructing the input sequence, predicting the future sequence, or classification. Examples:

[Animations: mnist gif1, mnist gif2, ucf101 gif1, ucf101 gif2]

Note that the code at this link is deprecated.

Getting Started

To compile the cudamat library, set CUDA_ROOT in cudamat/Makefile to your CUDA installation path and then run make in the cudamat directory.

The libraries you need to install are:

  • h5py (HDF5 (>= 1.8.11))
  • google.protobuf (Protocol Buffers (>= 2.5.0))
  • numpy
  • matplotlib
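
As a quick sanity check, a short script along the following lines can report which of the libraries above are not importable. This is only a sketch written for Python 3 (the repository itself appears to target Python 2.7); the helper name `missing_modules` is hypothetical and not part of this repository.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    missing = []
    for name in names:
        try:
            # For dotted names, find_spec imports the parent package first, so
            # a missing parent (e.g. "google" for "google.protobuf") raises
            # ImportError instead of returning None.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    required = ["h5py", "google.protobuf", "numpy", "matplotlib"]
    print("Missing modules:", missing_modules(required) or "none")
```

An empty result only shows that the modules can be located; it does not check the version requirements listed above.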

Next, compile the .proto file by running:

protoc -I=./ --python_out=./ config.proto

Depending on the task, you will need some of the following dataset files. They can be downloaded by running:

wget http://www.cs.toronto.edu/~emansim/datasets/mnist.h5
wget http://www.cs.toronto.edu/~emansim/datasets/bouncing_mnist_test.npy
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_train_patches.npy
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_valid_patches.npy
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_train_features.h5
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_train_labels.txt
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_train_num_frames.txt
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_valid_features.h5
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_valid_labels.txt
wget http://www.cs.toronto.edu/~emansim/datasets/ucf101_sample_valid_num_frames.txt
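
If you only need the files for one task, the downloads can be grouped in a small Python 3 script. The sketch below is not part of the repository: the task names, the grouping of files by task (inferred from the file names), and the helper names `urls_for` and `download` are all assumptions.

```python
import os
import urllib.request

BASE_URL = "http://www.cs.toronto.edu/~emansim/datasets/"

# Assumption: this grouping by task is inferred from the file names only.
FILES_BY_TASK = {
    "mnist": ["mnist.h5", "bouncing_mnist_test.npy"],
    "patches": [
        "ucf101_sample_train_patches.npy",
        "ucf101_sample_valid_patches.npy",
    ],
    "features": [
        "ucf101_sample_train_features.h5",
        "ucf101_sample_train_labels.txt",
        "ucf101_sample_train_num_frames.txt",
        "ucf101_sample_valid_features.h5",
        "ucf101_sample_valid_labels.txt",
        "ucf101_sample_valid_num_frames.txt",
    ],
}


def urls_for(task):
    """Return the download URLs for one (hypothetical) task name."""
    return [BASE_URL + name for name in FILES_BY_TASK[task]]


def download(task, dest_dir="."):
    """Fetch the files for `task` into dest_dir, skipping existing ones."""
    for url in urls_for(task):
        target = os.path.join(dest_dir, os.path.basename(url))
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)


if __name__ == "__main__":
    # Only list the URLs here; call download("mnist") to actually fetch.
    for task in FILES_BY_TASK:
        print(task, urls_for(task))
```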

Note to Toronto users: You don't need to download any files, as they are available in my gobi3 repository and are already set up.

Bouncing (Moving) MNIST dataset

To train a sample model on this dataset, set the correct data_file in datasets/bouncing_mnist_valid.pbtxt and then run the command below. The last argument is the GPU board id, which you may need to change:

python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 1
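
Setting data_file can be done by hand in a text editor, but it can also be scripted. The Python 3 sketch below assumes the .pbtxt file uses standard Protocol Buffers text format, in which a string field is written as `data_file: "..."`. The helper name `set_data_file` and the example paths are hypothetical.

```python
import re


def set_data_file(pbtxt_text, new_path):
    """Replace the value of every `data_file: "..."` entry in pbtxt text."""
    pattern = re.compile(r'(data_file\s*:\s*)"[^"]*"')
    # A function replacement avoids backslash-escape surprises in new_path.
    return pattern.sub(lambda m: m.group(1) + '"' + new_path + '"', pbtxt_text)


if __name__ == "__main__":
    example = 'data_file: "/old/location/bouncing_mnist_test.npy"\n'
    print(set_data_file(example, "/data/bouncing_mnist_test.npy"), end="")
```

To apply it to a real file, read the file contents, pass them through `set_data_file`, and write the result back.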

After training the model, set the path to the trained weights in models/lstm_combo_1layer_mnist_pretrained.pbtxt. You can then visualize sample reconstructions and future predictions of the trained model by running:

python display_results.py models/lstm_combo_1layer_mnist_pretrained.pbtxt datasets/bouncing_mnist_valid.pbtxt 1

Below are sample results, where the first image is the reference sequence and the second image is the model's output. Note that the first ten frames are reconstructions, whereas the last ten frames are future predictions.

[Images: original, recon]
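
To make the frame layout of the displayed output concrete, here is a minimal Python 3 sketch that splits a 20-frame output into its reconstruction and future-prediction halves. The function name `split_output` is hypothetical, and the frames are plain integers standing in for images.

```python
def split_output(frames, num_recon=10):
    """Split model output into (reconstructed frames, predicted frames)."""
    if len(frames) <= num_recon:
        raise ValueError("output must contain more than num_recon frames")
    return frames[:num_recon], frames[num_recon:]


if __name__ == "__main__":
    frames = list(range(20))  # stand-ins for 20 output frames
    recon, future = split_output(frames)
    print(len(recon), "reconstructed frames,", len(future), "predicted frames")
```

The same slicing works on a NumPy array whose first axis is time.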

Video patches

Due to size constraints, I could only upload a small sample dataset of UCF-101 patches. The trained model overfits, so this example is meant for instructional purposes only. The setup is the same as for the Bouncing MNIST dataset.

To train the model run:

python lstm_combo.py models/lstm_combo_1layer_ucf101_patches.pbtxt datasets/ucf101_patches.pbtxt datasets/ucf101_patches_valid.pbtxt 1

To see the results run:

python display_results.py models/lstm_combo_1layer_ucf101_pretrained.pbtxt datasets/ucf101_patches_valid.pbtxt 1

[Images: original, recon]

Classification using high level representations ('percepts') of video frames

Again, as with the UCF-101 patches, I was only able to upload a very small subset of fc6 features of video frames, extracted using the VGG network. To train the classifier, run:

python lstm_classifier.py models/lstm_classifier_1layer_ucf101_features.pbtxt datasets/ucf101_features.pbtxt datasets/ucf101_features_valid.pbtxt 1

Reference

If you found this code or our paper useful, please consider citing the following paper:

@inproceedings{srivastava15_unsup_video,
  author    = {Nitish Srivastava and Elman Mansimov and Ruslan Salakhutdinov},
  title     = {Unsupervised Learning of Video Representations using {LSTM}s},
  booktitle = {ICML},
  year      = {2015}
}
Comments
  •   invalid device function{cm.CUDAMatrix.init_random(42)}  cudamat.cudamat.CUDAMatException: CUDA error: no error

    python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 0
    Using board 0
    invalid device function
    Traceback (most recent call last):
      File "lstm_combo.py", line 405, in <module>
        cm.CUDAMatrix.init_random(42)
      File "/home/weihaoxie/unsupervised-videos/cudamat/cudamat.py", line 382, in init_random
        raise generate_exception(err_code)
    cudamat.cudamat.CUDAMatException: CUDA error: no error

    opened by weihaoxie 10
  • libcudamat.so?

    Hi,

    I tried to run lstm_combo.py, but I got the error: OSError: dlopen(/Users/yantian/Google Drive/Yantian/DL/unsupervised-videos-master/cudamat/libcudamat.so, 6): image not found

    Where can I get libcudamat.so? Thank you in advance!

    opened by YantianZha 5
  • libcudamat_conv_gemm.so

    Hi Emansim,

    I tried to train a model on the MNIST dataset, but I cannot find libcudamat_conv_gemm.so.

    Can you please help me to figure out what's happening?

    Cheers,

    Chengcheng

    python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 0
    Traceback (most recent call last):
      File "lstm_combo.py", line 1, in <module>
        from data_handler import *
      File "/home/smile/project/unsupervised-videos-master/data_handler.py", line 3, in <module>
        from util import *
      File "/home/smile/project/unsupervised-videos-master/util.py", line 4, in <module>
        from cudamat import cudamat_conv_gemm as cc
      File "/home/smile/project/unsupervised-videos-master/cudamat/cudamat_conv_gemm.py", line 4, in <module>
        _ConvNet = ct.cdll.LoadLibrary('libcudamat_conv_gemm.so')
      File "/home/smile/anaconda2/lib/python2.7/ctypes/__init__.py", line 443, in LoadLibrary
        return self._dlltype(name)
      File "/home/smile/anaconda2/lib/python2.7/ctypes/__init__.py", line 365, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: libcudamat_conv_gemm.so: cannot open shared object file: No such file or directory

    opened by chengchengjia 5
  • How do you handle numerical issues?

    Instead of relying on math libraries, it looks like you implement your own kernel for forward propagation. I have heard that adding a small number to a large number can cause numerical problems. How do you handle this in your program?

    opened by liaocs2008 4
  • CUBLAS error

    Hi, I have a problem running the code. When I ran:

    python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 0

    I got this error:

    Using board 0
    2048 2048
    Traceback (most recent call last):
      File "lstm_combo.py", line 407, in <module>
        main()
      File "lstm_combo.py", line 394, in main
        lstm_autoencoder = LSTMCombo(model)
      File "lstm_combo.py", line 21, in __init__
        self.lstm_stack_dec_.Add(lstm.LSTM(l))
      File "/home/fahimeh/ProgramFiles/unsupervised-videos/lstm.py", line 21, in __init__
        self.w_dense_ = Param((4 * num_lstms, num_lstms), lstm_config.w_dense)
      File "/home/fahimeh/ProgramFiles/unsupervised-videos/util.py", line 29, in __init__
        self.dw_ = cm.empty_like(self.w_)
      File "/home/fahimeh/ProgramFiles/unsupervised-videos/cudamat/cudamat.py", line 1881, in empty_like
        cmat = empty(m.shape)
      File "/home/fahimeh/ProgramFiles/unsupervised-videos/cudamat/cudamat.py", line 1869, in empty
        raise generate_exception(err_code)
    cudamat.cudamat.CUDAMatException: CUBLAS error.

    Could you please help me to fix it? Thanks a lot.

    opened by fahimeh62 3
  • Make the file Makefile in the folder cudamat

    Hello Emansim and other friends,

    I have spent almost a week on this problem and it is driving me mad. This is a project for a Master's course at my university, and I have to reproduce results similar to those in the paper. The deadline is drawing near, but I am stuck at the very beginning of the project. I'm new to makefiles, LSTMs, and Google protobuf, so please give me a hand.

    First, here is my hardware and software configuration:

    • Windows 7 Enterprise 64-bit operating system
    • CUDA Toolkit v6.5
    • NVIDIA Quadro FX 4800 (Version 340.62)
    • Microsoft Visual Studio Ultimate 2013
    • Microsoft SDK v8.1 (Framework v4.0)
    • Python 2.7.8
    • mingw32-make (after installing mingw-w64-installer)

    Second, here are the steps I follow to run the Makefile in the cudamat folder:

    1. Open the cmd command shell.
    2. cd to the directory of the Makefile, i.e. C:\Users\david\Desktop\unsupervised-videos-master\cudamat
    3. Run the command: mingw32-make.exe -f Makefile

    The error below always occurs: [screenshot: error]

    Could anyone give me a hand? Thanks in advance.

    @emansim

    opened by DavidSUN2 2
  • CUDAMatException

    envy@ub1404:~/os_pri/github/unsupervised-videos$ python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 1
    invalid device ordinal
    Traceback (most recent call last):
      File "lstm_combo.py", line 402, in <module>
        board = LockGPU(board=board_id)
      File "/home/envy/os_pri/github/unsupervised-videos/util.py", line 144, in LockGPU
        cm.cuda_set_device(board)
      File "/home/envy/os_pri/github/unsupervised-videos/cudamat/cudamat.py", line 2273, in cuda_set_device
        raise generate_exception(err_code)
    cudamat.cudamat.CUDAMatException: CUDA error: no error
    envy@ub1404:~/os_pri/github/unsupervised-videos$

    opened by loveJasmine 2
  • How did you do Fprop?

    Dear developers,

    I am a newbie in LSTM so I read your code carefully to gain a deep understanding.

    Basically, the code for Fprop runs like:

        for each batch:
            for each LSTM:
                LSTM.Fprop()

    where batch is the "input_frame" with shape (batch_size, input_dims) and the LSTM units are stored in the list "models_".

    Let's take video patch as an example, i.e., "python lstm_combo.py models/lstm_combo_1layer_ucf101_patches.pbtxt datasets/ucf101_patches.pbtxt datasets/ucf101_patches_valid.pbtxt 1".

    If you add print statements to show their shape or length, it outputs: len(models_) = 1 input_frame= (100, 3072)

    This means there is only one LSTM object in the "models_" list. Well, if you look at Fprop of LSTM class, it allocates memory for whole network, for example, "w_input_" variable.

    My questions are:

    1. Is each LSTM object a stacked LSTM network?
    2. According to your LSTM Fprop implementation and your cuda kernel kLSTMFprop(), are you multiplying different "w" with "x" first and then in cuda kernel summing different components up?
    3. Just to make sure, "ucf101_sample_train_patches.npy" has shape (9000, 20, 3, 32, 32) where 20 means 20 frames right? What is following code doing?

        for t in xrange(self.enc_seq_length_):
            self.lstm_stack_enc_.Fprop(input_frame=self.v_.col_slice(t * self.num_dims_, (t+1) * self.num_dims_))
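
    One possible reading of the loop above, sketched in Python 3 with NumPy rather than cudamat: if each row of v_ holds one sequence with its frames flattened and laid side by side, then the column range from t * num_dims to (t+1) * num_dims selects frame t for the whole batch. The layout of v_ is an assumption here, the small batch size is chosen only for illustration, and `frame_slice` is a hypothetical stand-in for col_slice.

```python
import numpy as np

batch_size, seq_length = 4, 20
num_dims = 3 * 32 * 32  # 3072 values per frame

# Hypothetical batch in (batch, frames, channels, height, width) layout.
batch = np.zeros((batch_size, seq_length, 3, 32, 32), dtype=np.float32)
for t in range(seq_length):
    batch[:, t] = t  # mark every pixel of frame t with its frame index

# Assumed layout of v_: one row per sequence, frames laid side by side.
v = batch.reshape(batch_size, seq_length * num_dims)


def frame_slice(v, t, num_dims):
    """NumPy stand-in for col_slice: all rows, the columns of frame t."""
    return v[:, t * num_dims:(t + 1) * num_dims]


if __name__ == "__main__":
    frame = frame_slice(v, 5, num_dims)
    print(frame.shape, bool((frame == 5).all()))
```

    With a batch size of 100, the slice would have shape (100, 3072), which matches the input_frame shape reported above.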

    Any help will be appreciated. Thanks

    opened by liaocs2008 2
  • Problem about moving MNIST

    bounce-off problem

    The digit motion is in fact based on the digit patch (28×28), not on the digit itself, so the bounce-off is computed for the patch. In most cases, the digit bounces back before it reaches the edge of the outer box (64×64), because of the white padding in the digit patch. Does this matter for training? What problems could it cause?
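
    The effect can be illustrated in one dimension. The Python 3 sketch below is a simplification and not the logic in data_handler.py: it tracks only the top-left coordinate of a 28-pixel patch inside a 64-pixel frame, and the function name `step` is hypothetical. Because the coordinate is confined to the range 0 to 36, the patch border reflects at the frame edge while the digit inside the patch never touches it.

```python
CANVAS = 64  # frame width in pixels
PATCH = 28   # digit patch width in pixels
LIMIT = CANVAS - PATCH  # largest valid top-left coordinate of the patch


def step(x, v):
    """Advance patch position x by velocity v, reflecting at the borders.

    Assumes abs(v) <= LIMIT, so a single reflection per step is enough.
    """
    x += v
    if x < 0:
        x, v = -x, -v
    elif x > LIMIT:
        x, v = 2 * LIMIT - x, -v
    return x, v


if __name__ == "__main__":
    x, v = 30, 4
    for _ in range(5):
        x, v = step(x, v)
        print(x, v)
```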

    digit overlap problem

    data_handler.py, around line 417:

        def Overlap(self, a, b):
            """ Put b on top of a."""
            # S1: original code by the author; does not guarantee that b ends up on top of a
            # return np.maximum(a, b)

            # S2: digit patch overlap, consistent with the bounce-off process
            # return b

            # S3: digit overlap, but inconsistent with the bounce-off process
            return np.where(b == 0, a, b)
    

    As mentioned by the notes above,

    • S3 is truly digit overlap
    • S2 is digit patch overlap
    • S1 is something else, but it is the original code in the project
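
    The difference between the three variants shows up on a tiny example. The Python 3 sketch below uses hypothetical function names and made-up pixel intensities in one dimension; only the three expressions themselves come from the snippet above.

```python
import numpy as np


def overlap_max(a, b):
    """S1: pixel-wise maximum of the two images."""
    return np.maximum(a, b)


def overlap_patch(a, b):
    """S2: the whole patch b replaces a, including b's empty pixels."""
    return b


def overlap_digit(a, b):
    """S3: b replaces a only where b is non-zero."""
    return np.where(b == 0, a, b)


if __name__ == "__main__":
    a = np.array([0.0, 0.9, 0.9, 0.0])  # made-up intensities of digit a
    b = np.array([0.0, 0.0, 0.4, 0.4])  # made-up intensities of digit b
    for f in (overlap_max, overlap_patch, overlap_digit):
        print(f.__name__, f(a, b))
```

    In the third pixel, where both digits have ink, S1 keeps the brighter value from a, while S2 and S3 take the value from b. In the second pixel, where only a has ink, S2 erases it and the other two keep it.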

    Is that a problem?

    opened by springzfx 1
  • ImportError: No module named config_pb2,    what's the package name?

    envy@ub1404:~/os_pri/github/unsupervised-videos$ python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 1
    Traceback (most recent call last):
      File "lstm_combo.py", line 1, in <module>
        from data_handler import *
      File "/home/envy/os_pri/github/unsupervised-videos/data_handler.py", line 3, in <module>
        from util import *
      File "/home/envy/os_pri/github/unsupervised-videos/util.py", line 16, in <module>
        import config_pb2
    ImportError: No module named config_pb2
    envy@ub1404:~/os_pri/github/unsupervised-videos$

    opened by loveJasmine 1
  • cudamat.cudamat.CUDAMatException: CUDA error: no error

    I am using an Amazon Machine Image (AMI) to launch a GPU instance on Amazon EC2 with CUDA 7.5 and the required libraries installed. When I type make in your cudamat directory, everything appears to work. I am able to compile the .proto file and download the datasets. However, when I attempt to train a model, I get this error:

    ubuntu@ip-172-31-9-162:~/unsupervised-videos-master$ python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 1
    invalid device ordinal
    Traceback (most recent call last):
      File "lstm_combo.py", line 402, in <module>
        board = LockGPU(board=board_id)
      File "/home/ubuntu/unsupervised-videos-master/util.py", line 144, in LockGPU
        cm.cuda_set_device(board)
      File "/home/ubuntu/unsupervised-videos-master/cudamat/cudamat.py", line 2273, in cuda_set_device
        raise generate_exception(err_code)
    cudamat.cudamat.CUDAMatException: CUDA error: no error

    Is your cudamat setup compatible with CUDA 7.5? Is it necessary to use your custom directory?

    Can you help me find the source of this error? I'd appreciate it very much.

    Best, Tina

    opened by tinarwhite 1
  • Training with new dataset

    Hi, thank you very much for providing the code for the paper. I have a question about using your code to predict video frames on another video dataset. Could you share some insight on how to do that? Also, does the .npy file contain the frames of all videos in the dataset as a 5-D array in (video_number, frame_no, channels, height, width) format? Thank you very much.

    opened by shashankvkt 1
  • Extrapolating matrices

    Instead of videos, how do we test with ordinary matrices? For example, if I give a 5x5 matrix, is there a way to get, say, an 8x8 matrix (with a model trained on other 5x5 matrices)?

    opened by Mrinal18 0
  • Error while giving the command for training

    Hello, when I run

    python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 1

    I get the following error:

    OSError: /nas/ei/home/ga85pav/unsupervised-videos/cudamat/libcudamat.so: cannot open shared object file: No such file or directory

    I don't see libcudamat.so in the cudamat folder, although the Makefile mentions it. Am I missing something? Could someone help me with this?

    Also, is it possible to skip training? Are pre-trained models available anywhere for the UCF or MNIST datasets?

    Thanks!!

    opened by Nd-sole 6
  • no kernel image is available for execution on the device

    When I run python lstm_combo.py models/lstm_combo_1layer_mnist.pbtxt datasets/bouncing_mnist.pbtxt datasets/bouncing_mnist_valid.pbtxt 1, I get the error "no kernel image is available for execution on the device". My environment is Ubuntu 16.10, 1080 Ti, CUDA 9.1, Anaconda2, Python 2.7.

    opened by smallflyingpig 2
  • Questions about LSTM_classifier

    1. The authors state in their paper: "We initialize an LSTM classifier with the weights learned by the encoder LSTM from this model." I am a beginner and don't understand how to initialize the LSTM classifier this way.

    2. Figure 6 shows that the LSTM classifier has two layers. Could you tell me how to implement that?

    Can anyone help me with this? Thanks a lot.

    @emansim

    opened by jianghaojun 1
Owner
Elman Mansimov
Applied Scientist @amazon-research