Code and training data for our ECCV 2016 paper on Unsupervised Learning

Ishan Misra

Last update: Dec 8, 2021

Shuffle and Learn (Shuffle Tuple)

Created by Ishan Misra

Based on the ECCV 2016 paper "Shuffle and Learn: Unsupervised Learning using Temporal Order Verification" (link to paper).

This codebase contains the model and training data from our paper.

Introduction

Our codebase is a mix of Python and C++ and uses the Caffe framework. Design decisions and some code are derived from the Fast-RCNN codebase by Ross Girshick.

Citing

If you find our code useful in your research, please consider citing:

@inproceedings{misra2016unsupervised,
  title={{Shuffle and Learn: Unsupervised Learning using Temporal Order Verification}},
  author={Misra, Ishan and Zitnick, C. Lawrence and Hebert, Martial},
  booktitle={ECCV},
  year={2016}
}

Benchmark Results

We summarize the results of finetuning our method here (details in the paper).

Action Recognition

| Dataset | Accuracy (split 1) | Accuracy (mean over splits) |
| :--- | :--- | :--- |
| UCF101 | 50.9 | 50.2 |
| HMDB51 | 19.8 | 18.1 |

Pascal Action Classification (VOC2012): Coming soon

Pose estimation

FLIC: PCK (Mean, AUC): 84.7, 49.6
MPII: PCKh@0.5 (Upper, Full, AUC): 87.7, 85.8, 47.6

Object Detection

PASCAL VOC2007 test mAP of 42.4% using Fast RCNN.

We initialize conv1-5 using our unsupervised pre-training. We initialize fc6-8 randomly. We then follow the procedure from Krahenbuhl et al., 2016 to rescale our network and finetune all layers using their hyperparameters.

Surface Normal Prediction

NYUv2 (Coming soon)

Contents

Requirements: software
Models and Training Data
Usage
Utils

Requirements: software

Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers and OpenCV.

# In your Makefile.config, make sure these lines are uncommented
WITH_PYTHON_LAYER := 1
USE_OPENCV := 1
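
Before building, it can help to confirm that both settings are active rather than commented out. The following is a minimal sketch of such a check; the helper name `missing_flags` and its parsing rules are assumptions made for illustration and are not part of Caffe or this repo. It only looks for the two flags named above.

```python
import re

# The two flags this README asks for, with the values shown above.
REQUIRED_FLAGS = {"WITH_PYTHON_LAYER": "1", "USE_OPENCV": "1"}


def missing_flags(config_text, required=REQUIRED_FLAGS):
    """Return the required flags that no uncommented line sets to the expected value."""
    found = {}
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and commented-out settings
        match = re.match(r"^(\w+)\s*:?=\s*(\S+)", stripped)
        if match:
            found[match.group(1)] = match.group(2)
    return sorted(name for name, value in required.items()
                  if found.get(name) != value)
```

Calling `missing_flags(open("Makefile.config").read())` returns an empty list when both lines are uncommented, and the names of any flags that still need attention otherwise.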

You can download a compatible fork of Caffe from here. Since our model requires Batch Normalization, you will need a fairly recent fork of Caffe.

Models and Training Data

Our model trained on tuples from UCF101 (train split 1, without using action labels) can be downloaded here.
The tuples used for training our model can be downloaded as a zipped text file here. Each line of the file train01_image_keys.txt defines a tuple of three frames. The corresponding file train01_image_labs.txt has a binary label indicating whether the tuple is in the correct or incorrect order.
Using the training tuples requires you to have the raw videos from the UCF101 dataset (link to videos). We extract frames from the videos and resize them such that the max dimension is 340 pixels. You can use ffmpeg to extract the frames. Example command: ffmpeg -i <video_name> -qscale 1 -f image2 <video_sub_name>/<video_sub_name>_%06d.jpg, where video_sub_name is the name of the raw video without the file extension.
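
The frame preparation step above can be sketched in Python. This is an illustrative helper, not code from this repo: the function names `resized_dims` and `ffmpeg_extract_command` and the `out_root` parameter are assumptions. The command mirrors the ffmpeg example above; whether resizing happens inside ffmpeg or in a separate pass is not specified here, so the sketch only computes the target size.

```python
import os


def resized_dims(width, height, max_dim=340):
    """Scale (width, height) so the larger side equals max_dim, keeping the aspect ratio."""
    scale = max_dim / float(max(width, height))
    return max(1, int(round(width * scale))), max(1, int(round(height * scale)))


def ffmpeg_extract_command(video_path, out_root="."):
    """Build the argument list for the ffmpeg example command in this README."""
    # video_sub_name is the raw video name without its file extension.
    video_sub_name = os.path.splitext(os.path.basename(video_path))[0]
    pattern = os.path.join(out_root, video_sub_name, video_sub_name + "_%06d.jpg")
    return ["ffmpeg", "-i", video_path, "-qscale", "1", "-f", "image2", pattern]
```

The returned list can be passed to `subprocess.check_call` once the per-video output directory exists (for example, created with `os.makedirs`).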

Usage

Once you have downloaded and formatted the UCF101 videos, you can use the networks/tuple_train.prototxt file to train your network. The only complicated part of the network definition is the data layer, which reads a tuple and a label. The data layer source file is in the python_layers subdirectory. Make sure to add this subdirectory to your PYTHONPATH.
Training for Action Recognition: We used the codebase from here.
Training for Pose Estimation: We used the codebase from here. Since this code does not use Caffe for training a network, I have included an experimental data layer for Caffe in python_layers/pose_data_layer.py.
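
To inspect what the tuple data layer consumes, the keys and labels files can be paired line by line. The sketch below is not the repo's data layer. It assumes each line of train01_image_keys.txt holds three whitespace-separated frame keys and each line of train01_image_labs.txt holds one integer label; the exact file layout and which label value marks the correct order are not stated here, so check the downloaded files first. The function names are hypothetical.

```python
from collections import Counter


def pair_tuples(key_lines, label_lines):
    """Pair each three-frame tuple with its binary order label.

    Both arguments are iterables of text lines, e.g. open file objects.
    """
    keys = [line.split() for line in key_lines if line.strip()]
    labels = [int(line.strip()) for line in label_lines if line.strip()]
    if len(keys) != len(labels):
        raise ValueError("keys and labels files have different lengths")
    for frame_keys in keys:
        if len(frame_keys) != 3:  # assumption: three keys per line
            raise ValueError("expected three frame keys per line, got %d" % len(frame_keys))
    return list(zip(keys, labels))


def label_counts(samples):
    """Count how many tuples carry each label value."""
    return Counter(label for _, label in samples)
```

With the real files, this would be used as `with open("train01_image_keys.txt") as k, open("train01_image_labs.txt") as l: samples = pair_tuples(k, l)`, and `label_counts(samples)` then gives the balance between the two label values.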

Utils

This repo also includes a bunch of utilities I used for training and debugging my models.

python_layers/loss_tracking_layer: This layer tracks loss of each individual data point and its class label. This is useful for debugging as one can see the loss per class across epochs. Thanks to Abhinav Shrivastava for discussions on this.
model_training_utils: This is the wrapper code used to train the network if one wants to use the loss_tracking layer. These utilities not only track the loss, but also keep a log of various other statistics of the network: weights of the layers, norms of the weights, magnitude of change, etc. For an example of how to use this, check networks/tuple_exp.py. Thanks to Carl Doersch for discussions on this.
python_layers/multiple_image_multiple_label_data_layer: This is a fairly generic data layer that can read multiple images and data. It is based off my data layers repo.
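
The idea behind per-class loss tracking can be illustrated in plain Python. This is a minimal sketch and not the implementation in python_layers/loss_tracking_layer: the class name `ClassLossTracker` and its methods are hypothetical, and it assumes per-sample losses and labels are already available as Python sequences.

```python
from collections import defaultdict


class ClassLossTracker(object):
    """Accumulate per-sample losses and report the mean loss per class per epoch."""

    def __init__(self):
        # epoch -> class label -> list of per-sample losses
        self._losses = defaultdict(lambda: defaultdict(list))

    def update(self, epoch, labels, losses):
        """Record one minibatch of per-sample losses with their class labels."""
        if len(labels) != len(losses):
            raise ValueError("labels and losses must have the same length")
        for label, loss in zip(labels, losses):
            self._losses[epoch][label].append(float(loss))

    def mean_per_class(self, epoch):
        """Return {class label: mean loss} for the given epoch (empty if unseen)."""
        per_class = self._losses.get(epoch, {})
        return {label: sum(vals) / len(vals) for label, vals in per_class.items()}
```

Comparing `mean_per_class(epoch)` across epochs shows whether one class (for example, incorrectly ordered tuples) is being learned more slowly than the other.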

Comments

Finetune dataset

Hello, Imisra. In your paper, you said that the pretraining dataset is training split 1 of UCF101 and the fine-tuning dataset is test split 1 of UCF101. Could you please tell me whether you use the labels of test split 1 of UCF101 to fine-tune? If not, what is the ground truth? The tuple verification? Thanks a lot!

opened by laura-wang

Where is class ratio?

I cannot find where you define the class ratio pos : neg = 1 : 3. Watching the training output, the data layer seems to step through the lab.txt files in order without shuffling, which gives only 3-5 positive samples per minibatch of 32.

opened by zhujiagang
