Learning Correspondence from the Cycle-consistency of Time (CVPR 2019)



Code for Learning Correspondence from the Cycle-consistency of Time (CVPR 2019, Oral). The code is built on PyTorch and was developed with PyTorch 0.4 and Python 2; it also runs with PyTorch 1.0. This repo includes the training code for learning semi-dense correspondence from unlabeled videos and the testing code for applying this correspondence to segmentation mask tracking in videos.


If you use our code in your research or wish to refer to the baseline results, please use the following BibTeX entry.

    Author = {Xiaolong Wang and Allan Jabri and Alexei A. Efros},
    Title = {Learning Correspondence from the Cycle-Consistency of Time},
    Booktitle = {CVPR},
    Year = {2019},

Model and Result

Our trained model can be downloaded from here. The tracking performance on DAVIS-2017 for this model (without training on DAVIS-2017) is:

    cropSize     J_mean   J_recall   J_decay   F_mean   F_recall   F_decay
    320 x 320    0.419    0.409      0.272     0.394    0.336      0.328
    400 x 400    0.430    0.437      0.296     0.426    0.413      0.356
    480 x 480    0.464    0.500      0.332     0.500    0.480      0.379

Note that the results improve at test time when the input image size "cropSize" in the script is increased, as the table shows. The training and testing procedures for this model are described below.

Converting Our Model to Standard PyTorch ResNet-50

Please see convert_model.ipynb for converting our model to the standard PyTorch ResNet-50 model format.

Dataset Preparation

Please read DATASET.md for instructions on downloading and preparing the VLOG dataset for training and the DAVIS dataset for testing.


Training

Set the input list in train_video_cycle_simple.py in the home folder to:

    params['filelist'] = 'YOUR_DATASET_FOLDER/vlog_frames_12fps.txt'

Then run the following code:

    python train_video_cycle_simple.py --checkpoint pytorch_checkpoints/release_model_simple


Testing

Set the input list in test_davis.py in the home folder to:

    params['filelist'] = 'YOUR_DATASET_FOLDER/davis/DAVIS/vallist.txt'

Set the dataset path YOUR_DATASET_FOLDER in run_test.sh. Then run the testing and evaluation code together:

    sh run_test.sh


Acknowledgements

weakalign by Ignacio Rocco, Relja Arandjelović and Josef Sivic.

inflated_convnets_pytorch by Yana Hasson.

pytorch-classification by Wei Yang.

  • Error in class VlogSet

    The default videoLen is set to 4 in the config, but some videos have fewer than 4 frames.

    Within the file models/dataset/vlog_train.py, line 124 tries to read at least 4 frames for each video and the dataloader crashes for the case where the number of frames is less than 4.

    Crash at line 132 while reading the image

    img = load_image(img_path)


      File "/beegfs/ahj265/self_supervised_tracking/models/dataset/vlog_train.py", line 175, in __getitem__
        img = load_image(img_path)  # CxHxW
      File "/beegfs/ahj265/self_supervised_tracking/utils/imutils2.py", line 23, in load_image
        img = img.astype(np.float32)
    AttributeError: 'NoneType' object has no attribute 'astype'

    Is there a preprocessing step I am missing where you filter out such videos?

    opened by ananyahjha93 7
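One way to sidestep the crash reported above is to drop short videos from the training list before the dataloader sees them. The sketch below is not part of the repo. It assumes, hypothetically, that each entry of the file list ends with an integer frame count (for example "some/frame/folder 37"), which should be checked against the actual vlog_frames_12fps.txt; the function name filter_short_videos and the sample paths are made up for illustration.

```python
def filter_short_videos(lines, min_frames=4):
    """Keep only entries whose trailing frame count is at least min_frames.

    ASSUMPTION: each entry looks like "<frame_folder> <frame_count>".
    Entries that do not parse that way are dropped as well.
    """
    kept = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed entry
        try:
            frame_count = int(parts[-1])
        except ValueError:
            continue  # last field is not a frame count
        if frame_count >= min_frames:
            kept.append(line.strip())
    return kept


# Hypothetical entries, used only to exercise the function.
example = [
    "vlog/a/clip_001 120",
    "vlog/b/clip_002 3",
    "",
    "vlog/c/clip_003 4",
]
filtered = filter_short_videos(example, min_frames=4)
print(filtered)
```

Filtering on the recorded count only helps if that count matches the frames actually on disk; a missing or unreadable image would still make load_image fail in the way the traceback shows.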
  • In test_davis.py line 459

    Should "hid = ids / width_dim" be "hid = ids // width_dim"? Otherwise hid is not an int (under Python 3), and it cannot be used as an index.

    opened by Talegqz 5
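The difference raised in this issue can be reproduced in isolation. The sketch below is a minimal illustration rather than the code from test_davis.py: ids and width_dim follow the names quoted in the issue, while wid, flat_to_grid and the toy grid are assumptions added for the example.

```python
def flat_to_grid(ids, width_dim):
    """Split a flat index into (row, column) for a grid with width_dim columns."""
    hid = ids // width_dim  # floor division yields an int on Python 2 and 3
    wid = ids % width_dim   # column within the row (hypothetical name)
    return hid, wid


# Toy 4 x 5 grid whose cell values equal their flat index.
grid = [[r * 5 + c for c in range(5)] for r in range(4)]
hid, wid = flat_to_grid(13, 5)
print(hid, wid, grid[hid][wid])

# On Python 3, "/" is true division and returns a float, which cannot index a list.
print(type(13 / 5))
```

Using "//" keeps the behaviour identical across Python 2 and Python 3, which matters here because the repo was developed with Python 2.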
  • Weird loss progression

    Since I am training the model on VLOG with a very small batch size, the training is going to take forever (8 days). And because I don't want to wait that long, I'll stop the training before 30 epochs. But the losses shown in the logs seem odd to me. Can someone provide me the log of a complete training so I can compare the losses and see if my early results are normal or not? Thanks

    Learning Rate	Train Loss	Theta Loss	Theta Skip Loss	
    0.000200	-0.002401	0.366067	0.331109	
    0.000200	-0.002381	0.369635	0.328924	
    0.000200	-0.001740	0.402181	0.374113	
    0.000200	-0.001929	0.378956	0.342752
    opened by RaphaelRoyerRivard 4
  • How to reduce the GPU memory needs?

    During the first epoch, I get the following out of memory error

    Traceback (most recent call last):                                                                                                                            
      File "train_video_cycle_simple.py", line 352, in <module>                                                                                                   
      File "train_video_cycle_simple.py", line 232, in main                                                                                                       
        train_loss, theta_loss, theta_skip_loss = train(train_loader, model, criterion, optimizer, epoch, use_cuda, args)                                         
      File "train_video_cycle_simple.py", line 290, in train                                                                                                      
        outputs = model(imgs, patch2, img, theta)                                                                                                                 
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__                                                
        result = self.forward(*input, **kwargs)                                                                                                                   
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\parallel\data_parallel.py", line 150, in forward                                         
        return self.module(*inputs[0], **kwargs[0])                                                                                                               
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__                                                
        result = self.forward(*input, **kwargs)                                                                                                                   
      File "C:\Users\root\Projects\TimeCycle\models\videos\model_simple.py", line 203, in forward                                                                 
        r50_feat1, r50_feat1_pre, r50_feat1_norm = self.forward_base(videoclip1)                                                                                  
      File "C:\Users\root\Projects\TimeCycle\models\videos\model_simple.py", line 164, in forward_base                                                            
        x_pre = self.encoderVideo(x)                                                                                                                              
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__                                                
        result = self.forward(*input, **kwargs)                                                                                                                   
      File "C:\Users\root\Projects\TimeCycle\models\videos\inflated_resnet.py", line 35, in forward                                                               
        x = self.layer1(x)                                                                                                                                        
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__                                                
        result = self.forward(*input, **kwargs)                                                                                                                   
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\container.py", line 92, in forward                                               
        input = module(input)                                                                                                                                     
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__                                                
        result = self.forward(*input, **kwargs)                                                                                                                   
      File "C:\Users\root\Projects\TimeCycle\models\videos\inflated_resnet.py", line 95, in forward                                                               
        out = self.conv3(out)                                                                                                                                     
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__                                                
        result = self.forward(*input, **kwargs)                                                                                                                   
      File "C:\Logiciels\Anaconda3\envs\torch\lib\site-packages\torch\nn\modules\conv.py", line 476, in forward                                                   
        self.padding, self.dilation, self.groups)                                                                                                                 
    RuntimeError: CUDA out of memory. Tried to allocate 508.00 MiB (GPU 0; 8.00 GiB total capacity; 5.63 GiB already allocated; 362.97 MiB free; 41.09 MiB cached)
    > c:\logiciels\anaconda3\envs\torch\lib\site-packages\torch\nn\modules\conv.py(476)forward()                                                                  
    -> self.padding, self.dilation, self.groups)                                                                                                                  

    The settings used are the default ones

    batchSize: 36                   
    temperature: 0.04419417382415922
    gridSize: 9                     
    classNum: 49                    
    videoLen: 4                     
    self.T: 0.04419417382415922     
        Total params: 26.01M        
    weight_decay: 0.0               
    beta1: 0.5                      

    What do I need to change to reduce GPU memory usage?

    opened by RaphaelRoyerRivard 3
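As a rough guide for the question above, activation memory in a convolutional network grows approximately in proportion to the batch size, the number of frames per clip, and the spatial area of each frame. The sketch below is a back-of-the-envelope comparison under that assumption, not a measurement of this model. The helper relative_activation_cost and the crop size of 240 are hypothetical; the batch size of 36 and videoLen of 4 come from the settings quoted in the issue.

```python
def relative_activation_cost(batch_size, video_len, crop_size):
    """Rough proxy: number of input pixels processed per training step."""
    return batch_size * video_len * crop_size * crop_size


CROP = 240  # hypothetical crop size, used only for the comparison
baseline = relative_activation_cost(36, 4, CROP)

for candidate in (36, 18, 8):
    ratio = relative_activation_cost(candidate, 4, CROP) / float(baseline)
    print("batchSize=%d -> about %.2fx the baseline activation memory" % (candidate, ratio))
```

Weights and optimizer state do not shrink with the batch size, so the real saving is somewhat smaller than the ratio suggests. Lowering the batch size may also change training behaviour, which this sketch does not address.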
  • How much GPU memory does it need?

    Hi, when I run test_davis.py it always reports CUDA out of memory. My GPU has 6 GB; is that not enough? Even with batch_size=1 it still reports CUDA out of memory. Why?

    Also, it would be great if you could add a demo for running tracking on a general video. Thank you!

    opened by dongfangduoshou123 3
  • Reproducing DeepCluster DAVIS Evaluation Results

    Thank you for providing this repo! I have had some trouble reproducing the exact DeepCluster performance numbers on DAVIS-2017 from your paper. Could you confirm whether the network you used is the VGG16-PyTorch pretrained model from the DeepCluster repo (https://github.com/facebookresearch/deepcluster)?

    In addition, during evaluation do you extract the feature map directly before maxpool-4 in the VGG16 model, or which feature map output do you use from the pretrained model?


    opened by dmckee5 2
  • out of memory

    I am trying to test on the DAVIS dataset with two GPUs (11 GB), but I get an error. Please help me solve it, thanks.

    batchSize: 1
    temperature: 1.0
    gridSize: 9
    classNum: 49
    videoLen: 8
    cropSize: 320
    cropSize2: 80
    0,1,2,3
    False
    self.T: 0.04419417382415922
    Total params: 26.01M
    ==> Resuming from checkpoint..

    Evaluation only
    gridx: 4 gridy: 4
    total_frame_num: 77
    (77, 320, 320, 3)
    [array([0, 0, 0], dtype=uint8), array([ 0, 128, 0], dtype=uint8), array([128, 0, 0], dtype=uint8)]
    [85088, 10181, 7129]
    20.661283493041992
    relabel
    0.456728458404541
    label 0
    Traceback (most recent call last):
      File "test_davis.py", line 458, in <module>
        test_loss = test(val_loader, model, 1, use_cuda)
      File "test_davis.py", line 238, in test
        corrfeat2_now = model(imgs_tensor, target_tensor)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 119, in forward
        inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 130, in scatter
        return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 35, in scatter_kwargs
        inputs = scatter(inputs, target_gpus, dim) if inputs else []
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 28, in scatter
        return scatter_map(inputs)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map
        return list(zip(*map(scatter_map, obj)))
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 13, in scatter_map
        return Scatter.apply(target_gpus, None, dim, obj)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 87, in forward
        outputs = comm.scatter(input, ctx.target_gpus, ctx.chunk_sizes, ctx.dim, streams)
      File "/opt/conda/lib/python3.6/site-packages/torch/cuda/comm.py", line 142, in scatter
        return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
    RuntimeError: CUDA error: out of memory (allocate at /opt/conda/conda-bld/pytorch_1532579805626/work/aten/src/THC/THCCachingAllocator.cpp:510)

    opened by Lywzz 2
  • Error while converting checkpoint_14.pth.tar model

    The following error is hit while converting the released model (checkpoint_14.pth.tar) to resnet50.

    KeyError                                  Traceback (most recent call last)
    <ipython-input-3-6df704e20364> in <module>()
         15     kk = k.replace('module.encoderVideo.', '')
         16     tmp = model_state[k]
    ---> 17     if net_state[kk].shape != model_state[k].shape and net_state[kk].dim() == 4 and model_state[k].dim() == 5:
         18         tmp = model_state[k].squeeze(2)
         19     net_state[kk][:] = tmp[:]
    KeyError: 'conv1.weight'

    Do I need a specific version of pytorch to convert the model?

    opened by lathampratti 2
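The KeyError above says that the key 'conv1.weight', obtained after stripping the 'module.encoderVideo.' prefix, is not present in the target state dict. A quick way to narrow this down is to list which stripped keys have a counterpart and which do not before copying any weights. The sketch below uses plain dictionaries as stand-ins for real state dicts so that it runs without PyTorch; split_by_match, the placeholder values and the key 'module.someOtherHead.weight' are hypothetical and not taken from the notebook.

```python
PREFIX = "module.encoderVideo."  # prefix taken from the traceback above


def split_by_match(model_state, net_state, prefix=PREFIX):
    """Return (matched, unmatched) target keys after stripping the prefix."""
    matched, unmatched = [], []
    for key in model_state:
        if not key.startswith(prefix):
            continue  # not part of the video encoder, nothing to copy
        target_key = key[len(prefix):]
        if target_key in net_state:
            matched.append(target_key)
        else:
            unmatched.append(target_key)
    return matched, unmatched


# Plain dicts stand in for real state dicts; the values are placeholders.
model_state = {
    "module.encoderVideo.conv1.weight": None,
    "module.encoderVideo.bn1.weight": None,
    "module.someOtherHead.weight": None,  # hypothetical non-encoder key
}
net_state = {"bn1.weight": None}  # deliberately missing conv1.weight

matched, unmatched = split_by_match(model_state, net_state)
print("matched:", matched)
print("unmatched:", unmatched)
```

If the unmatched list is non-empty for a real checkpoint, the target network was probably built with different layer names than the notebook expects; whether that is caused by the PyTorch or torchvision version is not something the source confirms.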
  • Missing file 0_9_mask.png

    When I tried to test the pre-trained model on the DAVIS dataset using sh run_test.sh, it could not find the file 0_9_mask.png. It seems that this file is included in neither the DAVIS dataset nor this repo.

    opened by GehenHe 1
  • How to generate the 'davis/DAVIS/ImageSets/2017/val.txt' file?

    I'm trying to test my trained network on Davis, but the test_davis.py script wants to read the file davis/DAVIS/ImageSets/2017/val.txt which doesn't exist.

    I downloaded the dataset manually from the website (https://davischallenge.org/davis2017/code.html) and got the Annotations, ImageSets and JPEGImages folders, but the only file in the ImageSets/2017 folder is test-dev.txt.

    I can't find any info on how to generate that file. Can someone help me?

    opened by RaphaelRoyerRivard 1
  • AffineGridGenV3


    What is the reason for AffineGridGenV3? It seems that the only change between it and V2 is that you moved the creation of the grids from the initialization to the forward pass and made them Tensors instead of FloatTensors. What was the reasoning behind these changes?

    opened by cinjon 1
  • evaluating with pretrained_imagenet gives lower J_mean and F_mean than reported in paper

    I tried evaluating with the ImageNet pre-trained ResNet-50 model by removing the --resume flag and passing the --pretrained-imagenet flag when running test_davis.py. I got a J_mean of 45.2 and an F_mean of 48.8, which are lower than the values reported in Table 1 of the paper (50.3 and 49.0). How can I reproduce these numbers?

    opened by ruppeshnalwaya1993 0
  • why return batch2d in inflate_batch_norm()?

    Thanks for your appealing work. Where you inflate the ResNet2d into a ResNet3d, I noticed that the inflate_batch_norm() function returns the batch2d object. Shouldn't it return batch3d?

    opened by LiUzHiAn 0
  • missing davis/DAVIS/vallist.txt

    I am trying to test my trained network, but the script can't seem to find vallist.txt at the location it expects it to be (DATASET_FOLDER/davis/DAVIS/vallist.txt).

    I downloaded Test-Dev 2017 and Test-Challenge 2017 and couldn't find the file in either of them. Which of the two are we expected to use for testing (to get the same results as in the paper)?

    Any ideas? Thanks!

    opened by arjung128 1
  • Out-of-bounds indexing while training, parameter explanations

    Hi, I am in the process of training on a custom dataset. I have 12 videos, each with 250 jpeg images and the appropriate .txt file for the dataset which specifies the paths to the dataset during training. I am running into the same issue from a closed issue:

    File "models/dataset/vlog_train.py", line 175, in __getitem__
      img = load_image(img_path)  # CxHxW

    The path to the image is trying to index 000250.jpg which is out of bounds (since there are only 250 images, 0-indexed). I think this has something to do with what the parameters videoLen and frame_gap are for the dataset. I see that fnums is the # of jpeg images for the given folder, so what are the videoLen and frame_gap parameters used for?

    In models/dataset/vlog_train.py, there is also a line:

    current_len = (self.videoLen + self.predDistance) * frame_gap

    and later on a check that says:

    if fnum >= current_len:
        diffnum = fnum - current_len
        startframe = random.randint(0, diffnum)
        future_idx = startframe + current_len - 1

    What do videoLen, predDistance, and frame_gap represent in this context?

    opened by priyasundaresan 0
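The two snippets quoted in this issue can be combined into a small standalone sketch to see which frame indices they can produce. This is an interpretation of the quoted lines, not the loader from vlog_train.py: it assumes that a clip consists of videoLen + predDistance frames spaced frame_gap apart, and the function sample_clip_indices and the way the per-frame indices are derived are hypothetical.

```python
import random


def sample_clip_indices(fnum, video_len, pred_distance, frame_gap, rng=random):
    """Return (indices, future_idx) as 0-based frame numbers, or None if too short."""
    current_len = (video_len + pred_distance) * frame_gap
    if fnum < current_len:
        return None  # not enough frames for one clip
    diffnum = fnum - current_len
    startframe = rng.randint(0, diffnum)  # randint is inclusive on both ends
    future_idx = startframe + current_len - 1
    # ASSUMPTION: frames are taken every frame_gap steps from startframe.
    indices = [startframe + i * frame_gap for i in range(video_len + pred_distance)]
    return indices, future_idx


rng = random.Random(0)
worst_future = max(sample_clip_indices(250, 4, 0, 2, rng)[1] for _ in range(1000))
print("largest future_idx seen for fnum=250:", worst_future)
```

Under the quoted formula future_idx never exceeds fnum - 1, so with 250 frames the largest 0-based index is 249. A request for 000250.jpg would therefore be consistent with a 1-based file naming convention somewhere in the loader, but that is a guess rather than something the source confirms.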
  • Questions about the training from scratch

    Hi. I used the provided code to train TimeCycle on some other video datasets. Fine-tuning the network from the provided checkpoint_14.pth.tar works fine, but when I train the network from scratch, neither the inlier loss nor the theta loss decreases. Are there any tips for training TimeCycle from scratch?

    opened by gonglixue 9
  • transform_trans_out


    I was looking at the transform_trans_out function in model_simple.py and noticed that the 2D rotation matrix is multiplied by 1/3. Could you please explain the reason for that? Shouldn't the 2D matrix be correct as is?


    opened by AbdallaGomaa 1
Xiaolong Wang
Assistant Professor, UC San Diego