Official code of paper: MovingFashion: a Benchmark for the Video-to-Shop Challenge

Overview


SEAM Match-RCNN

Official code of MovingFashion: a Benchmark for the Video-to-Shop Challenge paper


Installation

Requirements:

  • PyTorch 1.5.1 or newer, with cudatoolkit (10.2)
  • torchvision
  • tensorboard
  • cocoapi
  • OpenCV Python
  • tqdm
  • cython
  • CUDA >= 10

Step-by-step installation

# first, make sure that your conda is set up properly with the right environment
# for that, check that `which conda`, `which pip` and `which python` point to the
# right paths. From a clean conda env, this is what you need to do

conda create --name seam -y python=3
conda activate seam

pip install cython tqdm opencv-python

# follow PyTorch installation in https://pytorch.org/get-started/locally/
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

conda install tensorboard

export INSTALL_DIR=$PWD

# install pycocotools
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install

# download SEAM
cd $INSTALL_DIR
git clone https://github.com/VIPS4/SEAM-Match-RCNN.git
cd SEAM-Match-RCNN
mkdir data
mkdir ckpt

unset INSTALL_DIR
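
After installation, you can quickly verify that the environment matches the requirements above (a minimal sketch; run it from inside the seam environment):

# quick sanity check of the installed environment
import torch
import torchvision

print("PyTorch:", torch.__version__)           # expected 1.5.1 or newer
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CUDA toolkit:", torch.version.cuda)     # expected >= 10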

Dataset

SEAM Match-RCNN has been trained and tested on the MovingFashion and DeepFashion2 datasets. Follow the instructions below to download and extract them.

We suggest downloading the datasets into the data folder.

MovingFashion

The MovingFashion dataset is available for academic purposes here.

Deepfashion2

The DeepFashion2 dataset is available here. You need to fill in the form to get the password for unzipping the files.

Once the dataset is extracted, use the provided DeepFtoCoco.py script to convert the annotations to COCO format, specifying the dataset path.

python DeepFtoCoco.py --path <dataset_root>
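
To verify that the conversion produced valid COCO-style annotations, you can load the resulting JSON with the cocoapi installed earlier (a minimal sketch; the file name train_coco.json is only a placeholder, use the file actually written by DeepFtoCoco.py):

# sanity-check a converted annotation file with pycocotools
from pycocotools.coco import COCO

coco = COCO("<dataset_root>/train_coco.json")  # placeholder name, check the script output
print("images:", len(coco.getImgIds()))
print("annotations:", len(coco.getAnnIds()))
print("categories:", [c["name"] for c in coco.loadCats(coco.getCatIds())])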

Training

We provide scripts to train both Match-RCNN and SEAM Match-RCNN. Check the scripts for all available parameters.

Single GPU

#training of Match-RCNN
python train_matchrcnn.py --root_train <path_of_images_folder> --train_annots <json_path> --save_path <save_path> 

#training on movingfashion
python train_movingfashion.py --root <path_of_dataset_root> --train_annots <json_path> --test_annots <json_path> --pretrained_path <path_of_matchrcnn_model>


#training on multi-deepfashion2
python train_multiDF2.py --root <path_of_dataset_root> --train_annots <json_path> --test_annots <json_path> --pretrained_path <path_of_matchrcnn_model>

Multi GPU

We internally use torch.distributed.launch to run multi-GPU training. This PyTorch utility spawns as many Python processes as the number of GPUs to be used, and each process drives a single GPU (see the sketch after the commands below).

#training of Match-RCNN
python -m torch.distributed.launch --nproc_per_node=<NUM_GPUS> train_matchrcnn.py --root_train <path_of_images_folder> --train_annots <json_path> --save_path <save_path>

#training on movingfashion
python -m torch.distributed.launch --nproc_per_node=<NUM_GPUS> train_movingfashion.py --root <path_of_dataset_root> --train_annots <json_path> --test_annots <json_path> --pretrained_path <path_of_matchrcnn_model> 

#training on multi-deepfashion2
python -m torch.distributed.launch --nproc_per_node=<NUM_GPUS> train_multiDF2.py --root <path_of_dataset_root> --train_annots <json_path> --test_annots <json_path> --pretrained_path <path_of_matchrcnn_model> 
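
For reference, a script started through torch.distributed.launch receives a --local_rank argument in each spawned process and typically sets up distributed training as sketched below; the exact handling inside the provided training scripts may differ.

# minimal sketch of the per-process setup under torch.distributed.launch
import argparse
import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)  # filled in by the launcher
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)                    # one GPU per process
dist.init_process_group(backend="nccl", init_method="env://")

# the model is then wrapped so that gradients are synchronized across processes:
# model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.local_rank])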

Pre-Trained models

It is possible to start training from the Match-RCNN pre-trained model.

[MatchRCNN] The Match-RCNN model pre-trained on DeepFashion2 is available for download here. It can be used to start training from the second phase (training SEAM Match-RCNN directly).

We suggest downloading the model into the ckpt folder.
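
The training scripts load these weights through --pretrained_path. If you want to inspect the downloaded checkpoint yourself, a pattern like the following usually works (a sketch; the file name and the internal layout of the checkpoint are assumptions):

# sketch: inspect a downloaded checkpoint
import torch

ckpt = torch.load("ckpt/matchrcnn_pretrained.pth", map_location="cpu")  # hypothetical file name
state_dict = ckpt.get("model_state_dict", ckpt)  # some checkpoints wrap the weights in a dict

# weights saved from DistributedDataParallel carry a "module." prefix that may
# need to be stripped before loading into a non-distributed model
state_dict = {k.replace("module.", "", 1): v for k, v in state_dict.items()}
print(len(state_dict), "parameter tensors")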

Evaluation

To evaluate SEAM Match-RCNN models, please use the following scripts.

#evaluation on movingfashion
python evaluate_movingfashion.py --root_test <path_of_dataset_root> --test_annots <json_path> --ckpt_path <checkpoint_path>


#evaluation on multi-deepfashion2
python evaluate_multiDF2.py --root_test <path_of_dataset_root> --test_annots <json_path> --ckpt_path <checkpoint_path>

Citation

@misc{godi2021movingfashion,
      title={MovingFashion: a Benchmark for the Video-to-Shop Challenge}, 
      author={Marco Godi and Christian Joppi and Geri Skenderi and Marco Cristani},
      year={2021},
      eprint={2110.02627},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

