Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

Overview


By Qiang Zhou*, Zilong Huang*, Lichao Huang, Han Shen, Yongchao Gong, Chang Huang, Wenyu Liu, Xinggang Wang (* means equal contribution).

This code is mainly the implementation for the DAVIS 2017 dataset. For more details, please refer to our paper.

Architecture


Overview of our proposed PTSNet for video object segmentation. OPN is designed to generate proposals of the objects of interest, and OTN aims to distinguish which of these proposals is the best. Finally, DRSN performs the pixel-level tracking (segmentation). Note that in our implementation we couple OPN and OTN into a single network and separate DRSN out for engineering reasons.
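To make the pipeline concrete, here is a rough conceptual sketch of the cascade in Python. The function names and signatures below are hypothetical illustrations of the roles described above, not the actual interfaces of the `coupled_otn_opn` and `drsn` modules:

    # Conceptual per-object inference loop (hypothetical API, for illustration only).
    def track_and_segment(frames, first_box, opn, otn, drsn):
        prev_box = first_box                      # location of the target object in the first frame
        masks = []
        for frame in frames:
            proposals = opn(frame, prev_box)      # OPN: candidate boxes for the object of interest
            best_box = otn(frame, proposals)      # OTN: score the proposals and keep the best one
            mask = drsn(frame, best_box)          # DRSN: pixel-level segmentation inside the chosen box
            masks.append(mask)
            prev_box = best_box                   # the tracked box guides proposal generation in the next frame
        return masks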

Usage

Preparation

  1. Install PyTorch 1.0 and necessary libraries such as OpenCV and PIL.
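     A minimal environment sketch (the conda workflow and package names here are our assumptions; the repository does not pin exact versions):

       conda create -n ptsnet python=3.6
       conda activate ptsnet
       # Install PyTorch (1.0 for DRSN / InPlace-ABN 1.0, or 0.4 for the MaskRCNN operators) following https://pytorch.org
       pip install opencv-python Pillow numpy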

  2. There are some native CUDA implementations, InPlace-ABN and the MaskRCNN operators, which must be compiled first.

    # Before you compile, you need to figure out several things:
    # - The CUDA architectures (compute capabilities) supported by your GPU; here we use `sm_52`, `sm_61` and `sm_70` for an NVIDIA Titan V.
    # - The `cuda` and `nvcc` paths on your system, usually `/usr/local/cuda` and `/usr/local/cuda/bin/nvcc` respectively.
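    # - If you are unsure of the compute capability, one quick check (assuming PyTorch is already installed) is:
    #     python -c "import torch; print(torch.cuda.get_device_capability(0))"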
    # InPlace-ABN_0.4   (PyTorch 0.4)
    cd model/inplace_ABN_0.4
    bash build.sh
    # OR you could choose the 1.0 version of inplace ABN.
    # InPlace-ABN_1.0   (PyTorch 1.0)
    cd model/inplace_ABN    # It is dynamically compiled when running (gcc > 4.9)
    
    # MaskRCNN Operators (PyTorch 0.4)
    cd coupled_otn_opn/tracking/maskrcnn/lib
    bash make.sh
  3. You can train PTSNet from scratch or just evaluate our pretrained model.

    • To train it from scratch, you need to download:

       # DRSN: wget "https://download.pytorch.org/models/resnet50-19c8e357.pth" -O drsn/init_models/resnet50-19c8e357.pth
       # OPN: wget "https://drive.google.com/open?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
       # If you want to use our pretrained OTN:
       #   wget https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf, put it into `coupled_otn_opn/models/mdnet_davis_50cycle.pth`
       # Otherwise, please adapt py-MDNet (https://github.com/HyeonseobNam/py-MDNet) to train OTN on DAVIS yourself.
    • If you want to use our pretrained model to do the evaluation, you need to download:

       # DRSN: https://drive.google.com/open?id=116yXnqX43BZ7kEgdzUhIeTSn1dbvcE2F, put it into `drsn/snapshots/drsn_yvos_10w_davis_3p5w.pth`
       # OPN: wget "https://drive.google.com/open?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
       # OTN: https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf, put it into `coupled_otn_opn/models/mdnet_davis_50cycle.pth`
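     Note that `wget` on a `drive.google.com/open?id=...` URL usually saves an HTML page rather than the file itself. One possible workaround (our suggestion, not part of the original instructions) is the gdown utility, assuming the file IDs above are still valid:

       pip install gdown
       gdown "https://drive.google.com/uc?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
       gdown "https://drive.google.com/uc?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf" -O coupled_otn_opn/models/mdnet_davis_50cycle.pth
       gdown "https://drive.google.com/uc?id=116yXnqX43BZ7kEgdzUhIeTSn1dbvcE2F" -O drsn/snapshots/drsn_yvos_10w_davis_3p5w.pth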
  4. Dataset

    • YouTube-VOS: Download from YouTube-VOS; note that we only need the training part (train_all_frames.zip), about 41 GB in total. Unzip it, then move and rename it to drsn/dataset/yvos.
    • DAVIS: Download from DAVIS; note that we only need the 480p version (DAVIS-2017-trainval-480p.zip). Unzip it, then move and rename it to drsn/dataset/DAVIS/trainval and coupled_otn_opn/DAVIS/trainval. Note that you need to create the trainval subdirectory yourself to hold the dataset (a command sketch follows the directory tree below).

    Make sure the files are organized in the following structure:

    .
    ├── drsn
    │   ├── dataset
    │   │   ├── DAVIS
    │   │   │   └── trainval
    │   │   │       ├── Annotations
    │   │   │       ├── ImageSets
    │   │   │       └── JPEGImages
    │   │   └── yvos
    │   │       └── train_all_frames
    │   ├── init_model
    │   │   └── resnet50-19c8e357.pth
    │   └── snapshots
    │       └── drsn_yvos_10w_davis_3p5w.pth
    └── coupled_otn_opn
        ├── DAVIS
        │   └── trainval
        ├── models
        │   └── mdnet_davis_50cycle.pth
        └── tracking
            └── maskrcnn
                └── data
                    └── X-152-32x8d-FPN-IN5k.pkl
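    A minimal command sketch for arranging the data as above (this assumes both zip files sit in the repository root and unpack to DAVIS/ and train_all_frames/ respectively; adjust the paths to your setup):

        mkdir -p drsn/dataset/DAVIS/trainval coupled_otn_opn/DAVIS/trainval drsn/dataset/yvos
        unzip -q DAVIS-2017-trainval-480p.zip -d /tmp/davis_unzip
        mv /tmp/davis_unzip/DAVIS/* drsn/dataset/DAVIS/trainval/
        cp -r drsn/dataset/DAVIS/trainval/. coupled_otn_opn/DAVIS/trainval/   # or use a symlink instead of a copy
        unzip -q train_all_frames.zip -d drsn/dataset/yvos                    # yields drsn/dataset/yvos/train_all_frames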
    

Train and Evaluate

  • First, go to the coupled_otn_opn directory and follow the README.md inside to generate the proposals. You can also skip this step, since we have provided the generated proposals in the drsn/dataset/result_davis directory.
  • Second, enter drsn and check do_train_eval.sh to train and evaluate (an example invocation is given after the results table below).
  • Finally, we also provide the result masks produced by our PTSNet in result-masks-GoogleDrive. The quantitative results below are measured with the official DAVIS MATLAB toolbox.
            J Mean    F Mean    G Mean
    Avg      71.6      77.7      74.7
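For reference, a typical training invocation from do_train_eval.sh looks like the following (these are the argument values reported by users in the comments below, not necessarily the only valid ones; adjust the batch size to your GPU memory):

    cd drsn
    CUDA_VISIBLE_DEVICES=0 python train.py \
        --model_save_path experiments/snapshots \
        --max_iters 100000 --decayat 60000 \
        --learning_rate 2e-5 --batch_size 8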

Acknowledgment

The work was mainly done during an internship at Horizon Robotics.

Citing PTSNet

If you find PTSNet useful in your research, please consider citing:

@article{ptsnet2019,
        title={Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation},
        author={Zhou, Qiang and Huang, Zilong and Huang, Lichao and Shen, Han and Gong, Yongchao and Huang, Chang and Liu, Wenyu and Wang, Xinggang},
        journal = {arXiv preprint arXiv:1907.01203v2},
        year={2019}
        }

Thanks to the Third Party Libs

Comments
  • Segment Fault


    When we run `CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 8`, a segmentation fault occurs. How can we handle it? Thank you very much!

    (torch) visuallab@visuallab:~/PTSNet/drsn$ CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 8
    Detected CUDA files, patching ldflags
    Emitting ninja build file /home/visuallab/PTSNet/drsn/model/inplace_ABN/build/build.ninja...
    Building extension module inplace_abn...
    [1/2] c++ -MMD -MF inplace_abn.o.d -DTORCH_EXTENSION_NAME=inplace_abn -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include/THC -isystem /usr/local/cuda-10.0/include -isystem /home/visuallab/anaconda3/envs/torch041/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -O3 -c /home/visuallab/PTSNet/drsn/model/inplace_ABN/src/inplace_abn.cpp -o inplace_abn.o
    [2/2] c++ inplace_abn.o inplace_abn_cpu.o inplace_abn_cuda.cuda.o inplace_abn_cuda_half.cuda.o -shared -L/usr/local/cuda-10.0/lib64 -lcudart -o inplace_abn.so
    Loading extension module inplace_abn...
    [I 190717 20:33:53 train:133] { "init_model_path": "init_models/resnet50-19c8e357.pth", "model_save_path": "experiments/snapshots", "img_size": [ 256, 256 ], "batch_size": 8, "save_num_images": 2, "decayat": 60000, "learning_rate": [ 2e-05 ], "learning_policy": "step", "max_iters": 100000, "save_iters": 20000, "weight_decay": 0.0002, "power": 0.9 }
    [I 190717 20:33:53 train:135] Setting model...
    [I 190717 20:33:55 train:143] Setting criterion...
    [I 190717 20:33:55 train:150] Setting CUDNN...
    Segmentation fault (core dumped)

    opened by thyztw 6
  • error when trying to train on the yvos dataset


    Using the command taken from drsn/do_train_eval and adapting it, I tried to train on the yvos dataset to test the algorithm.

    The command I used was: `CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 64 --input_size 256,256`. My default Python version is 3.6.8, gcc version 5.5, CUDA version 9.0 and inplace_abn version 1.0.3.

    The error I received was the following:


    Traceback (most recent call last):
      File "train.py", line 23, in <module>
        from model.drsn import DRSN
      File "/home/orel/projects/pts/PTSNet-master/drsn/model/drsn.py", line 9, in <module>
        from inplace_ABN import InPlaceABN, InPlaceABNSync
      File "/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/__init__.py", line 1, in <module>
        from .bn import ABN, InPlaceABN, InPlaceABNSync
      File "/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/bn.py", line 10, in <module>
        from .functions import *
      File "/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/functions.py", line 22, in <module>
        extra_cuda_cflags=["--expt-extended-lambda"])
      File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 644, in load
        is_python_module)
      File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 802, in _jit_compile
        if baton.try_acquire():
      File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/utils/file_baton.py", line 36, in try_acquire
        self.fd = os.open(self.lock_file_path, os.O_CREAT | os.O_EXCL)
    FileNotFoundError: [Errno 2] No such file or directory: '/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/build/lock'


    I don't understand what file is missing or how to replace/construct it.

    I would appreciate your help

    Thanks,

    Orel

    opened by ghost 4
  • download link issue


    Hi,

    Thanks for sharing your work. I'd like to try your pretrained network, but I'm not able to access the given link: https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/25093814/X-152-32x8d-IN5k.pkl

    opened by jeremy-cv 3
  • OPN model in google driver can not be downloaded


    Hello, I tried to download the pretrained OPN model from Google Drive, but it is too slow and fails after a while. Could you provide some other downloads or links?

    opened by saxg 2
  • init_model not found


    Hello,

    I tried to train on the yvos dataset using the following command: 'CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 64' (without passing the input_size argument).

    I received the following output:

    "init_model_path": "init_models/resnet50-19c8e357.pth", "model_save_path": "experiments/snapshots", "img_size": [ 256, 256 ], "batch_size": 64, "save_num_images": 2, "decayat": 60000, "learning_rate": [ 2e-05 ], "learning_policy": "step", "max_iters": 100000, "save_iters": 20000, "weight_decay": 0.0002, "power": 0.9 } [I 190807 14:22:30 train:127] Setting model... Traceback (most recent call last): File "train.py", line 206, in main() File "train.py", line 129, in main model.init(args.init_model_path, "yvos_train") File "/home/orel/projects/pts/PTSNet-master/drsn/model/drsn.py", line 216, in init saved_state_dict = torch.load(model_path) File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/serialization.py", line 382, in load f = open(f, 'rb') FileNotFoundError: [Errno 2] No such file or directory: 'init_models/resnet50-19c8e357.pth'

    I am guessing I needed to prepare the pretrained ResNet model, but I am not sure. Could you instruct me on how to proceed?

    Thanks!

    opened by ghost 2
  • unrecognized argument input size


    Hi again,

    The command I used to run training on the yvos dataset was the following:

    CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 64 --input_size 256,256

    and I received this output:

    usage: train.py [-h] [--init_model_path INIT_MODEL_PATH]
                    [--model_save_path MODEL_SAVE_PATH]
                    [--img_size IMG_SIZE IMG_SIZE] [--batch_size BATCH_SIZE]
                    [--save_num_images SAVE_NUM_IMAGES] [--decayat DECAYAT]
                    [--learning_rate LEARNING_RATE [LEARNING_RATE ...]]
                    [--learning_policy {step,poly,constant}]
                    [--max_iters MAX_ITERS] [--save_iters SAVE_ITERS]
                    [--weight_decay WEIGHT_DECAY] [--power POWER]
    train.py: error: unrecognized arguments: --input_size 256,256

    the argument "input_size" is one you suggest using in the readme file. should i remove it from the command or add it to 'train.py'?

    opened by ghost 2
  • Something wrong with the pre-trained model


    Thanks a lot for sharing the code!

    I get the following error when trying to evaluate the pre-trained model on DAVIS 2017:

    [screenshot of the key-mismatch error omitted]

    It seems there is a problem with a key mismatch. Could you please help address this issue? Thanks!

    opened by KunpengLi1994 2
  • cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1


    My environment is CUDA 10.0, an RTX 2080 Ti, Python 3.6 and PyTorch 0.4.1.

    But I get this error: cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1. Please tell me how to solve it. Also, will you update your code to PyTorch 1.1.0?

    By the way, where did you get these "native CUDA operation" files from? Is there more information about these files?

    opened by shoutOutYangJie 1
Cascaded Pyramid Network (CPN) based on Keras (Tensorflow backend)

ML2 Takehome Project Reimplementing the paper: Cascaded Pyramid Network for Multi-Person Pose Estimation Dataset The model uses the COCO dataset which

Vo Van Tu 1 Nov 22, 2021
PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, CVPR 2019.

PointRCNN PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud Code release for the paper PointRCNN:3D Object Proposal Generation a

Shaoshuai Shi 1.5k Dec 27, 2022
git《FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding》(CVPR 2021) GitHub: [fig8]

FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding (CVPR 2021) This repo contains the implementation of our state-of-the-art fewshot ob

null 233 Dec 29, 2022
Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)

Face-Detection-with-MTCNN Face detection is a computer vision problem that involves finding faces in photos. It is a trivial problem for humans to sol

Chetan Hirapara 3 Oct 7, 2022
CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network - Pytorch This repo holds the pytorch-version codes of paper: "Temporal Context Aggregation Network for Temporal

Zhiwu Qing 63 Sep 27, 2022
QueryDet: Cascaded Sparse Query for Accelerating High-Resolution SmallObject Detection

QueryDet-PyTorch This repository is the official implementation of our paper: QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small O

Chenhongyi Yang 276 Dec 31, 2022
Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

Photographic Image Synthesis with Cascaded Refinement Networks-Pytorch (https://arxiv.org/abs/1707.09405) This is a Pytorch implementation of cascaded

Soumya Tripathy 63 Mar 27, 2022
Python package for multiple object tracking research with focus on laboratory animals tracking.

motutils is a Python package for multiple object tracking research with focus on laboratory animals tracking. Features loads: MOTChallenge CSV, sleap

Matěj Šmíd 2 Sep 5, 2022
Object tracking and object detection is applied to track golf puts in real time and display stats/games.

Putting_Game Object tracking and object detection is applied to track golf puts in real time and display stats/games. Works best with the Perfect Prac

Max 1 Dec 29, 2021
Official PyTorch implementation of Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

This is the official PyTorch implementation of our paper: "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks". Our project website and video demos are here.

Richard Wang 443 Dec 6, 2022
Object Detection and Multi-Object Tracking

Object Detection and Multi-Object Tracking

Bobby Chen 1.6k Jan 4, 2023
TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction

TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction TSDF++ is a novel multi-object TSDF formulation that can encode mult

ETHZ ASL 130 Dec 29, 2022
The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.

OC-SORT Observation-Centric SORT (OC-SORT) is a pure motion-model-based multi-object tracker. It aims to improve tracking robustness in crowded scenes

Jinkun Cao 325 Jan 5, 2023
This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP".

Self-Diagnosis and Self-Debiasing This repository contains the source code for Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based

Timo Schick 62 Dec 12, 2022
A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

null 11 Oct 8, 2022
[CVPR21] LightTrack: Finding Lightweight Neural Network for Object Tracking via One-Shot Architecture Search

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search The official implementation of the paper LightTra

Multimedia Research 290 Dec 24, 2022
Implementation of "Efficient Regional Memory Network for Video Object Segmentation" (Xie et al., CVPR 2021).

RMNet This repository contains the source code for the paper Efficient Regional Memory Network for Video Object Segmentation. Cite this work @inprocee

Haozhe Xie 76 Dec 14, 2022
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022