Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

Forest

Last update: Apr 1, 2022

Overview

By Qiang Zhou*, Zilong Huang*, Lichao Huang, Han Shen, Yongchao Gong, Chang Huang, Wenyu Liu, Xinggang Wang. (* denotes equal contribution.)

This code is an implementation that mainly targets the DAVIS 2017 dataset. For more details, please refer to our paper.

Architecture

Overview of the proposed PTSNet for video object segmentation. OPN generates proposals for the objects of interest, and OTN determines which of those proposals is the best. Finally, DRSN performs the pixel-level tracking (segmentation) task. Note that in our implementation OPN and OTN are coupled into a single network, while DRSN is kept separate for engineering reasons.
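
The cascade described above can be sketched in a few lines. This is only an illustration of the control flow (propose, pick the best proposal, segment), not the repository's code: the `Proposal` class, the `run_cascade` function, the box format and the three callables standing in for OPN, OTN and DRSN are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical box format: (x1, y1, x2, y2)


@dataclass
class Proposal:
    box: Box
    score: float = 0.0  # filled in by the tracking stage


def run_cascade(
    frame,
    propose: Callable[[object], List[Box]],       # stands in for OPN
    track_score: Callable[[object, Box], float],  # stands in for OTN
    segment: Callable[[object, Box], object],     # stands in for DRSN
):
    """Propose boxes, keep the best-scoring one, then segment inside it."""
    proposals = [Proposal(box) for box in propose(frame)]
    if not proposals:
        return None, None
    for p in proposals:
        p.score = track_score(frame, p.box)
    best = max(proposals, key=lambda p: p.score)
    return best, segment(frame, best.box)


if __name__ == "__main__":
    # Toy stand-ins: score a box by its area and return the box as the "mask".
    boxes = [(0, 0, 2, 2), (1, 1, 5, 5)]
    best, mask = run_cascade(
        "frame-0",
        propose=lambda frame: boxes,
        track_score=lambda frame, b: (b[2] - b[0]) * (b[3] - b[1]),
        segment=lambda frame, b: b,
    )
    print(best, mask)
```

Passing the stages in as callables mirrors the note above that OPN and OTN live in one network while DRSN is separate: the two halves only need to agree on the proposal they exchange.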

Usage

Preparation

Install PyTorch 1.0 and the necessary libraries such as OpenCV and PIL.

Two native CUDA implementations, InPlace-ABN and the MaskRCNN operators, must be compiled before anything else.

# Before you compile, you need to figure out several things:
# - The CUDA architectures supported by your GPU. Here we use `sm_52`, `sm_61` and `sm_70` (`sm_70` is the one matching the NVIDIA Titan V).
# - The `cuda` and `nvcc` paths on your system, which are usually `/usr/local/cuda` and `/usr/local/cuda/bin/nvcc` respectively.
# InPlace-ABN_0.4   (PyTorch 0.4)
cd model/inplace_ABN_0.4
bash build.sh
# OR you could choose the 1.0 version of inplace ABN.
# InPlace-ABN_1.0   (PyTorch 1.0)
cd model/inplace_ABN    # It is dynamically compiled when running (gcc > 4.9)

# MaskRCNN Operators (PyTorch 0.4)
cd coupled_otn_opn/tracking/maskrcnn/lib
bash make.sh
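
As a small aid for the architecture step above, the sketch below turns names such as `sm_70` into `-gencode` arguments in the form that `nvcc` accepts. How `build.sh` and `make.sh` actually consume these values is not shown here, so the helper name `gencode_flags` and the idea of generating the flags from a list are assumptions for illustration only.

```python
def gencode_flags(archs):
    """Turn names like 'sm_70' into nvcc -gencode arguments."""
    flags = []
    for arch in archs:
        prefix, _, number = arch.partition("_")
        if prefix != "sm" or not number.isdigit():
            raise ValueError(f"unexpected architecture name: {arch!r}")
        # One flag per architecture: compile for compute_XY, emit code for sm_XY.
        flags.append(f"-gencode=arch=compute_{number},code=sm_{number}")
    return flags


if __name__ == "__main__":
    # The three architectures mentioned in the comments above.
    for flag in gencode_flags(["sm_52", "sm_61", "sm_70"]):
        print(flag)
```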

You can train PTSNet from scratch or just evaluate our pretrained model.

To train it from scratch, you need to download:

 # DRSN: wget "https://download.pytorch.org/models/resnet50-19c8e357.pth" -O drsn/init_models/resnet50-19c8e357.pth
 # OPN: wget "https://drive.google.com/open?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
 # If you want to use our pretrained OTN:
 #   wget "https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf" -O coupled_otn_opn/models/mdnet_davis_50cycle.pth
 # Otherwise, adapt py-MDNet (https://github.com/HyeonseobNam/py-MDNet) to train OTN on DAVIS yourself.

If you want to use our pretrained model to do the evaluation, you need to download:

 # DRSN: download https://drive.google.com/open?id=116yXnqX43BZ7kEgdzUhIeTSn1dbvcE2F and save it as `drsn/snapshots/drsn_yvos_10w_davis_3p5w.pth`
 # OPN: wget "https://drive.google.com/open?id=1ma1fNmEvS9dJLOIcm1FRzYofVS_t3aI3" -O coupled_otn_opn/tracking/maskrcnn/data/X-152-32x8d-IN5k.pkl
 # OTN: download https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf and save it as `coupled_otn_opn/models/mdnet_davis_50cycle.pth`
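
The `open?id=` links above point to Google Drive pages, so a plain downloader may save an HTML page rather than the weights. The sketch below only extracts the file ID from such a link and builds a `uc?export=download` URL. That URL form is an assumption about Google Drive's behavior, not something this project documents, and large files may still require an extra confirmation step. The helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def drive_file_id(url):
    """Return the value of the `id` query parameter of a Drive link."""
    ids = parse_qs(urlparse(url).query).get("id")
    if not ids:
        raise ValueError(f"no file id found in {url!r}")
    return ids[0]


def direct_download_url(url):
    # Assumed direct-download form; verify it works before relying on it.
    return "https://drive.google.com/uc?export=download&id=" + drive_file_id(url)


if __name__ == "__main__":
    link = "https://drive.google.com/open?id=12bF1dRlEUZoQz3Qcr2WD3ojqNHzbCrjf"
    print(direct_download_url(link))
```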

Dataset

YouTube-VOS: Download from YouTube-VOS. Only the training part (train_all_frames.zip, about 41 GB in total) is needed. Unzip it, then move and rename it to drsn/dataset/yvos.
DAVIS: Download from DAVIS. Only the 480p version (DAVIS-2017-trainval-480p.zip) is needed. Unzip it, then move and rename it to drsn/dataset/DAVIS/trainval and coupled_otn_opn/DAVIS/trainval. Note that the dataset is stored in a trainval subdirectory, which you need to create.

Make sure the files are arranged in the following structure:

.
├── drsn
│   ├── dataset
│   │   ├── DAVIS
│   │   │   └── trainval
│   │   │       ├── Annotations
│   │   │       ├── ImageSets
│   │   │       └── JPEGImages
│   │   └── yvos
│   │       └── train_all_frames
│   ├── init_models
│   │   └── resnet50-19c8e357.pth
│   └── snapshots
│       └── drsn_yvos_10w_davis_3p5w.pth
└── coupled_otn_opn
    ├── DAVIS
    │   └── trainval
    ├── models
    │   └── mdnet_davis_50cycle.pth
    └── tracking
        └── maskrcnn
            └── data
                └── X-152-32x8d-FPN-IN5k.pkl
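
Before training or evaluating, it can help to confirm that the layout above is in place. The following is a minimal sketch with a hypothetical helper, `missing_paths`. The paths are copied from the tree and the download commands; the `init_models` directory name follows the download command and the default `init_model_path` printed in the training logs, and the MaskRCNN weights are checked only at directory level because the file name differs between the download command and the tree.

```python
import os

# Paths relative to the repository root, taken from the tree above.
EXPECTED = [
    "drsn/dataset/DAVIS/trainval/Annotations",
    "drsn/dataset/DAVIS/trainval/ImageSets",
    "drsn/dataset/DAVIS/trainval/JPEGImages",
    "drsn/dataset/yvos/train_all_frames",
    "drsn/init_models/resnet50-19c8e357.pth",
    "drsn/snapshots/drsn_yvos_10w_davis_3p5w.pth",
    "coupled_otn_opn/DAVIS/trainval",
    "coupled_otn_opn/models/mdnet_davis_50cycle.pth",
    "coupled_otn_opn/tracking/maskrcnn/data",  # directory holding the X-152 .pkl
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected paths that do not exist under `root`."""
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    for path in missing_paths("."):
        print("missing:", path)
```

Run from the repository root, it prints one line per missing file or directory and nothing when the layout is complete.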

Train and Evaluate

First, go to the coupled_otn_opn directory and follow the README.md inside to generate the proposals. You can skip this step, because we provide pre-generated proposals in the drsn/dataset/result_davis directory.
Second, enter drsn and see do_train_eval.sh to train and evaluate.
Finally, we also provide the result masks produced by PTSNet in result-masks-GoogleDrive. The quantitative results are measured with the official DAVIS MATLAB toolbox.

	J Mean	F Mean	G Mean
Avg	71.6	77.7	74.7
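
In the DAVIS 2017 benchmark the overall score is the average of J Mean and F Mean. Assuming the G Mean column follows that convention, the reported value can be checked with the small sketch below; the function name `global_mean` is hypothetical.

```python
def global_mean(j_mean, f_mean):
    """Average of region similarity (J) and contour accuracy (F)."""
    return (j_mean + f_mean) / 2.0


if __name__ == "__main__":
    # Values from the table above.
    print(f"G Mean = {global_mean(71.6, 77.7):.2f}")
```

The result agrees with the 74.7 in the table once rounded to one decimal place.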

Acknowledgment

The work was mainly done during an internship at Horizon Robotics.

Citing PTSNet

If you find PTSNet useful in your research, please consider citing:

@article{ptsnet2019,
  title={Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation},
  author={Zhou, Qiang and Huang, Zilong and Huang, Lichao and Shen, Han and Gong, Yongchao and Huang, Chang and Liu, Wenyu and Wang, Xinggang},
  journal={arXiv preprint arXiv:1907.01203v2},
  year={2019}
}

Thanks to the Third Party Libs

Comments

Segmentation Fault

When we run `CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 8`, a segmentation fault occurs. How can we handle it? Thank you very much!

(torch) visuallab@visuallab:~/PTSNet/drsn$ CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 8
Detected CUDA files, patching ldflags
Emitting ninja build file /home/visuallab/PTSNet/drsn/model/inplace_ABN/build/build.ninja...
Building extension module inplace_abn...
[1/2] c++ -MMD -MF inplace_abn.o.d -DTORCH_EXTENSION_NAME=inplace_abn -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/visuallab/anaconda3/envs/torch041/lib/python3.6/site-packages/torch/lib/include/THC -isystem /usr/local/cuda-10.0/include -isystem /home/visuallab/anaconda3/envs/torch041/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -O3 -c /home/visuallab/PTSNet/drsn/model/inplace_ABN/src/inplace_abn.cpp -o inplace_abn.o
[2/2] c++ inplace_abn.o inplace_abn_cpu.o inplace_abn_cuda.cuda.o inplace_abn_cuda_half.cuda.o -shared -L/usr/local/cuda-10.0/lib64 -lcudart -o inplace_abn.so
Loading extension module inplace_abn...
[I 190717 20:33:53 train:133] { "init_model_path": "init_models/resnet50-19c8e357.pth", "model_save_path": "experiments/snapshots", "img_size": [ 256, 256 ], "batch_size": 8, "save_num_images": 2, "decayat": 60000, "learning_rate": [ 2e-05 ], "learning_policy": "step", "max_iters": 100000, "save_iters": 20000, "weight_decay": 0.0002, "power": 0.9 }
[I 190717 20:33:53 train:135] Setting model...
[I 190717 20:33:55 train:143] Setting criterion...
[I 190717 20:33:55 train:150] Setting CUDNN...
Segmentation fault (core dumped)

opened by thyztw 6
error when trying to train on the yvos dataset

Using a command adapted from drsn/do_train_eval, I tried to train on the yvos dataset to test the algorithm.

The command I used was: "CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 64 --input_size 256,256". My default Python version is 3.6.8, gcc version 5.5, CUDA version 9.0, and inplace_abn version 1.0.3.

The error I received was the following:

Traceback (most recent call last):
  File "train.py", line 23, in <module>
    from model.drsn import DRSN
  File "/home/orel/projects/pts/PTSNet-master/drsn/model/drsn.py", line 9, in <module>
    from inplace_ABN import InPlaceABN, InPlaceABNSync
  File "/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/__init__.py", line 1, in <module>
    from .bn import ABN, InPlaceABN, InPlaceABNSync
  File "/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/bn.py", line 10, in <module>
    from .functions import *
  File "/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/functions.py", line 22, in <module>
    extra_cuda_cflags=["--expt-extended-lambda"])
  File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 644, in load
    is_python_module)
  File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 802, in _jit_compile
    if baton.try_acquire():
  File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/utils/file_baton.py", line 36, in try_acquire
    self.fd = os.open(self.lock_file_path, os.O_CREAT | os.O_EXCL)
FileNotFoundError: [Errno 2] No such file or directory: '/home/orel/projects/pts/PTSNet-master/drsn/model/inplace_ABN/build/lock'

I don't understand what file is missing and how to replace/construct it

I would appreciate your help

Thanks,

Orel
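
A note on the traceback above: `os.open` with `os.O_CREAT` creates the lock file itself but not its parent directories, so one plausible cause is that the `build` directory under `inplace_ABN` does not exist yet. The sketch below creates it before the extension is loaded. Whether this resolves the problem in this repository is not confirmed, and the helper name `ensure_build_dir` is hypothetical.

```python
import os


def ensure_build_dir(extension_dir):
    """Create `<extension_dir>/build` if it is missing and return its path."""
    build_dir = os.path.join(extension_dir, "build")
    os.makedirs(build_dir, exist_ok=True)  # no error if it already exists
    return build_dir


if __name__ == "__main__":
    # Path assumed from the traceback, relative to the drsn directory.
    print(ensure_build_dir(os.path.join("model", "inplace_ABN")))
```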

opened by ghost 4
download link issue

Hi,

Thanks for sharing your work. I'd like to try your pretrained network, but I'm not able to access the given link: https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/25093814/X-152-32x8d-IN5k.pkl

opened by jeremy-cv 3
OPN model on Google Drive cannot be downloaded

Hello, I tried to download the pretrained OPN model from Google Drive, but the download is too slow and fails after a while. Could you provide another download link?

opened by saxg 2
init_model not found

Hello,

I tried to train on the yvos dataset using the following command: 'CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 64' (without inputting the input_size argument)

I received the following output:

"init_model_path": "init_models/resnet50-19c8e357.pth", "model_save_path": "experiments/snapshots", "img_size": [ 256, 256 ], "batch_size": 64, "save_num_images": 2, "decayat": 60000, "learning_rate": [ 2e-05 ], "learning_policy": "step", "max_iters": 100000, "save_iters": 20000, "weight_decay": 0.0002, "power": 0.9 }
[I 190807 14:22:30 train:127] Setting model...
Traceback (most recent call last):
  File "train.py", line 206, in <module>
    main()
  File "train.py", line 129, in main
    model.init(args.init_model_path, "yvos_train")
  File "/home/orel/projects/pts/PTSNet-master/drsn/model/drsn.py", line 216, in init
    saved_state_dict = torch.load(model_path)
  File "/home/orel/projects/pts/ptsenv/lib/python3.6/site-packages/torch/serialization.py", line 382, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'init_models/resnet50-19c8e357.pth'

I am guessing I needed to prepare the pretrained ResNet model, but I am not sure. Could you instruct me on how to proceed?

Thanks!

opened by ghost 2
unrecognized argument input size

Hi again,

The command I used to run training on the yvos dataset was the following:

CUDA_VISIBLE_DEVICES=0 python train.py --model_save_path experiments/snapshots --max_iters 100000 --decayat 60000 --learning_rate 2e-5 --batch_size 64 --input_size 256,256

and I received this output:

usage: train.py [-h] [--init_model_path INIT_MODEL_PATH]
                [--model_save_path MODEL_SAVE_PATH]
                [--img_size IMG_SIZE IMG_SIZE] [--batch_size BATCH_SIZE]
                [--save_num_images SAVE_NUM_IMAGES] [--decayat DECAYAT]
                [--learning_rate LEARNING_RATE [LEARNING_RATE ...]]
                [--learning_policy {step,poly,constant}]
                [--max_iters MAX_ITERS] [--save_iters SAVE_ITERS]
                [--weight_decay WEIGHT_DECAY] [--power POWER]
train.py: error: unrecognized arguments: --input_size 256,256

The argument "input_size" is one you suggest using in the README file. Should I remove it from the command or add it to 'train.py'?
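
The usage message above lists `--img_size IMG_SIZE IMG_SIZE` rather than `--input_size`, which suggests the size is passed as two separate integers. The sketch below reproduces that style of option with `argparse`. It is not the repository's `train.py`: the function name is hypothetical, and the default of 256 by 256 is taken from the `img_size` value printed in the training logs.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="train.py")
    # nargs=2 makes the option consume two values: --img_size 256 256
    parser.add_argument("--img_size", type=int, nargs=2,
                        default=[256, 256], metavar="IMG_SIZE")
    parser.add_argument("--batch_size", type=int)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--img_size", "256", "256", "--batch_size", "64"])
    print(args.img_size, args.batch_size)
```

With an option declared this way, `--img_size 256 256` parses, while `--input_size 256,256` is rejected as an unrecognized argument, matching the error shown.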

opened by ghost 2
Something wrong with the pre-trained model

Thanks a lot for sharing the code!

I encountered the following error when trying to evaluate the pre-trained model on DAVIS 2017:

There seems to be a key mismatch problem. Could you please help address this issue? Thanks!

opened by KunpengLi1994 2
cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1

My environment is CUDA 10.0, RTX 2080 Ti, Python 3.6, and PyTorch 0.4.1.

However, I get this error: cffi.VerificationError: CompileError: command 'gcc' failed with exit status 1. Please tell me how to solve it. Also, will you update your code for PyTorch 1.1.0?

By the way, where did you get the "native CUDA operation" files from? Is there more information about them?

opened by shoutOutYangJie 1
