Real-time Object Detection for Streaming Perception, CVPR 2022

Jinrong Yang

Last update: Dec 27, 2022

Related tags

Overview

StreamYOLO

Real-time Object Detection for Streaming Perception

Jinrong Yang, Songtao Liu, Zeming Li, Xiaoping Li, Sun Jian
Real-time Object Detection for Streaming Perception, CVPR 2022 (Oral)
Paper

Bestsoftwarechoose

Benchmark

Model	size	velocity	sAP 0.5:0.95	sAP50	sAP75	weights	COCO pretrained weights
StreamYOLO-s	600×960	1x	29.8	50.3	29.8	github	github
StreamYOLO-m	600×960	1x	33.7	54.5	34.0	github	github
StreamYOLO-l	600×960	1x	36.9	58.1	37.5	github	github
StreamYOLO-l	600×960	2x	34.6	56.3	34.7	github	github
StreamYOLO-l	600×960	still	39.4	60.0	40.2	github	github

Quick Start

Dataset preparation

You can download Argoverse-1.1 full dataset and annotation from HERE and unzip it.

The folder structure should be organized as follows before our processing.

StreamYOLO
├── exps
├── tools
├── yolox
├── data
│   ├── Argoverse-1.1
│   │   ├── annotations
│   │       ├── tracking
│   │           ├── train
│   │           ├── val
│   │           ├── test
│   ├── Argoverse-HD
│   │   ├── annotations
│   │       ├── test-meta.json
│   │       ├── train.json
│   │       ├── val.json

The hash strings represent different video sequences in Argoverse, and ring_front_center is one of the sensors for that sequence. Argoverse-HD annotations correspond to images from this sensor. Information from other sensors (other ring cameras or LiDAR) is not used, but our framework can be also extended to these modalities or to a multi-modality setting.

Installation

# basic python libraries
conda create --name streamyolo python=3.7

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

pip3 install yolox==0.3
git clone [email protected]:yancie-yjr/StreamYOLO.git

cd StreamYOLO/

# add StreamYOLO to PYTHONPATH and add this line to ~/.bashrc or ~/.zshrc (change the file accordingly)
ADDPATH=$(pwd)
echo export PYTHONPATH=$PYTHONPATH:$ADDPATH >> ~/.bashrc
source ~/.bashrc

# Installing `mmcv` for the official sAP evaluation:
# Please replace `{cu_version}` and ``{torch_version}`` with the versions you are currently using.
# You will get import or runtime errors if the versions are incorrect.
pip install mmcv-full==1.1.5 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html

Reproduce our results on Argoverse-HD

Step1. Prepare COCO dataset

cd <StreamYOLO_HOME>
ln -s /path/to/your/Argoverse-1.1 ./data/Argoverse-1.1
ln -s /path/to/your/Argoverse-HD ./data/Argoverse-HD

Step2. Reproduce our results on Argoverse:

python tools/train.py -f cfgs/m_s50_onex_dfp_tal_flip.py -d 8 -b 32 -c [/path/to/your/coco_pretrained_path] -o --fp16

-d: number of gpu devices.
-b: total batch size, the recommended number for -b is num-gpu * 8.
--fp16: mixed precision training.
-c: model checkpoint path.

Offline Evaluation

We support batch testing for fast evaluation:

python tools/eval.py -f  cfgs/l_s50_onex_dfp_tal_flip.py -c [/path/to/your/model_path] -b 64 -d 8 --conf 0.01 [--fp16] [--fuse]

--fuse: fuse conv and bn.
-d: number of GPUs used for evaluation. DEFAULT: All GPUs available will be used.
-b: total batch size across on all GPUs.
-c: model checkpoint path.
--conf: NMS threshold. If using 0.001, the performance will further improve by 0.2~0.3 sAP.

Online Evaluation

We modify the online evaluation from sAP

Please use 1 V100 GPU to test the performance since other GPUs with low computing power will trigger non-real-time results!!!!!!!!

cd sAP/streamyolo
bash streamyolo.sh

Citation

Please cite the following paper if this repo helps your research:

@InProceedings{streamyolo,
    author    = {Yang, Jinrong and Liu, Songtao and Li, Zeming and Li, Xiaoping and Sun, Jian},
    title     = {Real-time Object Detection for Streaming Perception},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year      = {2022}
}

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Comments

when will the readme document be completed

Hi, @GOATmessi7 @yancie-yjr great wokrs. Can you enrich the readme about datasets preparing、how to training & validation and so on. hope to finish it soon. thanks

opened by SmallMunich 1
ModuleNotFoundError: No module named 'exps'

hi everyone, I got this issue ...File "cfgs/m_s50_onex_dfp_tal_flip.py", line 189, in get_trainer from exps.train_utils.double_trainer import Trainer ModuleNotFoundError: No module named 'exps'

Actually I ran code on local I got this error but when I try "echo export PYTHONPATH=$PYTHONPATH:$ADDPATH >> " it worked. But as you can guess my local GPU didn't enough for training. And I established everything on colab but this time "echo export..." didn't save me.

opened by Tezcan98 3

A small bug in README about Dataset Prep.

For Developers

Hi! When reproducing your results on Argoverse-HD, I found that the directory structure you provided in Quick Start - Dataset preparation section doesn't match the original directory structure of Argoverse-HD dataset, as well as your code required. The directory structure in Quick Start - Dataset preparation section:

StreamYOLO
├── exps
├── tools
├── yolox
├── data
│   ├── Argoverse-1.1
│   │   ├── annotations
│   │       ├── tracking
│   │           ├── train
│   │           ├── val
│   │           ├── test
│   ├── Argoverse-HD
│   │   ├── annotations
│   │       ├── test-meta.json
│   │       ├── train.json
│   │       ├── val.json

should be edited as:

StreamYOLO
├── exps
├── tools
├── yolox
├── data
│   ├── Argoverse-1.1
│   │   ├── tracking
│   │       ├── train
│   │       ├── val
│   │       ├── test
│   ├── Argoverse-HD
│   │   ├── annotations
│   │       ├── test-meta.json
│   │       ├── train.json
│   │       ├── val.json

which matches the directory structure of the Argoverse-HD dataset: Screenshot 2022-09-21 151703.png

For Stargazers

BTW, if anyone manually modifies the directory structure to fit the one provided in README, an AssertionError will occur: (some parts of file path was edited)

AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "%HOME%\anaconda3\envs\streamyolo\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "%HOME%\anaconda3\envs\streamyolo\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "%HOME%\anaconda3\envs\streamyolo\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "%HOME%\anaconda3\envs\streamyolo\lib\site-packages\yolox\data\datasets\datasets_wrapper.py", line 110, in wrapper
    ret_val = getitem_fn(self, index)
  File "%WORKSPACE%\StreamYOLO\exps\data\tal_flip_mosaicdetection.py", line 255, in __getitem__
    img, support_img, label, support_label, img_info, id_ = self._dataset.pull_item(idx)
  File "%WORKSPACE%\StreamYOLO\exps\dataset\tal_flip_one_future_argoversedataset.py", line 227, in pull_item
    img = self.load_resized_img(index)
  File "%WORKSPACE%\StreamYOLO\exps\dataset\tal_flip_one_future_argoversedataset.py", line 180, in load_resized_img
    img = self.load_image(index)
  File "%WORKSPACE%\StreamYOLO\exps\dataset\tal_flip_one_future_argoversedataset.py", line 196, in load_image
    assert img is not None
AssertionError

If anyone gets the similar error message, the content in For Developers may be helpful.

opened by jingwenchong 6

Figure 2 in the paper

Hi, I have read your paper.

I have a question in figure 2.

On the page3 in the paper, you wrote the expression "the output y1 of the frame F1 is matched and evaluated with the ground truth of F3 and the result of F2 is missed" about Figure 2.

I understood like that expression mean y1 is the output of the none-real-time detectors of frame F1.

But, before the frame F3 is received, the frame F2 is received in first.

So I can't understand that point and I also want to ask when the output of the frame f0 come out.

opened by wpdlatm1452 1
How can i save the detection result?

Hi, thank you for suggesting your nice code.

I trained the model using Argoverse dataset following your readme.

I want to run demo and save detection results (image or video), how can i do that?

thank you.

opened by daminlee1 0

Releases(0.1.0rc)

0.1.0rc(Apr 26, 2022)

Uploading YOLOX COCO pretrained and StreamYOLO weights.
Source code(tar.gz)
Source code(zip)
l_s50_one_x.pth(419.09 MB)
l_s50_still.pth(419.09 MB)
l_s50_two_x.pth(419.09 MB)
m_s50_one_x.pth(196.37 MB)
s_s50_one_x.pth(69.87 MB)
yolox_l.pth(414.23 MB)
yolox_m.pth(193.70 MB)
yolox_s.pth(68.74 MB)

Owner

Jinrong Yang

Research: Object detection, Deep learning

GitHub

[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation Prerequisite Please create and activate the following conda envrionment. To r

87 Jan 8, 2023

(CVPR 2022) A minimalistic mapless end-to-end stack for joint perception, prediction, planning and control for self driving.

LAV Learning from All Vehicles Dian Chen, Philipp Krähenbühl CVPR 2022 (also arXiV 2203.11934) This repo contains code for paper Learning from all veh

300 Dec 15, 2022

[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception Fangzhou Hong1 Liang Pan1 Zhongang Cai1,2,3 Ziwei Liu1* 1S-Lab, Nanyang Technologic

96 Jan 3, 2023

Code for Towards Streaming Perception (ECCV 2020) :car:

sAP — Code for Towards Streaming Perception ECCV Best Paper Honorable Mention Award Feb 2021: Announcing the Streaming Perception Challenge (CVPR 2021

85 Dec 22, 2022

Object tracking and object detection is applied to track golf puts in real time and display stats/games.

Putting_Game Object tracking and object detection is applied to track golf puts in real time and display stats/games. Works best with the Perfect Prac

1 Dec 29, 2021

TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

This project is a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

147 Dec 3, 2022

Autonomous Perception: 3D Object Detection with Complex-YOLO

Autonomous Perception: 3D Object Detection with Complex-YOLO LiDAR object detect

2 Feb 18, 2022

This repository contains a pytorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)".

HeadNeRF: A Real-time NeRF-based Parametric Head Model This repository contains a pytorch implementation of "HeadNeRF: A Real-time NeRF-based Parametr

294 Jan 1, 2023

Real-Time-Student-Attendence-System - Real Time Student Attendence System

Real-Time-Student-Attendence-System The Student Attendance Management System Pro

1 Feb 15, 2022

Official PyTorch code for WACV 2022 paper "CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows"

CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows WACV 2022 preprint:https://arxiv.org/abs/2107.1

156 Dec 28, 2022

Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi

34 Nov 9, 2022

[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

21 Dec 22, 2022

[CVPR 2022] Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement

Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement Announcement ?? We have not tested the code yet. We will fini

7 Oct 30, 2022

Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022.

Jadena Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022. arXiv

13 Nov 29, 2022

Real-time Object Detection for Streaming Perception, CVPR 2022

Related tags

Overview

StreamYOLO

Real-time Object Detection for Streaming Perception

Benchmark

Quick Start

Citation

License

Comments

when will the readme document be completed

ModuleNotFoundError: No module named 'exps'

A small bug in README about Dataset Prep.

For Developers

For Stargazers

Figure 2 in the paper

How can i save the detection result?

Releases(0.1.0rc)

0.1.0rc(Apr 26, 2022)

Owner

Jinrong Yang

[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

(CVPR 2022) A minimalistic mapless end-to-end stack for joint perception, prediction, planning and control for self driving.

[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Code for Towards Streaming Perception (ECCV 2020) :car:

Object tracking and object detection is applied to track golf puts in real time and display stats/games.

TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

Autonomous Perception: 3D Object Detection with Complex-YOLO

This repository contains a pytorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)".

Real-Time-Student-Attendence-System - Real Time Student Attendence System

Official PyTorch code for WACV 2022 paper "CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows"

Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

[CVPR 2022] Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement

Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022.

Official code of the paper "Expanding Low-Density Latent Regions for Open-Set Object Detection" (CVPR 2022)

Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds (CVPR 2022)

Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral)

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Implementation for the paper 'YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs'