Yolov5+SlowFast: Realtime Action Detection
A realtime action detection framework based on PyTorchVideo.
Here are some details about our modification:
- We use yolov5 as the object detector instead of detectron2; it is faster and more convenient.
- We use a tracker (deepsort) to allocate action labels to all objects (with the same IDs) across frames.
- Processing speed reaches 24.2 FPS at an inference batch size of 30 (on a single RTX 2080Ti GPU).
Relevant information: FAIR/PytorchVideo; Ultralytics/Yolov5
Demo comparison between the original (left) and ours (right).
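The tracker-based label allocation mentioned above can be sketched roughly as follows. This is an illustrative outline only, assuming per-clip action predictions keyed by deepsort track ID; `allocate_actions` and the tuple layout are hypothetical, not this repository's actual API:

```python
from typing import Dict, List, Tuple

# A detection: (track_id, bbox). SlowFast produces one action label per
# tracked object per clip; the tracker's stable IDs let that label be
# copied onto every frame in which the object appears.
Detection = Tuple[int, Tuple[int, int, int, int]]

def allocate_actions(
    frames: List[List[Detection]],
    clip_actions: Dict[int, str],
) -> List[List[Tuple[int, Tuple[int, int, int, int], str]]]:
    """Attach the per-ID action label to every detection of that ID."""
    labelled = []
    for dets in frames:
        labelled.append(
            [(tid, box, clip_actions.get(tid, "unknown")) for tid, box in dets]
        )
    return labelled
```

With this scheme an object keeps its action label in every frame of the clip, even in frames where the action model itself was not run.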
Installation
- Create a new Python environment:
  conda create -n env_name python=3.7.11
- Install the requirements:
  pip install -r requirements.txt
- Download the weights file (ckpt.t7) from [deepsort] into this folder:
  ./deep_sort/deep_sort/deep/checkpoint/
- Test on your own video:
  python yolo_slowfast.py --input {path to your video}
The first run of this command may take some time, because the yolov5 code and its weights file are downloaded from torch.hub; keep your network connected.
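Once downloaded, torch.hub keeps the yolov5 repository and weights in a local cache, so later runs work offline. A minimal sketch of how that cache directory is resolved (an approximation of torch.hub's documented defaults: `$TORCH_HOME`, else `$XDG_CACHE_HOME/torch`, else `~/.cache/torch`, with `hub` appended; the helper name is ours):

```python
import os

def torch_hub_cache_dir() -> str:
    """Approximate where torch.hub stores downloaded repos and weights."""
    torch_home = os.environ.get(
        "TORCH_HOME",
        os.path.join(
            os.environ.get("XDG_CACHE_HOME", os.path.expanduser("~/.cache")),
            "torch",
        ),
    )
    return os.path.join(torch_home, "hub")
```

If the download is interrupted, deleting this directory and rerunning the command forces a clean fetch.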
References
Thanks to these great works:
[1] ZQPei/deepsort
[2] AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions. paper
[3] SlowFast Networks for Video Recognition. paper
Citation
If you find our work useful, please cite it as follows:
@misc{yolo_slowfast,
  author = {Wu Fan},
  title = {A realtime action detection framework based on PyTorchVideo},
  year = {2021},
  url = {\url{https://github.com/wufan-tb/gmm_dae}}
}
}