Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

VisDrone

Last update: Nov 16, 2022

Related tags

Deep Learning DroneCrowd

Overview

DroneCrowd

Paper: Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark.

Introduction

This paper proposes a space-time multi-scale attention network (STANet) for density map estimation, localization, and tracking in dense crowds, using video clips captured by drones with arbitrary crowd density, perspective, and flight altitude. STANet aggregates multi-scale feature maps from sequential frames to exploit temporal coherency, and then simultaneously predicts the density maps, localizes the targets, and associates them in crowds. A coarse-to-fine process gradually applies the attention module to the aggregated multi-scale feature maps, which pushes the network to exploit discriminative space-time features for better performance. The whole network is trained end to end with a multi-task loss formed by three terms: the density map loss, the localization loss, and the association loss. Non-maximal suppression followed by a min-cost flow framework generates the trajectories of the targets in each scenario. Because existing crowd counting datasets focus on counting with static cameras rather than on density map estimation, counting, and tracking in crowds from drones, we have collected a new large-scale drone-based dataset, DroneCrowd, formed by 112 video clips with 33,600 high-resolution frames (1920x1080) captured in 70 different scenarios. With an intensive amount of effort, our dataset provides 20,800 people trajectories with 4.8 million head annotations and several video-level attributes per sequence. Extensive experiments on two challenging public datasets, ShanghaiTech and UCF-QNRF, and on our DroneCrowd demonstrate that STANet performs favorably against state-of-the-art methods.
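The localization step can be pictured as point-wise non-maximal suppression on a response map: a pixel is kept as a head location when it is the largest value in its neighbourhood and exceeds a threshold. The sketch below is an illustration in plain Python, not the STANet implementation; the function name `local_maxima`, the window radius, and the threshold are assumptions made for the example.

```python
def local_maxima(score_map, radius=1, threshold=0.5):
    """Return (row, col) peaks of a 2-D score map given as a list of lists.

    A cell is a peak when its value is >= threshold and no cell within
    `radius` (Chebyshev distance) has a strictly larger value.
    Cells that tie on a plateau are all kept.
    """
    rows, cols = len(score_map), len(score_map[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = score_map[r][c]
            if value < threshold:
                continue
            # Clip the search window at the map border.
            r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            neighbourhood_max = max(
                score_map[i][j] for i in range(r0, r1) for j in range(c0, c1)
            )
            if value >= neighbourhood_max:
                peaks.append((r, c))
    return peaks


# Toy 4x5 map with two clear responses (values are made up).
toy_map = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.8],
    [0.0, 0.0, 0.0, 0.2, 0.4],
]
peaks = local_maxima(toy_map)
print(peaks)
```

On real frames the same idea would normally run as a vectorised max-filter on the GPU; the nested loops here only keep the logic easy to follow.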

Dataset

ECCV2020 Challenge

The VisDrone 2020 Crowd Counting Challenge requires participating algorithms to count persons in each frame. The challenge will provide 112 challenging sequences, including 82 video sequences for training (2,420 frames in total), and 30 sequences for testing (900 frames in total), which are available on the download page. We manually annotate persons with points in each video frame.
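Point annotations are commonly turned into a counting target by placing a small Gaussian at each annotated head, so that the density map sums to the number of people. The sketch below shows that common recipe in plain Python; it is not taken from the challenge toolkit, and the function name `density_map`, the kernel width `sigma`, and the sample points are assumptions.

```python
import math


def density_map(points, height, width, sigma=2.0):
    """Place one unit-mass Gaussian per annotated (x, y) point.

    Points are assumed to lie inside the image.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for x, y in points:
        # Evaluate the kernel on the whole grid, then normalise it so each
        # person contributes exactly 1.0 even near the image border.
        kernel = [
            [math.exp(-((c - x) ** 2 + (r - y) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        mass = sum(map(sum, kernel))
        for r in range(height):
            for c in range(width):
                dmap[r][c] += kernel[r][c] / mass
    return dmap


heads = [(5, 4), (20, 10), (21, 11)]  # hypothetical point annotations
dmap = density_map(heads, height=16, width=32)
count = sum(map(sum, dmap))  # the per-frame count is the sum of the map
print(round(count, 3))
```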

DroneCrowd (1.03 GB): BaiduYun (code: h0j8) | GoogleDrive

DroneCrowd (Full Version)

The full version consists of 112 video clips with 33,600 high-resolution frames (1920x1080) captured in 70 different scenarios. With an intensive amount of effort, our dataset provides 20,800 people trajectories with 4.8 million head annotations and several video-level attributes per sequence.
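The quoted totals give a rough feel for the scale of a single clip. The arithmetic below derives averages only; real clips and frames vary, and "4.8 million" is treated as a round figure.

```python
clips = 112
frames = 33_600
trajectories = 20_800
head_annotations = 4_800_000  # "4.8 million" taken as a round figure

frames_per_clip = frames / clips
heads_per_frame = head_annotations / frames
annotations_per_trajectory = head_annotations / trajectories

print(frames_per_clip)                       # average frames per clip
print(round(heads_per_frame, 1))             # average annotated heads per frame
print(round(annotations_per_trajectory, 1))  # average annotated frames per trajectory
```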

DroneCrowd: BaiduYun (code: ml1u) | GoogleDrive

Code

Space-Time Neighbor-Aware Network (STNNet-pytorch)

Space-Time Multi-Scale Attention Network (STANet-pytorch)

Citation

Please cite this paper if you want to use it in your work.

@inproceedings{dronecrowd_cvpr2021,
  author    = {Longyin Wen and
               Dawei Du and
               Pengfei Zhu and
               Qinghua Hu and
               Qilong Wang and
               Liefeng Bo and
               Siwei Lyu},
  title     = {Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark},
  booktitle = {CVPR},
  year      = {2021}
}

Comments

Where do you store the tracklets of each pedestrian?

In mytest.py it seems to me that calc_trkpt helps you find the next detection point using tracking. Are you storing tracklets somewhere so that the complete track of each pedestrian can be rebuilt? How can we reproduce the tracking that is illustrated in the DroneCrowd README.md?
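For readers with the same question, one generic way to keep tracklets is a dictionary from track id to a list of (frame, x, y) entries that grows as frames are processed. The sketch below is not taken from mytest.py or calc_trkpt, and it swaps in a greedy nearest-neighbour matcher instead of the min-cost flow association described in the paper; `update_tracklets`, `max_dist`, and the sample detections are hypothetical.

```python
import math
from collections import defaultdict


def update_tracklets(tracklets, last_seen, detections, frame_idx, max_dist=10.0):
    """Attach each detection to the nearest active track, else start a new one.

    tracklets: dict track_id -> list of (frame_idx, x, y), updated in place
    last_seen: dict track_id -> (x, y) for tracks matched in the previous frame
    Returns the positions of the tracks matched in this frame.
    """
    unmatched = dict(last_seen)
    current = {}
    for x, y in detections:
        best_id, best_d = None, max_dist
        for tid, (px, py) in unmatched.items():
            d = math.hypot(x - px, y - py)
            if d <= best_d:
                best_id, best_d = tid, d
        if best_id is None:
            best_id = len(tracklets)  # ids are assigned sequentially from 0
        else:
            del unmatched[best_id]    # each track takes at most one detection
        tracklets[best_id].append((frame_idx, x, y))
        current[best_id] = (x, y)
    return current


# Hypothetical per-frame head detections as (x, y) points.
frames = [
    [(10.0, 10.0), (50.0, 50.0)],
    [(12.0, 11.0), (49.0, 52.0)],
    [(14.0, 12.0), (48.0, 54.0), (90.0, 5.0)],
]
tracklets = defaultdict(list)
active = {}
for idx, dets in enumerate(frames):
    active = update_tracklets(tracklets, active, dets, idx)

for tid, points in tracklets.items():
    print(tid, points)
```

In this simplified version a track that misses one frame is dropped from the active set, so a reappearing person would receive a new id.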

opened by MounirB 1

[STNNet] spatial-correlation-sampler lib issue with RTX 3000 series GPUs

Hello, it seems that spatial-correlation-sampler doesn't work with an RTX 3000 series GPU when it is installed through pip in the conda environment created with

conda create -n STTNet python=3.6 pytorch=1.6 torchvision -c pytorch

I made it work by replacing Python and PyTorch with newer versions, 3.9 and 1.11 respectively.

Do we need older versions of PyTorch to make STNNet work?

opened by MounirB 1

Can't read GT_img015203.mat and GT_img015205.mat in test_data/ground_truth

Thanks for providing such a good dataset! I can't read the files GT_img015203.mat and GT_img015204.mat in test_data/ground_truth with either MATLAB or Python. I downloaded these files from BaiduYun. Could you update these two files? Thank you so much.
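A quick way to narrow down an unreadable .mat file is to inspect its header before parsing it. Level 5 MAT-files begin with a 128-byte header whose text starts with "MATLAB 5.0 MAT-file", while v7.3 files start with "MATLAB 7.3 MAT-file" and are HDF5 containers that scipy.io.loadmat does not read. The sketch below is a generic check, not a diagnosis of these two files; the function name `mat_file_kind` and its return labels are assumptions.

```python
import os
import tempfile


def mat_file_kind(path):
    """Classify a .mat file from its first 128 bytes without parsing it."""
    with open(path, "rb") as fh:
        header = fh.read(128)
    if not header:
        return "empty"
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3-hdf5"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        # A complete Level 5 header is exactly 128 bytes long.
        return "level5" if len(header) == 128 else "truncated"
    return "unknown"


# Demonstration on synthetic files; real use would pass paths such as
# test_data/ground_truth/GT_img015203.mat.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.mat")
    with open(good, "wb") as fh:
        fh.write(b"MATLAB 5.0 MAT-file".ljust(128, b" "))
    empty = os.path.join(tmp, "empty.mat")
    open(empty, "wb").close()
    broken = os.path.join(tmp, "broken.mat")
    with open(broken, "wb") as fh:
        fh.write(b"<html>download error</html>")
    kinds = [mat_file_kind(good), mat_file_kind(empty), mat_file_kind(broken)]
print(kinds)
```

An "empty", "truncated", or "unknown" result points to a damaged download rather than a reader problem.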

opened by Amelie01 0

The same images in validation and test sets

Hi, Why does val_data contain the same images as test_data?

val_data

bartosz@bartosz-pro:~/Downloads/val_data/images$ ll | head -n 20
total 111000
drwxrwxr-x 2 bartosz bartosz  20480 lis  6  2020 ./
drwxrwxr-x 4 bartosz bartosz   4096 lis  6  2020 ../
-rw-rw-r-- 1 bartosz bartosz 296163 mar 13  2018 img011001.jpg
-rw-rw-r-- 1 bartosz bartosz 295048 mar 13  2018 img011002.jpg
-rw-rw-r-- 1 bartosz bartosz 291832 mar 13  2018 img011003.jpg
-rw-rw-r-- 1 bartosz bartosz 292177 mar 13  2018 img011004.jpg
-rw-rw-r-- 1 bartosz bartosz 292308 mar 13  2018 img011005.jpg
-rw-rw-r-- 1 bartosz bartosz 292594 mar 13  2018 img011006.jpg
-rw-rw-r-- 1 bartosz bartosz 291670 mar 13  2018 img011007.jpg
-rw-rw-r-- 1 bartosz bartosz 292245 mar 13  2018 img011008.jpg
-rw-rw-r-- 1 bartosz bartosz 296073 mar 13  2018 img011009.jpg
-rw-rw-r-- 1 bartosz bartosz 293852 mar 13  2018 img011010.jpg
-rw-rw-r-- 1 bartosz bartosz 294563 mar 13  2018 img011011.jpg
-rw-rw-r-- 1 bartosz bartosz 294614 mar 13  2018 img011012.jpg
-rw-rw-r-- 1 bartosz bartosz 294053 lut  9  2019 img015001.jpg
-rw-rw-r-- 1 bartosz bartosz 294591 lut  9  2019 img015002.jpg
-rw-rw-r-- 1 bartosz bartosz 294779 lut  9  2019 img015003.jpg
-rw-rw-r-- 1 bartosz bartosz 294860 mar 13  2018 img015004.jpg
-rw-rw-r-- 1 bartosz bartosz 295132 mar 13  2018 img015005.jpg

test_data

bartosz@bartosz-pro:~/Downloads/test_data/images$ ll | head -n 20
total 2769944
drwxrwxr-x 2 bartosz bartosz 299008 mar  8  2021 ./
drwxrwxr-x 4 bartosz bartosz   4096 mar  8  2021 ../
-rw-rw-r-- 1 bartosz bartosz 296163 mar 13  2018 img011001.jpg
-rw-rw-r-- 1 bartosz bartosz 295048 mar 13  2018 img011002.jpg
-rw-rw-r-- 1 bartosz bartosz 291832 mar 13  2018 img011003.jpg
-rw-rw-r-- 1 bartosz bartosz 292177 mar 13  2018 img011004.jpg
-rw-rw-r-- 1 bartosz bartosz 292308 mar 13  2018 img011005.jpg
-rw-rw-r-- 1 bartosz bartosz 292594 mar 13  2018 img011006.jpg
-rw-rw-r-- 1 bartosz bartosz 291670 mar 13  2018 img011007.jpg
-rw-rw-r-- 1 bartosz bartosz 292245 mar 13  2018 img011008.jpg
-rw-rw-r-- 1 bartosz bartosz 296073 mar 13  2018 img011009.jpg
-rw-rw-r-- 1 bartosz bartosz 293852 mar 13  2018 img011010.jpg
-rw-rw-r-- 1 bartosz bartosz 294563 mar 13  2018 img011011.jpg
-rw-rw-r-- 1 bartosz bartosz 294614 mar 13  2018 img011012.jpg
-rw-rw-r-- 1 bartosz bartosz 294075 mar 13  2018 img011013.jpg
-rw-rw-r-- 1 bartosz bartosz 293981 mar 13  2018 img011014.jpg
-rw-rw-r-- 1 bartosz bartosz 293826 mar 13  2018 img011015.jpg
-rw-rw-r-- 1 bartosz bartosz 294447 mar 13  2018 img011016.jpg
-rw-rw-r-- 1 bartosz bartosz 295887 mar 13  2018 img011017.jpg
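Matching names and sizes in the two listings suggest the files are identical, and a content hash can confirm it. The sketch below compares SHA-256 digests of the .jpg files in two folders; the helper names are assumptions, and the demonstration runs on synthetic files rather than the real val_data/images and test_data/images directories.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digests(folder):
    """Map SHA-256 digest -> file name for every .jpg directly inside folder."""
    return {
        hashlib.sha256(p.read_bytes()).hexdigest(): p.name
        for p in sorted(Path(folder).glob("*.jpg"))
    }


def shared_images(folder_a, folder_b):
    """Return sorted (name_in_a, name_in_b) pairs whose bytes are identical."""
    a, b = file_digests(folder_a), file_digests(folder_b)
    return sorted((a[d], b[d]) for d in a.keys() & b.keys())


# Synthetic stand-ins for val_data/images and test_data/images.
with tempfile.TemporaryDirectory() as tmp:
    val, test = Path(tmp, "val"), Path(tmp, "test")
    val.mkdir()
    test.mkdir()
    (val / "img011001.jpg").write_bytes(b"same-bytes")
    (test / "img011001.jpg").write_bytes(b"same-bytes")
    (test / "img011013.jpg").write_bytes(b"only-in-test")
    overlap = shared_images(val, test)
print(overlap)
```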

opened by bartoszptak 0

Questions of training and testing Process

Hello, thanks for the great work!!

I tried to test the results. The 'Setup environment', 'Download the DroneCrowd data', and 'Ground-Truth Generation' steps are finished, and the pre-trained models are downloaded. I get an error when I run python mytest.py. The error: ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 2

How can I solve it?

I am looking forward to your reply.
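This kind of ValueError is raised by NumPy when arrays with different column counts are concatenated along axis 0. Which arrays in mytest.py trigger it cannot be told from the report alone, so the sketch below only reproduces the failure on made-up arrays and shows a width check that names the mismatch; `concat_rows` and the array layouts are hypothetical.

```python
import numpy as np


def concat_rows(blocks):
    """Concatenate 2-D arrays along axis 0 after checking their widths agree."""
    arrays = [np.asarray(b) for b in blocks]
    widths = [a.shape[1] for a in arrays]
    if len(set(widths)) > 1:
        raise ValueError(f"blocks have different column counts: {widths}")
    return np.concatenate(arrays, axis=0)


three_cols = np.zeros((4, 3))  # e.g. rows of (x, y, score); hypothetical layout
two_cols = np.zeros((5, 2))    # e.g. rows of (x, y); hypothetical layout

# Reproduce the failure: sizes 3 and 2 disagree along dimension 1.
try:
    np.concatenate([three_cols, two_cols], axis=0)
    reproduced = False
except ValueError:
    reproduced = True

merged = concat_rows([three_cols, three_cols])
print(reproduced, merged.shape)
```

Printing the shapes of the arrays just before the failing call usually shows which input carries the unexpected number of columns.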

opened by jackydinosaur 8

Owner

VisDrone

The official website for the VisDrone Challenge

MOT-Tracking-by-Detection-Pipeline - For Tracking-by-Detection format MOT (Multi Object Tracking), is it a framework that separates Detection and Tracking processes?

MOT-Tracking-by-Detection-Pipeline Tracking-by-Detection format MOT (Multi Object Trac

41 Nov 23, 2022

Estimation of human density in a closed space using deep learning.

Siemens HOLLZOF challenge - Human Density Estimation Add project description here. Installing Dependencies: Install Python3 either system-wide, user-w

3 Aug 8, 2021

This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their coordinates and detected labels.

8 May 22, 2022

Codes for TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization.

TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization This is the official implementation of paper TS-CAM: Token Semant

112 Jan 2, 2023

Joint detection and tracking model named DEFT, or "Detection Embeddings for Tracking."

DEFT: Detection Embeddings for Tracking DEFT: Detection Embeddings for Tracking, Mohamed Chaabane, Peter Zhang, J. Ross Beveridge, Stephen O'Hara

253 Dec 18, 2022

Tello Drone Trajectory Tracking

With this library you can track the trajectory of your tello drone or swarm of drones in real time.

2 Oct 12, 2022

Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

15 Oct 14, 2022

Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification

STAM - Pytorch Implementation of STAM (Space Time Attention Model), yet another pure and simple SOTA attention model that bests all previous models in

109 Dec 28, 2022

Official PyTorch implementation of Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

This is the official PyTorch implementation of our paper: "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks". Our project website and video demos are here.

443 Dec 6, 2022

Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

RNN-for-Joint-NLU Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

194 Dec 28, 2022

SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

SSL_SLAM2 Lightweight 3-D Localization and Mapping for Solid-State LiDAR (Intel Realsense L515 as an example) This repo is an extension work of SSL_SL

1.3k Jan 8, 2023

Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (CVPR 2022)

CCAM (Unsupervised) Code repository for our paper "CCAM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localizati

113 Dec 27, 2022

PyTorch implementations of algorithms for density estimation

pytorch-flows A PyTorch implementations of Masked Autoregressive Flow and some other invertible transformations from Glow: Generative Flow with Invert

546 Dec 5, 2022

MADE (Masked Autoencoder Density Estimation) implementation in PyTorch

pytorch-made This code is an implementation of "Masked AutoEncoder for Density Estimation" by Germain et al., 2015. The core idea is that you can turn

498 Dec 30, 2022

This program presents convolutional kernel density estimation, a method used to detect intercritical epileptic spikes (IEDs)

Description This program presents convolutional kernel density estimation, a method used to detect intercritical epileptic spikes (IEDs) in [Gardy et

0 Feb 9, 2022

Detection of drones using their thermal signatures from thermal camera through YOLO-V3 based CNN with modifications to encapsulate drone motion

Drone Detection using Thermal Signature This repository highlights the work for night-time drone detection using an Optris PI Lightweight ther

6 Dec 31, 2022

TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios

Light-weight network, depth estimation, knowledge distillation, real-time depth estimation, auxiliary data.

STMTrack: Template-free Visual Tracking with Space-time Memory Networks