PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

tao han

Last update: Nov 10, 2022

Overview

IIM - Crowd Localization

This repo is the official implementation of the paper "Learning Independent Instance Maps for Crowd Localization". The code is developed based on C3F.

Progress

Testing Code (2020.12.10)
Training Code
- NWPU (2020.12.14)
- JHU (2021.01.05)
- UCF-QNRF (2020.12.30)
- ShanghaiTech Part A/B (2020.12.29)
- FDST (2020.12.30)
Scale information for UCF-QNRF and ShanghaiTech Part A/B (2021.01.07)

Getting Started

Preparation

Prerequisites
- Python 3.7
- PyTorch 1.6: http://pytorch.org
- Other libs are listed in requirements.txt; run pip install -r requirements.txt.
Code
- Clone this repo in the directory (Root/IIM):
- Download the pre-trained HR models from this link. More details are available at HRNet-Semantic-Segmentation and HRNet-Image-Classification.
Datasets
- Download NWPU-Crowd dataset from this link.
- Unzip the *zip files in turn and place images_part* into the same folder (Root/ProcessedData/NWPU/images).
- Download the processed labels and the val gt file from this link. Place them into Root/ProcessedData/NWPU/masks and Root/ProcessedData/NWPU, respectively.
- If you want to reproduce the results on the ShanghaiTech Part A/B, UCF-QNRF, and JHU datasets, follow the instructions in DATA.md to set up the datasets.
- Finally, the folder tree is below:

   -- ProcessedData
   	|-- NWPU
   		|-- images
   		|   |-- 0001.jpg
   		|   |-- 0002.jpg
   		|   |-- ...
   		|   |-- 5109.jpg
   		|-- masks
   		|   |-- 0001.png
   		|   |-- 0002.png
   		|   |-- ...
   		|   |-- 3609.png
   		|-- train.txt
   		|-- val.txt
   		|-- test.txt
   		|-- val_gt_loc.txt
   -- PretrainedModels
     |-- hrnetv2_w48_imagenet_pretrained.pth
   -- IIM
     |-- datasets
     |-- misc
     |-- ...
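
Before training, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repo: the helper name find_missing and the list EXPECTED_PATHS are hypothetical, and the paths are simply copied from the tree. Run it from the root directory that contains ProcessedData, PretrainedModels, and IIM.

```python
from pathlib import Path

# Paths copied from the folder tree above, relative to the root directory.
# This list is an illustration; extend it for the other datasets if needed.
EXPECTED_PATHS = [
    "ProcessedData/NWPU/images",
    "ProcessedData/NWPU/masks",
    "ProcessedData/NWPU/train.txt",
    "ProcessedData/NWPU/val.txt",
    "ProcessedData/NWPU/test.txt",
    "ProcessedData/NWPU/val_gt_loc.txt",
    "PretrainedModels/hrnetv2_w48_imagenet_pretrained.pth",
    "IIM/datasets",
    "IIM/misc",
]


def find_missing(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing entries:")
        for p in missing:
            print("  " + p)
    else:
        print("Folder layout matches the expected tree.")
```

The check only tests that each path exists; it does not verify image counts or file contents.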

Training

Run python train.py.
Run tensorboard --logdir=exp --port=6006.
The validation records are shown as follows:
The sub-images are the input image, GT, prediction map, localization result, and pixel-level threshold, respectively:

Tips: The training process takes ~50 hours on the NWPU dataset with two TITAN RTX GPUs (48 GB memory).

Testing and Submitting

Modify some key parameters in test.py:
- netName.
- model_path.
Run python test.py. The output file (*_*_test.txt) will then be generated, which can be directly submitted to CrowdBenchmark.

Visualization on the val set

Modify some key parameters in test.py:
- test_list = 'val.txt'
- netName.
- model_path.
Run python test.py. Then the output file (*_*_val.txt) will be generated.
Modify some key parameters in vis4val.py:
- pred_file.
Run python vis4val.py.

Performance

The results (F1, Pre., Rec. under the sigma_l) and pre-trained models on NWPU val set, UCF-QNRF, SHT A, SHT B, and FDST:

Method	NWPU val	UCF-QNRF	SHT A
Paper: VGG+FPN [2,3]	77.0/80.2/74.1	68.8/78.2/61.5	72.5/72.6/72.5
This Repo's Reproduction: VGG+FPN [2,3]	77.1/82.5/72.3	67.8/75.7/61.5	71.6/75.9/67.8
Paper: HRNet [1]	80.2/84.1/76.6	72.0/79.3/65.9	73.9/79.8/68.7
This Repo's Reproduction: HRNet [1]	79.8/83.4/76.5	72.0/78.7/66.4	76.1/79.1/73.3

Method	SHT B	FDST	JHU
Paper: VGG+FPN [2,3]	80.2/84.9/76.0	93.1/92.7/93.5	-
This Repo's Reproduction: VGG+FPN [2,3]	81.7/88.5/75.9	93.9/94.7/93.1	61.8/73.2/53.5
Paper: HRNet [1]	86.2/90.7/82.1	95.5/95.3/95.8	62.5/74.0/54.2
This Repo's Reproduction: HRNet [1]	86.0/91.5/81.0	95.7/96.9/94.4	64.0/73.3/56.8
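
Each table cell lists F1/Pre./Rec. As a quick sanity check when reading the numbers, F1 can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch with a hypothetical helper name, f1_score. It is not the repo's evaluation code, which also performs the point matching under sigma_l.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Pre./Rec. copied from the "Paper: HRNet" row, NWPU val column (84.1, 76.6).
print(round(f1_score(84.1, 76.6), 1))  # 80.2
```

Because the tabulated precision and recall are already rounded, a recomputed F1 can differ from the listed value in the last digit for some rows.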

References

[1] Deep High-Resolution Representation Learning for Visual Recognition, T-PAMI, 2019.
[2] Very Deep Convolutional Networks for Large-scale Image Recognition, arXiv, 2014.
[3] Feature Pyramid Networks for Object Detection, CVPR, 2017.

For the leaderboard on the test set, please visit Crowd benchmark. Our submissions are IIM (HRNet) and IIM (VGG16).

Video Demo

We test the HR Net model pretrained on the NWPU dataset in a real-world subway scene. Please visit bilibili or YouTube to watch the video demonstration.

Citation

If you find this project useful for your research, please cite:

@article{gao2020learning,
  title={Learning Independent Instance Maps for Crowd Localization},
  author={Gao, Junyu and Han, Tao and Yuan, Yuan and Wang, Qi},
  journal={arXiv preprint arXiv:2012.04164},
  year={2020}
}

Our code borrows a lot from the C^3 Framework, and you may cite:

@article{gao2019c,
  title={C$^3$ Framework: An Open-source PyTorch Code for Crowd Counting},
  author={Gao, Junyu and Lin, Wei and Zhao, Bin and Wang, Dong and Gao, Chenyu and Wen, Jun},
  journal={arXiv preprint arXiv:1907.02724},
  year={2019}
}

If you use pre-trained models in this repo (HR Net, VGG, and FPN), please cite them.

Comments

Cannot load pretrained models
Hi, thank you for sharing your code! This model seems very interesting and promising. I was trying to test your model on a video, but unfortunately I was not able to load your pre-trained models. I tried both the HR and VGG models, but it always breaks on load_state_dict(). Do you know why?

    netName = 'HR_Net'
    GPU_ID = '0'
    torch.backends.cudnn.benchmark = True
    os.environ["CUDA_VISIBLE_DEVICES"] = GPU_ID
    model_path = './saved_model/NWPU-HR-ep_241_F1_0.802_Pre_0.841_Rec_0.766_mae_55.6_mse_330.9.pth'
    net = Crowd_locator(netName,GPU_ID,pretrained=True)
    net.load_state_dict(torch.load(model_path))
    net.eval()

    File "/home/walter/anaconda3/envs/crowd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
      self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for Crowd_locator:
      Missing key(s) in state_dict: "Extractor.conv1.weight", "Extractor.bn1.weight", "Extractor.bn1.bias", "Extractor.bn1.running_mean", "Extractor.bn1.running_var", "Extractor.conv2.weight", "Extractor.bn2.weight", "Extractor.bn2.bias", "Extractor.bn2.running_mean", "Extractor.bn2.running_var", "Extractor.layer1.0.conv1.weight", "Extractor.layer1.0.bn1.weight", "Extractor.layer1.0.bn1.bias", "Extractor.layer1.0.bn1.running_mean", "Extractor.layer1.0.bn1.running_var", "Extractor.layer1.0.conv2.weight", "Extractor.layer1.0.bn2.weight", "Extractor.layer1.0.bn2.bias", "Extractor.layer1.0.bn2.running_mean", [...]
opened by waltermaffy 3
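
A "Missing key(s)" error like the one above means the key names in the checkpoint differ from the names the model expects. One common cause, which is an assumption here and is not confirmed in this thread, is that the checkpoint was saved from a model wrapped in nn.DataParallel, which inserts a "module" component into each key. The sketch below works on plain dictionaries so the key names can be inspected before calling load_state_dict(). The helper names diff_keys and strip_module_segments are hypothetical, and the checkpoint keys are made up for illustration.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, similar to load_state_dict's report."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


def strip_module_segments(state_dict):
    """Drop every 'module' path component from the key names."""
    cleaned = {}
    for key, value in state_dict.items():
        parts = [p for p in key.split(".") if p != "module"]
        cleaned[".".join(parts)] = value
    return cleaned


# Hypothetical keys; real ones come from torch.load(model_path).keys()
# and net.state_dict().keys().
checkpoint = {"Extractor.module.conv1.weight": 0, "Extractor.module.bn1.weight": 1}
model_keys = ["Extractor.conv1.weight", "Extractor.bn1.weight"]

print(diff_keys(model_keys, checkpoint))
print(diff_keys(model_keys, strip_module_segments(checkpoint)))  # ([], [])
```

If the two key sets differ in some other way, the first print shows exactly which names are missing and which are unexpected. Note that stripping would be wrong for a model that has a genuine submodule named "module".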
The weights of the Threshold Encoder become NaN during training

Hello, your work is interesting and inspiring. I'm trying to re-implement your paper using Keras, but the weights of the threshold encoder become NaN after some epochs. Has this happened to you, or could you give some suggestions?

thank you~

opened by s1702319 2
Have you compared the results with other semantic segmentation methods?

It is interesting to treat crowd localization as a segmentation task, impressive!

I wonder whether you have compared your method with other well-known segmentation methods. It seems that a common segmentation network could also be trained with the masks.

Also, in Table 3, it seems that the lower the fixed threshold, the better the performance. Have you tried thresholds lower than 0.5? As far as I can tell, if the center is the only thing needed, a lower threshold should perform better.

By the way, your IBM/PBM module also looks suitable for other segmentation tasks. Have you tested it on other datasets such as COCO?

Best regards.

opened by streamer-AP 2
RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Hi guys,

How can I fix this issue?

RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

opened by vtmjapandev 1
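Out-of-memory error during training even though enough memory is available

Hello, first of all thank you for sharing your dense object localization algorithm. I recently ran into a problem while trying to run it. I am running the training on Windows 10. After the data and models were prepared, training fails as follows: whichever model I run, as soon as the forward pass starts, the error is always raised at the first conv:

    VGG16_FPN.py:
        def forward(self, x):
            f = []
            x = self.layer1(x)
    seg_hrnet.py:
        def forward(self, x):
            residual = x
            out = self.conv1(x)

It always reports insufficient memory:

    RuntimeError: CUDA out of memory. Tried to allocate 4.50 GiB (GPU 0; 12.00 GiB total capacity; 886.66 MiB already allocated; 5.14 GiB free; 4.94 GiB reserved in total by PyTorch)

It wants to allocate 4.5 GiB (Tried to allocate 4.50 GiB), yet apart from what PyTorch occupies, my machine still has about 5 GiB (5.14 GiB free; 4.94 GiB reserved in total by PyTorch). I do not know what is going wrong. Which platform was your original code trained on, Ubuntu? Or do you have any ideas for solving this problem? Looking forward to your reply, thank you very much!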

opened by montensorrt 0
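
The error in the issue above reports a single 4.50 GiB allocation at the first convolution. As a rough aid for reasoning about such messages, the size of one dense activation tensor can be estimated from its shape. This is only a sketch: the helper name tensor_gib is hypothetical, and the example shape is made up to show how a single float32 feature map can reach that size. It is not taken from the repo's configuration.

```python
def tensor_gib(shape, bytes_per_element=4):
    """Size in GiB of a dense tensor; 4 bytes per element corresponds to float32."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element / 2**30


# Hypothetical (batch, channels, height, width) shape for illustration only.
print(tensor_gib((1, 64, 4096, 4608)))  # 4.5
```

Since the size grows linearly with batch size and with the number of pixels, reducing either one shrinks the requested allocation proportionally.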

How to configure training on NVIDIA GeForce RTX 3090 Ti/PCIe/SSE2

Hi guys,

I want to train this model on an NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2 with 16 GB RAM. How should I configure it? When I ran the default settings, the error below occurred.

    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

opened by vtmjapandev 6

The threshold becomes NaN during training
Hi Tao Han: I am a graduate student at SEU, trying to replace the backbone of IIM (VGG16_FPN or HRNet) with my Transformer crowd counting model. However, even when I lower the initial lr (1e-6 to 1e-7) on SHA, the threshold still becomes NaN around epoch 700. Also, the best MAE is only 126, which is far from what my model achieves when combined with other losses (more than MSE) on SHA. I noticed that in https://github.com/taohan10200/IIM/issues/7#issuecomment-766274210 you mentioned that we could also lower the initial threshold. I would like to confirm whether this refers to the initial weight of 0.5 in the Binarized module. But even when I change the initial weight to 0.4, t_max still starts at 0.54. I am confused by the Binarized module. Looking forward to your reply.

    class BinarizedModule(nn.Module):
        def __init__(self, input_channels=720):
            super(BinarizedModule, self).__init__()
            self.Threshold_Module = nn.Sequential(
                nn.Conv2d(input_channels, 256, kernel_size=3, stride=1, padding=1, bias=False),
                adding=0, bias=False),
                nn.AvgPool2d(15, stride=1, padding=7),
            )
            self.sig = compressedSigmoid()
            #Change the threshold org to 0.4
            self.weight = nn.Parameter(torch.Tensor(1).fill_(0.4),requires_grad=True)
            self.bias = nn.Parameter(torch.Tensor(1).fill_(0), requires_grad=True)
opened by knightyxp 2
How to visualize test set images of other datasets?

Thanks for the author's impressive work!

Question 1: How can I visualize test set images for SHHA, SHHB, UCF-QNRF, and FDST?

Question 2: The link (DATA.md) is broken, could you please update the link, especially SHHA and SHHB, thanks.

opened by wzzhai 6
Great work, but the Threshold Encoder module seems useless. See the picture for details

Thanks for the author's work. Great work! 1: So much effort goes into training the Threshold Encoder, whose aim is to binarize the pred_map (the output of the network) into 1 or 0, but there is little difference between the pred_map processed by the Threshold Encoder and the pred_map without it. See the picture for details.

opened by lab-gpu 3
RuntimeError: Could not export Python function call 'BinarizedF' when saving the model as a .pt file

Hello, when I try to save the model as a .pt file with torch.jit.trace:

    traced_script_module = torch.jit.trace(net,torch.rand(1, 3, 224, 224))
    traced_script_module.save("modelIIMX.pt")

the following error is reported:

    Traceback (most recent call last):
      File "gpt.py", line 28, in <module>
        torch.jit.save(model,"modelIIMX.pt")
      File "/home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/jit/_serialization.py", line 81, in save
        m.save(f, _extra_files=_extra_files)
      File "/home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/jit/_script.py", line 490, in save
        return self._c.save(str(f), **kwargs)
    RuntimeError: Could not export Python function call 'BinarizedF'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to constants:
    /home/wjq/IIM/IIM-main/model/PBM.py(77): forward
    /home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/nn/modules/module.py(860): _slow_forward
    /home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/nn/modules/module.py(887): _call_impl
    /home/wjq/IIM/IIM-main/model/locator.py(39): forward
    /home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/nn/modules/module.py(860): _slow_forward
    /home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/nn/modules/module.py(887): _call_impl
    /home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/jit/_trace.py(940): trace_module
    /home/wjq/anaconda3/envs/IIM/lib/python3.7/site-packages/torch/jit/_trace.py(742): trace
    gpt.py(27): <module>

I have tried several approaches but have not been able to solve it. Could you please take a look when you have time? Thank you.

opened by wangle-wang 1

