Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Ramana Subramanyam

Last update: Dec 6, 2022

Overview

Head Detector

Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection module can be installed with pip so that it plugs directly into HeadHunter-T.

Requirements

NVIDIA driver >= 418
CUDA 10.0 and a compatible cuDNN
Python packages: to install the required Python packages, run conda env create -f head_detection.yml.
Activate the resulting Anaconda environment, head_detection, with source activate head_detection or conda activate head_detection.
Alternatively, install the required packages with pip install -r requirements.txt, or update your existing environment with the aforementioned yml file.

Training

To train a model, define the environment variable NGPU, prepare a config file, and run the following command:

$ python -m torch.distributed.launch --nproc_per_node=$NGPU --use_env train.py --cfg_file config/config_chuman.yaml --world_size $NGPU --num_workers 4

Training is currently supported on (a) the ScutHead dataset, (b) CrowdHuman + ScutHead combined, and (c) our proposed CroHD dataset. The dataset is selected in the config file.
A config file must be defined before training. The config file is described in more detail in the section below.
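As a rough illustration, the launch command above can also be assembled from Python. The helper name `build_train_command` is hypothetical and not part of this repository; it only mirrors the flags shown in the command above and reads NGPU from the environment.

```python
import os
import shlex
import sys


def build_train_command(ngpu, cfg_file="config/config_chuman.yaml", num_workers=4):
    """Return the training launch command as an argument list.

    Hypothetical helper: it only repeats the flags from the README command.
    """
    if ngpu < 1:
        raise ValueError("NGPU must be at least 1")
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={ngpu}", "--use_env",
        "train.py",
        "--cfg_file", cfg_file,
        "--world_size", str(ngpu),
        "--num_workers", str(num_workers),
    ]


if __name__ == "__main__":
    # NGPU is the environment variable mentioned above; fall back to 1 if unset.
    ngpu = int(os.environ.get("NGPU", "1"))
    print(shlex.join(build_train_command(ngpu)))
```

The returned list can be passed to `subprocess.run` from the repository root once the environment is set up.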

Evaluation and Testing

Unlike training, testing and evaluation do not use a config file. Instead, all parameters are passed as command-line arguments. Refer to the respective files, evaluate.py and test.py.
evaluate.py evaluates on the validation/test set using the AP, MMR, F1, MODA and MODP metrics.
test.py runs the detector over a set of images from the test set for qualitative evaluation.
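For orientation, two of the metrics listed above can be computed from detection counts as sketched below. This is a minimal sketch using the standard definitions (F1 as the harmonic mean of precision and recall, MODA as 1 - (FN + FP) / GT with unit error weights). It is not taken from evaluate.py, and the function names are illustrative.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def moda(fp, fn, num_gt):
    """Multiple Object Detection Accuracy with unit weights for both error types."""
    if num_gt == 0:
        raise ValueError("MODA is undefined without ground-truth boxes")
    return 1.0 - (fn + fp) / num_gt


if __name__ == "__main__":
    # Illustrative counts only: 8 correct detections, 2 false positives, 2 misses.
    print(f1_score(tp=8, fp=2, fn=2))
    print(moda(fp=2, fn=2, num_gt=10))
```

Note that MODA can become negative when the number of errors exceeds the number of ground-truth boxes.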

Config file

A config file is required for all training runs. It reduces the number of arguments that must be passed on each execution. Each sub-section is described below.

DATASET
1. Set base_path to the parent directory in which the dataset is located.
2. Train and Valid are .txt files that contain the path of each image relative to the base_path defined above, followed by its ground-truth boxes in (x_min, y_min, x_max, y_max) format, one box per line. The scripts that generate these files for the three datasets are in the data directory. For example:
```
/path/to/image.png
x_min_1, y_min_1, x_max_1, y_max_1
x_min_2, y_min_2, x_max_2, y_max_2
x_min_3, y_min_3, x_max_3, y_max_3
.
.
.
```
3. mean_std are the RGB means and standard deviations of the training dataset. If they are not provided, they can be computed before training starts.
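The annotation layout shown above can be read with a few lines of Python. The following is a minimal sketch that assumes exactly the layout in the example (an image path line followed by comma-separated box lines); the function name `parse_annotations` is hypothetical, and the loader in this repository may read the files differently.

```python
def parse_annotations(lines):
    """Group box lines under the image path that precedes them.

    Returns a dict mapping each image path to a list of
    (x_min, y_min, x_max, y_max) tuples of floats.
    """
    annotations = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = [p.strip() for p in line.split(",")]
        box = None
        if len(parts) == 4:
            try:
                box = tuple(float(p) for p in parts)
            except ValueError:
                box = None
        if box is None:
            # Not four numbers, so treat the line as an image path.
            current = line
            annotations[current] = []
        elif current is None:
            raise ValueError("box line found before any image path")
        else:
            x_min, y_min, x_max, y_max = box
            if x_max <= x_min or y_max <= y_min:
                raise ValueError(f"degenerate box for {current}: {box}")
            annotations[current].append(box)
    return annotations


if __name__ == "__main__":
    # Made-up sample data in the layout described above.
    sample = ["images/a.png", "10, 20, 30, 40", "50, 60, 70, 90", "images/b.png"]
    print(parse_annotations(sample))
```

Rejecting boxes with non-positive width or height early is a design choice of this sketch; such boxes usually indicate a conversion error in the generation script.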
TRAINING
1. Provide pretrained_model and the corresponding start_epoch to resume training.
2. milestones are the epochs at which the learning rate is set to 0.1 * lr.
3. The only_backbone option loads just the ResNet backbone and not the head. It is not applicable to mobilenet.
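The effect of milestones can be sketched as follows. This assumes a step schedule in which the learning rate is multiplied by 0.1 at every milestone, which is the closed form of PyTorch's MultiStepLR with gamma=0.1; the helper name and the example values are illustrative and not taken from the config files.

```python
from bisect import bisect_right


def lr_at_epoch(base_lr, milestones, epoch, gamma=0.1):
    """Learning rate in effect at `epoch` under a step schedule."""
    # Count the milestones already reached; a milestone applies from its own epoch on.
    passed = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** passed


if __name__ == "__main__":
    # Hypothetical values: base learning rate 0.01, decay at epochs 10 and 20.
    for epoch in (0, 9, 10, 19, 20):
        print(epoch, lr_at_epoch(0.01, [10, 20], epoch))
```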
NETWORK
1. The parameters are as described in the experiments section of the paper.
2. When using median_anchors, the anchors have to be defined in anchors.py.
3. We experimented with mobilenet, resnet50 and resnet150 as alternative backbones. This experiment was not reported in the paper due to space constraints. Accuracy decreased significantly with mobilenet, while resnet50 and resnet150 yielded almost the same performance.
4. We also briefly experimented with Deformable Convolutions but again didn't see noticeable improvements in performance. The code we used is available in this repository.
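torchvision's anchor generator expects anchor sizes and aspect ratios as tuples of tuples, with one inner tuple per feature map returned by the backbone. A minimal sketch of that layout check in plain Python is given below; the function name is hypothetical, the values are illustrative, and none of it reproduces the anchors defined in anchors.py.

```python
def check_anchor_layout(sizes, aspect_ratios, num_feature_maps):
    """Validate a Tuple[Tuple[...]] anchor specification.

    Returns the number of anchors per spatial location for each feature map.
    """
    for name, spec in (("sizes", sizes), ("aspect_ratios", aspect_ratios)):
        is_nested = isinstance(spec, (tuple, list)) and all(
            isinstance(level, (tuple, list)) for level in spec
        )
        if not is_nested:
            raise ValueError(f"{name} must be a tuple of tuples, one per feature map")
        if len(spec) != num_feature_maps:
            raise ValueError(
                f"{name} has {len(spec)} entries but there are "
                f"{num_feature_maps} feature maps"
            )
    # Each level produces len(sizes) * len(aspect_ratios) anchors per location.
    return [len(s) * len(a) for s, a in zip(sizes, aspect_ratios)]


if __name__ == "__main__":
    # Illustrative values for a backbone that returns four feature maps.
    sizes = ((16,), (32,), (64,), (128,))
    aspect_ratios = ((0.5, 1.0, 2.0),) * len(sizes)
    print(check_anchor_layout(sizes, aspect_ratios, num_feature_maps=4))
```

A flat tuple such as (16, 32, 64, 128), or a specification whose length differs from the number of feature maps, is rejected by this check.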

Note :

This codebase borrows a notable portion from pytorch-vision, because some of their modules cannot be "imported" as a package.

Citation :

@InProceedings{Sundararaman_2021_CVPR,
    author    = {Sundararaman, Ramana and De Almeida Braga, Cedric and Marchand, Eric and Pettre, Julien},
    title     = {Tracking Pedestrian Heads in Dense Crowd},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3865-3875}
}

Comments

Hello, when I run the model, I get the following error

ValueError: Anchors should be Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios. There needs to be a match between the number of feature maps passed and the number of sizes / aspect ratios specified.

opened by neversettle-tech 7
How to set the anchors and choose the benchmark when running the tracker on the CroHD dataset ("use_public" is false)

When I run the tracker on the CroHD dataset (training set) with the "use_public" option in 'det_cfg' set to 'False', I get a ValueError: "Anchors should be Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios." I noticed that in obj_detect.py, if det_cfg['median_anchor'] is set, the program picks a benchmark and imports the corresponding anchors. Which benchmark should I set in det_cfg to run the tracker on CroHD? I also noticed that the benchmark is tied to anchor.py as well. Another question: if I set "use_public" in 'det_cfg' to 'True', does that mean the tracker uses the detections in the det.txt file that ships with the dataset, and the detector is not run at all?

opened by yx98314 5
how to solve fast_rcnn.py

  File "/home/project/HeadHunter-master/head_detection/models/fast_rcnn.py", line 496, in forward
    boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level)
  File "/home/project/HeadHunter-master/head_detection/models/fast_rcnn.py", line 458, in filter_proposals
    keep = keep[:self.post_nms_top_n]
TypeError: slice indices must be integers or None or have an __index__ method

opened by Freezing-hxy 5
load state dict error

I run test.py. args.pretrained_model is from this link. I meet this error: RuntimeError: Error(s) in loading state_dict for FasterRCNN: Unexpected key(s) in state_dict: "backbone.ssh1.branch1.0.weight", "backbone.ssh1.branch1.1.weight", "backbone.ssh1.branch1.1.bias", "backbone.ssh1.branch1.1.running_mean", "backbone.ssh1.branch1.1.running_var", "backbone.ssh1.branch1.1.num_batches_tracked", "backbone.ssh1.branch2a.0.weight", "backbone.ssh1.branch2a.1.weight", "backbone.ssh1.branch2a.1.bias", "backbone.ssh1.branch2a.1.running_mean", "backbone.ssh1.branch2a.1.running_var", "backbone.ssh1.branch2a.1.num_batches_tracked", "backbone.ssh1.branch2b.0.weight", "backbone.ssh1.branch2b.1.weight", "backbone.ssh1.branch2b.1.bias", "backbone.ssh1.branch2b.1.running_mean", "backbone.ssh1.branch2b.1.running_var", "backbone.ssh1.branch2b.1.num_batches_tracked", "backbone.ssh1.branch2c.0.weight", "backbone.ssh1.branch2c.1.weight", "backbone.ssh1.branch2c.1.bias", "backbone.ssh1.branch2c.1.running_mean", "backbone.ssh1.branch2c.1.running_var", "backbone.ssh1.branch2c.1.num_batches_tracked", "backbone.ssh1.ssh_1.weight", "backbone.ssh1.ssh_1.bias", "backbone.ssh1.ssh_dimred.weight", "backbone.ssh1.ssh_dimred.bias", "backbone.ssh1.ssh_2.weight", "backbone.ssh1.ssh_2.bias", "backbone.ssh1.ssh_3a.weight", "backbone.ssh1.ssh_3a.bias", "backbone.ssh1.ssh_3b.weight", "backbone.ssh1.ssh_3b.bias", "backbone.ssh1.ssh_final.0.weight", "backbone.ssh1.ssh_final.1.weight", "backbone.ssh1.ssh_final.1.bias", "backbone.ssh1.ssh_final.1.running_mean", "backbone.ssh1.ssh_final.1.running_var", "backbone.ssh1.ssh_final.1.num_batches_tracked", "backbone.ssh2.branch1.0.weight", "backbone.ssh2.branch1.1.weight", "backbone.ssh2.branch1.1.bias", "backbone.ssh2.branch1.1.running_mean", "backbone.ssh2.branch1.1.running_var", "backbone.ssh2.branch1.1.num_batches_tracked", "backbone.ssh2.branch2a.0.weight", "backbone.ssh2.branch2a.1.weight", "backbone.ssh2.branch2a.1.bias", 
"backbone.ssh2.branch2a.1.running_mean", "backbone.ssh2.branch2a.1.running_var", "backbone.ssh2.branch2a.1.num_batches_tracked", "backbone.ssh2.branch2b.0.weight", "backbone.ssh2.branch2b.1.weight", "backbone.ssh2.branch2b.1.bias", "backbone.ssh2.branch2b.1.running_mean", "backbone.ssh2.branch2b.1.running_var", "backbone.ssh2.branch2b.1.num_batches_tracked", "backbone.ssh2.branch2c.0.weight", "backbone.ssh2.branch2c.1.weight", "backbone.ssh2.branch2c.1.bias", "backbone.ssh2.branch2c.1.running_mean", "backbone.ssh2.branch2c.1.running_var", "backbone.ssh2.branch2c.1.num_batches_tracked", "backbone.ssh2.ssh_1.weight", "backbone.ssh2.ssh_1.bias", "backbone.ssh2.ssh_dimred.weight", "backbone.ssh2.ssh_dimred.bias", "backbone.ssh2.ssh_2.weight", "backbone.ssh2.ssh_2.bias", "backbone.ssh2.ssh_3a.weight", "backbone.ssh2.ssh_3a.bias", "backbone.ssh2.ssh_3b.weight", "backbone.ssh2.ssh_3b.bias", "backbone.ssh2.ssh_final.0.weight", "backbone.ssh2.ssh_final.1.weight", "backbone.ssh2.ssh_final.1.bias", "backbone.ssh2.ssh_final.1.running_mean", "backbone.ssh2.ssh_final.1.running_var", "backbone.ssh2.ssh_final.1.num_batches_tracked", "backbone.ssh3.branch1.0.weight", "backbone.ssh3.branch1.1.weight", "backbone.ssh3.branch1.1.bias", "backbone.ssh3.branch1.1.running_mean", "backbone.ssh3.branch1.1.running_var", "backbone.ssh3.branch1.1.num_batches_tracked", "backbone.ssh3.branch2a.0.weight", "backbone.ssh3.branch2a.1.weight", "backbone.ssh3.branch2a.1.bias", "backbone.ssh3.branch2a.1.running_mean", "backbone.ssh3.branch2a.1.running_var", "backbone.ssh3.branch2a.1.num_batches_tracked", "backbone.ssh3.branch2b.0.weight", "backbone.ssh3.branch2b.1.weight", "backbone.ssh3.branch2b.1.bias", "backbone.ssh3.branch2b.1.running_mean", "backbone.ssh3.branch2b.1.running_var", "backbone.ssh3.branch2b.1.num_batches_tracked", "backbone.ssh3.branch2c.0.weight", "backbone.ssh3.branch2c.1.weight", "backbone.ssh3.branch2c.1.bias", "backbone.ssh3.branch2c.1.running_mean", 
"backbone.ssh3.branch2c.1.running_var", "backbone.ssh3.branch2c.1.num_batches_tracked", "backbone.ssh3.ssh_1.weight", "backbone.ssh3.ssh_1.bias", "backbone.ssh3.ssh_dimred.weight", "backbone.ssh3.ssh_dimred.bias", "backbone.ssh3.ssh_2.weight", "backbone.ssh3.ssh_2.bias", "backbone.ssh3.ssh_3a.weight", "backbone.ssh3.ssh_3a.bias", "backbone.ssh3.ssh_3b.weight", "backbone.ssh3.ssh_3b.bias", "backbone.ssh3.ssh_final.0.weight", "backbone.ssh3.ssh_final.1.weight", "backbone.ssh3.ssh_final.1.bias", "backbone.ssh3.ssh_final.1.running_mean", "backbone.ssh3.ssh_final.1.running_var", "backbone.ssh3.ssh_final.1.num_batches_tracked", "backbone.ssh4.branch1.0.weight", "backbone.ssh4.branch1.1.weight", "backbone.ssh4.branch1.1.bias", "backbone.ssh4.branch1.1.running_mean", "backbone.ssh4.branch1.1.running_var", "backbone.ssh4.branch1.1.num_batches_tracked", "backbone.ssh4.branch2a.0.weight", "backbone.ssh4.branch2a.1.weight", "backbone.ssh4.branch2a.1.bias", "backbone.ssh4.branch2a.1.running_mean", "backbone.ssh4.branch2a.1.running_var", "backbone.ssh4.branch2a.1.num_batches_tracked", "backbone.ssh4.branch2b.0.weight", "backbone.ssh4.branch2b.1.weight", "backbone.ssh4.branch2b.1.bias", "backbone.ssh4.branch2b.1.running_mean", "backbone.ssh4.branch2b.1.running_var", "backbone.ssh4.branch2b.1.num_batches_tracked", "backbone.ssh4.branch2c.0.weight", "backbone.ssh4.branch2c.1.weight", "backbone.ssh4.branch2c.1.bias", "backbone.ssh4.branch2c.1.running_mean", "backbone.ssh4.branch2c.1.running_var", "backbone.ssh4.branch2c.1.num_batches_tracked", "backbone.ssh4.ssh_1.weight", "backbone.ssh4.ssh_1.bias", "backbone.ssh4.ssh_dimred.weight", "backbone.ssh4.ssh_dimred.bias", "backbone.ssh4.ssh_2.weight", "backbone.ssh4.ssh_2.bias", "backbone.ssh4.ssh_3a.weight", "backbone.ssh4.ssh_3a.bias", "backbone.ssh4.ssh_3b.weight", "backbone.ssh4.ssh_3b.bias", "backbone.ssh4.ssh_final.0.weight", "backbone.ssh4.ssh_final.1.weight", "backbone.ssh4.ssh_final.1.bias", 
"backbone.ssh4.ssh_final.1.running_mean", "backbone.ssh4.ssh_final.1.running_var", "backbone.ssh4.ssh_final.1.num_batches_tracked". What should I do?

opened by RichardoMrMu 3
raise ValueError("Anchors should be Tuple[Tuple[int]] because each feature "

    anchors = self.grid_anchors(grid_sizes, strides)
  File "/home/haojie/下载/ENTER/envs/head_detection2/lib/python3.8/site-packages/torchvision/models/detection/anchor_utils.py", line 103, in grid_anchors
    raise ValueError("Anchors should be Tuple[Tuple[int]] because each feature "
ValueError: Anchors should be Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios. There needs to be a match between the number of feature maps passed and the number of sizes / aspect ratios specified.
Killing subprocess 1055

opened by haojie0616 2
About GPU memory requirement.

How much GPU memory is recommended for training the model on the ScutHead dataset? I got an out-of-memory error when training on an idle GPU with 4041 MB of memory and batch size 1. Besides, there are some minor bugs in "head_detection/data/create_scuthead.py": args.dset_path is defined but args.dset_dir is used (and the same applies to args.save_path and args.out_path).

opened by NonameZ-HTY 2
Loss is nan

Thanks for providing the code to your paper. I managed to get started, even though my gpu only has 8 GB. I just added A.SmallestMaxSize(max_size=400, p=1.0), to the transforms in dataset.py.

Everything else is used as is on the ScutHead dataset (with the correct version, torch=1.6.0). Now I get this error on the very first iteration: Loss is nan, stopping training

I suppose this is because the features passed to CustomRoIHead.forward in fast_rcnn.py are filled with nan. Setting ohem=soft_nms=upscale_rpn=False, and thereby using torchvision's RoiHead, doesn't help either. Did you experience something like this during training?

opened by stvogel 1
Questions about some results in the CVPR'21 paper
Hi @Sentient07,

I'm confused about some results in your CVPR'21 paper "Tracking Pedestrian Heads in Dense Crowd" because some details are missing from the paper. I hope you can help me sort them out. Thanks!

The test set of the SCUT-HEAD dataset consists of two parts: Part-A and Part-B. The other methods compared in Table 2 report separate results on both parts in their papers. Which part did you use for the evaluation in Table 2? Or were the results in Table 2 obtained by evaluating the methods on the combination of both parts? Since I couldn't find the exact Table 2 results of the other methods in the cited papers, I'd like to know how you obtained the results for these methods.

Table 4 shows the tracking results of your proposed tracker HeadHunter-T and other state-of-the-art trackers on the test set of CroHD. However, the results of HeadHunter-T in Table 4 are quite different from the results shown on the website of the MOT benchmark. Can you tell me what's the difference between these two results?

Looking forward to your reply. Thanks!
opened by Johnqczhang 1
Is this model suitable for low-density pedestrian tracking?

Two questions: 1. Is this model suitable for low-density pedestrian tracking? 2. The ground-truth file is loaded in a different format than the one described in the README.

opened by neversettle-tech 1
Can I use this model to train on our own dataset?

Two questions: 1. Can I use this model to train on our own dataset? If so, what are the differences from the three datasets mentioned by the author? 2. The ground-truth file is loaded in a different format than the one described in the README.

opened by neversettle-tech 1
Could you share your CroHD dataset?

Hi, I am very interested in your great work, I try to reproduce your work from scratch. But I cannot find anywhere to download CroHD dataset. Could you please give me a way to download it?

opened by yoyokitartora 1
About test set results

Why are the results on the CroHD test set published in your paper different from those published on the MOT Challenge website? The MOTA reported in the paper is 63.6, compared to 57.8 on the MOT Challenge website.

opened by neversettle-tech 0
Reproduce the results
Has anyone been able to run the program properly? I followed the instructions in the repo but haven't had any luck yet. I am confused by the mismatch between the CUDA version indicated in this repo and the one actually used by the author, and by the extra packages I need to install to stop receiving errors. Below are the packages I installed in addition to those in the requirements, but I still couldn't get any output.

pip install pyyaml
pip install ipykernel
pip install albumentations==0.4.6
pip install scipy==1.1.0
pip install pycocotools
pip install munkres
pip install scikit-image==0.16.2
opened by Xinxinatg 0
Context Module introduced in PyramidBox paper

Just to give some context: I was asked to find a paper and reproduce some results from scratch (it counts for 50% of the subject), and my deadline is around June 10, 2022.

While rewriting the detection network (in order to fully understand the paper), I found the CPM part strange and would like to ask for advice.

Paper text

The paper says:

with Context Sensitive feature extractor followed by series of transpose convolutions to enhance spatial resolution of feature maps.

and

we augmented on top of each individual FPNs, a Context-sensitive Prediction Module (CPM) [63]. This contextual module consists of 4 Inception-ResNet-A blocks [62] with 128 and 256 filters for 3 × 3 convolution and 1024 filters for 1 × 1 convolution.

The reference 63 says:

We design the Context-sensitive Predict Module (CPM), see Fig. 3(b), in which we replace the convolution layers of context module in SSH by the residual-free prediction module of DSSD.

Issues

From the quotes above, I understand the CPM as an SSH with different convolution operations. But Figure 4 of the paper and your code show a channel expansion that looks like the prediction module of DSSD (a kind of simplified Inception) followed by a standard SSH.

I did not find any Inception-ResNet-A blocks.

Additionally, I did not find the transpose convolutions part.

Sorry for the inconvenience, I just want to make sure I don't miss any detail and have it done correctly as soon as possible...

opened by ignasi00 1

ValueError: Anchors should be Tuple[Tuple[int]] ... with GPU RTX 3000 series

Hello, when I try to run test.py on CroHD with the pretrained_model you provided, I run into a problem with anchors:

python test.py --test_dataset CroHD/test/HT21-11/img1 --plot_folder outputs --outfile outputs --pretrained_model FT_R50_epoch_24.pth --context cpm

Output, with the Traceback:

256
FT_R50_epoch_24.pth
0it [00:00, ?it/s]/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torch/nn/functional.py:3502: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details. 
  warnings.warn(
0it [00:02, ?it/s]
Traceback (most recent call last):
  File "test.py", line 176, in <module>
    test()
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "test.py", line 165, in test
    outputs = model(images)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torchvision/models/detection/generalized_rcnn.py", line 97, in forward
    proposals, proposal_losses = self.rpn(images, features, targets)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torchvision/models/detection/rpn.py", line 345, in forward
    anchors = self.anchor_generator(images, features)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torchvision/models/detection/anchor_utils.py", line 150, in forward
    anchors_over_all_feature_maps = self.cached_grid_anchors(grid_sizes, strides)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torchvision/models/detection/anchor_utils.py", line 139, in cached_grid_anchors
    anchors = self.grid_anchors(grid_sizes, strides)
  File "/mnt/sdb/anaconda3/envs/headhunter-TT/lib/python3.8/site-packages/torchvision/models/detection/anchor_utils.py", line 103, in grid_anchors
    raise ValueError("Anchors should be Tuple[Tuple[int]] because each feature "
ValueError: Anchors should be Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios. There needs to be a match between the number of feature maps passed and the number of sizes / aspect ratios specified.

These are the contents of my virtual environment:

name: headhunter-TT
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - ca-certificates=2021.4.13=h06a4308_1
  - certifi=2020.12.5=py38h06a4308_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - ncurses=6.2=he6710b0_1
  - openssl=1.1.1k=h27cfd23_0
  - pip=21.1.1=py38h06a4308_0
  - python=3.8.10=hdb3f193_7
  - readline=8.1=h27cfd23_0
  - setuptools=52.0.0=py38h06a4308_0
  - sqlite=3.35.4=hdfb4753_0
  - tk=8.6.10=hbc83047_0
  - wheel=0.36.2=pyhd3eb1b0_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - chardet==4.0.0
    - cycler==0.10.0
    - decorator==4.4.2
    - h5py==3.2.1
    - idna==2.10
    - imageio==2.9.0
    - kiwisolver==1.3.1
    - matplotlib==3.4.2
    - networkx==2.5.1
    - numpy==1.20.3
    - ordered-set==4.0.2
    - pillow==8.2.0
    - plyfile==0.7.4
    - pyparsing==2.4.7
    - python-dateutil==2.8.1
    - pywavelets==1.1.1
    - requests==2.25.1
    - scikit-image==0.18.1
    - scipy==1.6.3
    - six==1.16.0
    - tifffile==2021.4.8
    - torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
    - torchmeta==1.7.0
    - torchvision==0.9.1
    - tqdm==4.61.0
    - trimesh==3.9.19
    - typing-extensions==3.10.0.0
    - urllib3==1.26.5
    - munkres
    - albumentations==0.5.2
    - pyyaml

If I downgrade torch to 1.6.0 and torchvision to 0.7.0, I run into the following error:

RuntimeError: CUDA error: no kernel image is available for execution on the device

Moreover, with torch 1.6.0 I get this warning when I query the device info via torch.cuda.get_device_name(0):

NVIDIA GeForce RTX 3xxx with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.

opened by MounirB 0

