Official code for ROCA: Robust CAD Model Retrieval and Alignment from a Single Image (CVPR 2022)

Overview

ROCA: Robust CAD Model Retrieval and Alignment from a Single Image (CVPR 2022)

Code release of our paper ROCA. Check out our video, paper, and website!

If you find our paper or this repository helpful, please cite:

@inproceedings{gumeli2022roca,
  title={ROCA: Robust CAD Model Retrieval and Alignment from a Single Image},
  author={G{\"u}meli, Can and Dai, Angela and Nie{\ss}ner, Matthias},
  booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
  year={2022}
}

Development Environment

We use the following development environment for this project:

  • Nvidia RTX 3090 GPU
  • Intel Xeon W-1370
  • Ubuntu 20.04
  • CUDA Version 11.2
  • cudatoolkit 11.0
  • PyTorch 1.7
  • PyTorch3D 0.5 or 0.6
  • Detectron2 0.3

Installation

This code is developed using anaconda3 with Python 3.8 (download here), so we recommend a similar setup.

You can simply run the following code in the command line to create the development environment:

$ source setup.sh
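
If source setup.sh does not work in your shell, the environment can be approximated manually. The following is a sketch based on the versions listed above, not the exact recipe; the package pins in setup.sh take precedence:

$ conda create -n roca python=3.8 -y
$ conda activate roca
$ conda install -c pytorch pytorch=1.7.0 torchvision=0.8.1 cudatoolkit=11.0 -y
$ conda install -c pytorch3d pytorch3d=0.6.0 -y
$ python -m pip install detectron2==0.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html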

For visualizing some demo results or using the data preprocessing code, you need our custom rasterizer. In case the provided x86-64 Linux shared object does not work for you, you may install the rasterizer here.

Running the Demo

We provide four sample input images in the network/assets folder. The images are captured with a smartphone and then preprocessed to be compatible with the ROCA format. To run the demo, you first need to download data and configs from this Google Drive folder. The Models folder contains the pre-trained model and the config used, while the Data folder contains images and the dataset.
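
After extraction, the expected layout is roughly the following (inferred from the demo command below; the exact file list inside each folder may differ):

$MODEL_DIR/
  model_best.pth
  config.yaml
$DATA_DIR/
  Dataset/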

Assuming the contents of the Models directory are in $MODEL_DIR and the contents of the Data directory are in $DATA_DIR, you can run:

$ cd network
$ python demo.py --model_path $MODEL_DIR/model_best.pth --data_dir $DATA_DIR/Dataset --config_path $MODEL_DIR/config.yaml

You will see image overlays and CAD visualizations displayed one by one. The Open3D mesh visualization is an interactive window where you can inspect the geometries from different viewpoints; close the Open3D window to continue to the next visualization. You should see results similar to the image above.

For headless visualization, you can specify an output directory where resulting images and meshes are placed:

$ python demo.py --model_path $MODEL_DIR/model_best.pth --data_dir $DATA_DIR/Dataset --config_path $MODEL_DIR/config.yaml --output_dir $OUTPUT_DIR
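
Meshes saved during a headless run can be inspected later with a few lines of Open3D. This is a minimal sketch; the file name out/mesh.ply is a placeholder, as the actual names written by demo.py may differ:

import open3d as o3d

# Path to a mesh written by the headless demo run; the file name here is
# hypothetical, check your output directory for the actual names.
mesh = o3d.io.read_triangle_mesh('out/mesh.ply')
mesh.compute_vertex_normals()  # required for shaded rendering
o3d.visualization.draw_geometries([mesh])  # same interactive viewer as the demo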

You may use the --wild option to visualize results with "wild retrieval". Note that we omit the table category in this case due to its large size diversity.
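
For example, combining the headless command above with wild retrieval:

$ python demo.py --model_path $MODEL_DIR/model_best.pth --data_dir $DATA_DIR/Dataset --config_path $MODEL_DIR/config.yaml --output_dir $OUTPUT_DIR --wild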

Preparing Data

Downloading Processed Data (Recommended)

We provide preprocessed images and labels in this Google Drive folder. Download and extract all folders to a desired location before running the training and evaluation code.

Rendering Data

Alternatively, you can render data yourself. Our data preparation code lives in the renderer folder.

Our project depends on ShapeNet (Chang et al., '15), ScanNet (Dai et al. '16), and Scan2CAD (Avetisyan et al. '18) datasets. For ScanNet, we use ScanNet25k images which are provided as a zip file via the ScanNet download script.

Once you have the data, check the renderer/env.sh file for the locations of the different datasets. The meaning of each environment variable is described as an inline comment in env.sh.

After editing renderer/env.sh, run the data generation script:

$ cd renderer
$ sh run.sh

Please check run.sh to see how individual scripts are running for data preprocessing and feel free to customize the data pipeline!

Training and Evaluating Models

Our training code lives in the network directory. Open network/env.sh and edit the environment variables. Make sure the data directories are consistent with the locations of the folders you downloaded and extracted. If you prepared the data manually, make sure the locations in network/env.sh are consistent with the variables set in renderer/env.sh.

After you are done with network/env.sh, run the run.sh script to train a new model or evaluate an existing one, depending on the environment variables you set in env.sh:

$ cd network
$ sh run.sh

Replicating Experiments from the Main Paper

Based on the configurations in network/env.sh, you can run different ablations from the paper. The default config will run the (final) experiment. You can apply the following edits cumulatively for different experiments; a concrete example follows the list:

  1. For P+E+W+R, set RETRIEVAL_MODE=resnet_resnet+image
  2. For P+E+W, set RETRIEVAL_MODE=nearest
  3. For P+E, set NOC_WEIGHTS=0
  4. For P, set E2E=0
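
For example, the P+E ablation corresponds to applying edits 1-3 cumulatively, so the relevant variables in network/env.sh would read (a sketch, assuming plain shell assignments):

RETRIEVAL_MODE=nearest
NOC_WEIGHTS=0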

Resources

To obtain the datasets and gain further insight into our implementation, please refer to the following datasets and open-source codebases:

Datasets and Metadata

Libraries

Projects

Comments
  • A small bug

    Hi, thanks for your excellent work on single-image scene understanding. I found a small bug in line 165 of network/roca/modeling/retrieval_head/retrieval_ops.py.

     except ValueError:
         assert len(feats) == 0  # feats.numel() == 0 is invalid, since feats is a List
    
    opened by louzq16 6
  • RuntimeError: Error(s) in loading state_dict for ROCA

    Hello @cangumeli, sorry to bother you, but I am getting the following error while running demo.py with model_best.pth. Appreciate any help!

    File "/home/aston/Desktop/python/ROCA-main/network/roca/engine/predictor.py", line 44, in __init__
        model.load_state_dict(backup['model'])
      File "/home/aston/anaconda3/envs/roca/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for ROCA:
    	Unexpected key(s) in state_dict: "pixel_mean", "pixel_std", "proposal_generator.anchor_generator.cell_anchors.0", "proposal_generator.anchor_generator.cell_anchors.1", "proposal_generator.anchor_generator.cell_anchors.2", "proposal_generator.anchor_generator.cell_anchors.3", "proposal_generator.anchor_generator.cell_anchors.4". 
    
    Process finished with exit code 1
    
    opened by WenZhiKun 4
  • Ask about the positive and negative example mining :)

    Hi dear author,

    Thanks for your wonderful work and for making the source code public so quickly. I think the paper's idea of learning a joint embedding between the depth object and the CAD model is clever and easy to understand. But it seems that the paper does not explain how to find positive and negative examples for the joint embedding learning. I would appreciate it if you could explain more details about it.

    Thanks a lot : )

    opened by DoctorXK 4
  • Errors encountered during training

    I tried to train the model from scratch using both the provided dataset and the processed data, following the instructions, but I ran into the following error.

    [06/13 16:27:47 d2.data.datasets.coco]: Loaded 5436 images in COCO format from /workspace/ROCA/dataset/Data/Dataset/scan2cad_instances_val.json
    [06/13 16:27:47 d2.data.common]: Serializing 5436 elements to byte tensors and concatenating them all ...
    [06/13 16:27:47 d2.data.common]: Serialized dataset takes 11.65 MiB
    [06/13 16:27:54 d2.evaluation.evaluator]: Start inference on 5436 images
    [06/13 16:27:55 d2.evaluation.evaluator]: Inference done 11/5436. 0.0710 s / img. ETA=0:07:35
    [06/13 16:28:00 d2.evaluation.evaluator]: Inference done 72/5436. 0.0697 s / img. ETA=0:07:21
    [06/13 16:28:05 d2.evaluation.evaluator]: Inference done 134/5436. 0.0693 s / img. ETA=0:07:13
    [06/13 16:28:10 d2.evaluation.evaluator]: Inference done 199/5436. 0.0683 s / img. ETA=0:07:01
    [06/13 16:28:15 d2.evaluation.evaluator]: Inference done 263/5436. 0.0681 s / img. ETA=0:06:54
    [06/13 16:28:20 d2.evaluation.evaluator]: Inference done 326/5436. 0.0681 s / img. ETA=0:06:50
    [06/13 16:28:25 d2.evaluation.evaluator]: Inference done 390/5436. 0.0679 s / img. ETA=0:06:43
    [06/13 16:28:30 d2.evaluation.evaluator]: Inference done 446/5436. 0.0691 s / img. ETA=0:06:46
    ERROR [06/13 16:28:34 d2.engine.train_loop]: Exception during training:
    Traceback (most recent call last):
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_ops.py", line 162, in voxelize_nocs
        volumes = add_pointclouds_to_volumes(points, volumes)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/pytorch3d/ops/points_to_volumes.py", line 275, in add_pointclouds_to_volumes
        raise ValueError("'pointclouds' have to have their 'features' defined.")
    ValueError: 'pointclouds' have to have their 'features' defined.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 135, in train
        self.after_step()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 165, in after_step
        h.after_step()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/hooks.py", line 353, in after_step
        self._do_eval()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/hooks.py", line 328, in _do_eval
        results = self._func()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 366, in test_and_save_results
        self._last_eval_results = self.test(self.cfg, self.model)
      File "/workspace/ROCA/network/roca/engine/trainer.py", line 180, in test
        results = super().test(cfg, model, evaluators)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 534, in test
        results_i = inference_on_dataset(model, data_loader, evaluator)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/evaluation/evaluator.py", line 141, in inference_on_dataset
        outputs = model(inputs)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/meta_arch/meta_arch.py", line 40, in forward
        return self.inference(batched_inputs)
      File "/workspace/ROCA/network/roca/modeling/meta_arch/meta_arch.py", line 124, in inference
        results, extra_outputs = self.roi_heads(
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/roi_heads/roi_heads.py", line 132, in forward
        pred_instances, alignment_outputs = self._forward_alignment(
      File "/workspace/ROCA/network/roca/modeling/roi_heads/roi_heads.py", line 180, in _forward_alignment
        return self._forward_alignment_inference(
      File "/workspace/ROCA/network/roca/modeling/roi_heads/roi_heads.py", line 282, in _forward_alignment_inference
        predictions, extra_outputs = self.alignment_head(
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/alignment_head/alignment_head.py", line 137, in forward
        return self.forward_inference(*args, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/alignment_head/alignment_head.py", line 333, in forward_inference
        predictions, extra_outputs = self._forward_retrieval_inference(
      File "/workspace/ROCA/network/roca/modeling/alignment_head/alignment_head.py", line 803, in _forward_retrieval_inference
        cad_ids, pred_indices = self.retrieval_head(
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_head.py", line 201, in forward
        return self._embedding_lookup(
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_head.py", line 353, in _embedding_lookup
        noc_embeds = self.embed_nocs(shape_code, noc_points, pred_masks)
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_ops.py", line 165, in voxelize_nocs
        assert len(feats) == 0
    AssertionError

    [06/13 16:28:34 d2.engine.hooks]: Overall training speed: 7497 iterations in 1:56:25 (0.9318 s / it)
    [06/13 16:28:34 d2.engine.hooks]: Total training time: 2:18:02 (0:21:37 on hooks)
    [06/13 16:28:34 d2.utils.events]: eta: 18:51:20 iter: 7499 total_loss: 5.823 loss_cls: 0.3354 loss_box_reg: 0.4823 loss_image_depth: 0.3101 loss_mask: 0.3888 loss_mask_iou: 0.3305 loss_roi_depth: 0.2764 loss_mean_depth: 0.2694 loss_scale: 0.3106 loss_depth_min: 0.04275 loss_depth_max: 0.05285 loss_trans: 0.4288 loss_noc: 0.9828 loss_proc: 0.5804 loss_trans_proc: 0.3947 loss_noc_comp: 0.1564 loss_triplet: 0.2907 loss_rpn_cls: 0.03125 loss_rpn_loc: 0.01378 time: 0.9317 data_time: 0.0259 lr: 0.001 max_mem: 4627M
    Traceback (most recent call last):
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_ops.py", line 162, in voxelize_nocs
        volumes = add_pointclouds_to_volumes(points, volumes)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/pytorch3d/ops/points_to_volumes.py", line 275, in add_pointclouds_to_volumes
        raise ValueError("'pointclouds' have to have their 'features' defined.")
    ValueError: 'pointclouds' have to have their 'features' defined.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "main.py", line 175, in <module>
        main(parse_args())
      File "main.py", line 171, in main
        train_or_eval(args, cfg)
      File "main.py", line 164, in train_or_eval
        trainer.train()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 413, in train
        super().train(self.start_iter, self.max_iter)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 135, in train
        self.after_step()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 165, in after_step
        h.after_step()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/hooks.py", line 353, in after_step
        self._do_eval()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/hooks.py", line 328, in _do_eval
        results = self._func()
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 366, in test_and_save_results
        self._last_eval_results = self.test(self.cfg, self.model)
      File "/workspace/ROCA/network/roca/engine/trainer.py", line 180, in test
        results = super().test(cfg, model, evaluators)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 534, in test
        results_i = inference_on_dataset(model, data_loader, evaluator)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/detectron2/evaluation/evaluator.py", line 141, in inference_on_dataset
        outputs = model(inputs)
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/meta_arch/meta_arch.py", line 40, in forward
        return self.inference(batched_inputs)
      File "/workspace/ROCA/network/roca/modeling/meta_arch/meta_arch.py", line 124, in inference
        results, extra_outputs = self.roi_heads(
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/roi_heads/roi_heads.py", line 132, in forward
        pred_instances, alignment_outputs = self._forward_alignment(
      File "/workspace/ROCA/network/roca/modeling/roi_heads/roi_heads.py", line 180, in _forward_alignment
        return self._forward_alignment_inference(
      File "/workspace/ROCA/network/roca/modeling/roi_heads/roi_heads.py", line 282, in _forward_alignment_inference
        predictions, extra_outputs = self.alignment_head(
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/alignment_head/alignment_head.py", line 137, in forward
        return self.forward_inference(*args, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/alignment_head/alignment_head.py", line 333, in forward_inference
        predictions, extra_outputs = self._forward_retrieval_inference(
      File "/workspace/ROCA/network/roca/modeling/alignment_head/alignment_head.py", line 803, in _forward_retrieval_inference
        cad_ids, pred_indices = self.retrieval_head(
      File "/root/anaconda3/envs/pytorch3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_head.py", line 201, in forward
        return self._embedding_lookup(
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_head.py", line 353, in _embedding_lookup
        noc_embeds = self.embed_nocs(shape_code, noc_points, pred_masks)
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_head.py", line 226, in embed_nocs
        noc_points = voxelize_nocs(grid_to_point_list(noc_points, mask))
      File "/workspace/ROCA/network/roca/modeling/retrieval_head/retrieval_ops.py", line 165, in voxelize_nocs
        assert len(feats) == 0
    AssertionError

    I am using PyTorch3D 0.6.2 and PyTorch 1.7.0.

    opened by WenM1222 2
  • I can't import something

    In predictor.py, I can't import the following:

     from roca.config import roca_config
     from roca.data import CADCatalog
     from roca.data.constants import CAD_TAXONOMY, COLOR_BY_CLASS
     from roca.data.datasets import register_scan2cad
     from roca.structures import Intrinsics
     from roca.utils.alignment_errors import translation_diff
     from roca.utils.linalg import make_M_from_tqs

    The errors say: Unresolved reference "roca". It's a basic issue, but I can't solve it.

    opened by noviceswing 2
  • unable to load materials from model_normalized.mtl

     skipping vacant point sample ('02933112', '37e5fcf70007bc26788f926f4d51e733')...
     WARNING - 2022-05-30 21:59:28,861 - obj - unable to load materials from: model_normalized.mtl
     WARNING - 2022-05-30 21:59:28,868 - obj - specified material (material_52_24) not loaded!

    opened by noviceswing 1
  • A minor error in README

    Hi! I tried the demo by following the instructions in the README and it works successfully, except that the command should be

    $ python demo.py --model_path $MODEL_DIR/model_best.pth --data_dir $DATA_DIR/Dataset --config_path $MODEL_DIR/config.yaml
    

    where --model_dir is changed to --model_path and --config_dir is changed to --config_path.

    Please correct me if I am wrong. Thanks!

    opened by C-H-Chien 1
  • Error while training the model

    Hello @cangumeli, sorry to bother you again, but I am getting the following error while training the ROCA model. Appreciate any help.

    [09/07 15:47:04 d2.evaluation.evaluator]: Inference done 5388/5436. 0.0699 s / img. ETA=0:00:03
    [09/07 15:47:07 d2.evaluation.evaluator]: Total inference time: 0:07:07.043241 (0.078631 s / img per device, on 1 devices)
    [09/07 15:47:07 d2.evaluation.evaluator]: Total inference pure compute time: 0:06:19 (0.069837 s / img per device, on 1 devices)
    
    Starting per-frame evaluation
    Frame: 0/5436
    Frame: 500/5436
    Frame: 1000/5436
    Frame: 1500/5436
    Frame: 2000/5436
    Frame: 2500/5436
    Frame: 3000/5436
    Frame: 3500/5436
    Frame: 4000/5436
    Frame: 4500/5436
    Frame: 5000/5436
    Traceback (most recent call last):
      File "C:\Users\Anaconda3\envs\roca\lib\contextlib.py", line 131, in __exit__
        self.gen.throw(type, value, traceback)
      File "D:\research\code\roca\network\roca\engine\trainer.py", line 207, in cad_context
        yield
      File "D:\research\code\roca\network\roca\engine\trainer.py", line 180, in test
        results = super().test(cfg, model, evaluators)
      File "d:\research\code\detectron2-0.3\detectron2\engine\defaults.py", line 534, in test
        results_i = inference_on_dataset(model, data_loader, evaluator)
      File "d:\research\code\detectron2-0.3\detectron2\evaluation\evaluator.py", line 176, in inference_on_dataset
        results = evaluator.evaluate()
      File "d:\research\code\detectron2-0.3\detectron2\evaluation\evaluator.py", line 91, in evaluate
        result = evaluator.evaluate()
      File "D:\research\code\roca\network\roca\evaluation\per_frame_evaluation.py", line 81, in evaluate
        compute_ap(scores, labels, npos).item() * 100,
    AttributeError: 'float' object has no attribute 'item'
    
    
    opened by supriya-gdptl 1
  • Error while trying to run demo.py

    I followed all your installation steps and the setup finished successfully, but when I try to run your demo.py example I get the error: "anaconda3/envs/roca/lib/python3.8/site-packages/torch/lib/../../../../libcublas.so.11: undefined symbol: free_gemm_select, version libcublasLt.so.11". According to Google, people who had this error suggested changing the version of PyTorch or cudatoolkit, but when I change it, the rest of the code fails with other errors. What could be causing this error, and how can I solve it?

    opened by PeterARVR 1
  • Question about demo

    Hello @cangumeli ,

    Thank you for sharing the code.

    I have a question about the code in demo.py. On line 25, why do you use scene names from the ScanNet dataset?

    I want to try the demo code on images taken with a phone camera. Could you please tell me what steps I need to follow for preprocessing? What scene names should I write on line 25 of demo.py to make it work on such not-in-dataset images?

    Thank you, Supriya

    opened by supriya-gdptl 2
  • First issue :) Trying to run it on new images

    Hi,

    Thanks for this great work, it looks very promising and exciting! I am doing some tests. I did not have major issues with installation and the demo runs. Congrats!

    I want to adapt the code to get a simple CLI that takes an input image and its intrinsics. It would be great if such a demo were part of the codebase, IMHO.

    To this end, could you kindly explain what the "scene" argument is here, in demo.py?

       for name, scene in zip(
            ('3m', 'sofa', 'lab', 'desk'),
            ('scene0474_02', 'scene0207_00', 'scene0378_02', 'scene0474_02')
        ):
    

    Thanks Thibault

    opened by ThibaultGROUEIX 2