3DETR: An End-to-End Transformer Model for 3D Object Detection

PyTorch implementation and models for 3DETR.

3DETR (3D DEtection TRansformer) is a simpler alternative to complex hand-crafted 3D detection pipelines. It does not rely on 3D backbones such as PointNet++ and uses few 3D-specific operators. 3DETR obtains comparable or better performance than 3D detection methods such as VoteNet. The encoder can also be used for other 3D tasks such as shape classification. More details are in the paper "An End-to-End Transformer Model for 3D Object Detection".

[website] [arXiv] [bibtex]

Code description. Our code is based on prior work such as DETR and VoteNet, and we aim for simplicity in our implementation. We hope it can ease research in 3D detection.

[Figures: the 3DETR approach, the decoder, and example detections]

Pretrained Models

We provide pretrained model weights and the corresponding metrics on the val set (per-class APs and recalls). The Python script utils/download_weights.py downloads the weights/metrics files; an example invocation follows.
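Assuming the script takes no required arguments (check utils/download_weights.py for any flags it exposes):

python utils/download_weights.py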

| Arch    | Dataset   | Epochs | AP25 | AP50 | Model weights | Eval metrics |
|---------|-----------|--------|------|------|---------------|--------------|
| 3DETR-m | SUN RGB-D | 1080   | 59.1 | 30.3 | weights       | metrics      |
| 3DETR   | SUN RGB-D | 1080   | 58.0 | 30.3 | weights       | metrics      |
| 3DETR-m | ScanNet   | 1080   | 65.0 | 47.0 | weights       | metrics      |
| 3DETR   | ScanNet   | 1080   | 62.1 | 37.9 | weights       | metrics      |
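Once downloaded, a minimal Python sketch for sanity-checking a checkpoint before evaluation (assuming it deserializes with torch.load; the exact dictionary keys are not documented here, so the print is exploratory):

import torch

# load on CPU so the sanity check needs no GPU
ckpt = torch.load("sunrgbd_masked_ep1080.pth", map_location="cpu")
print(list(ckpt.keys()))  # e.g. a model state dict plus training metadata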

Model Zoo

For convenience, we provide model weights for 3DETR trained for different numbers of epochs.

| Arch    | Dataset   | Epochs | AP25 | AP50 | Model weights | Eval metrics |
|---------|-----------|--------|------|------|---------------|--------------|
| 3DETR-m | SUN RGB-D | 90     | 51.0 | 22.0 | weights       | metrics      |
| 3DETR-m | SUN RGB-D | 180    | 55.6 | 27.5 | weights       | metrics      |
| 3DETR-m | SUN RGB-D | 360    | 58.2 | 30.6 | weights       | metrics      |
| 3DETR-m | SUN RGB-D | 720    | 58.1 | 30.4 | weights       | metrics      |
| 3DETR   | SUN RGB-D | 90     | 43.7 | 16.2 | weights       | metrics      |
| 3DETR   | SUN RGB-D | 180    | 52.1 | 25.8 | weights       | metrics      |
| 3DETR   | SUN RGB-D | 360    | 56.3 | 29.6 | weights       | metrics      |
| 3DETR   | SUN RGB-D | 720    | 56.0 | 27.8 | weights       | metrics      |
| 3DETR-m | ScanNet   | 90     | 47.1 | 19.5 | weights       | metrics      |
| 3DETR-m | ScanNet   | 180    | 58.7 | 33.6 | weights       | metrics      |
| 3DETR-m | ScanNet   | 360    | 62.4 | 37.7 | weights       | metrics      |
| 3DETR-m | ScanNet   | 720    | 63.7 | 44.5 | weights       | metrics      |
| 3DETR   | ScanNet   | 90     | 42.8 | 15.3 | weights       | metrics      |
| 3DETR   | ScanNet   | 180    | 54.5 | 28.8 | weights       | metrics      |
| 3DETR   | ScanNet   | 360    | 59.0 | 35.4 | weights       | metrics      |
| 3DETR   | ScanNet   | 720    | 61.1 | 40.2 | weights       | metrics      |

Running 3DETR

Installation

Our code is tested with PyTorch 1.4.0, CUDA 10.2 and Python 3.6. It may work with other versions.

You will need to install the pointnet2 layers by running:

cd third_party/pointnet2 && python setup.py install

You will also need the following Python dependencies (installable with either conda install or pip install; a pip one-liner is shown after the list):

matplotlib
opencv-python
plyfile
'trimesh>=2.35.39,<2.35.40'
'networkx>=2.2,<2.3'
scipy
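For pip users, the list above collapses to a single command:

pip install matplotlib opencv-python plyfile 'trimesh>=2.35.39,<2.35.40' 'networkx>=2.2,<2.3' scipy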

Some users have experienced issues using CUDA 11 or higher. Please try using CUDA 10.2 if you run into CUDA issues.

Optionally, you can install a Cythonized implementation of gIOU for faster training.

conda install cython
cd utils && python cython_compile.py build_ext --inplace

Benchmarking

Dataset preparation

We follow the VoteNet codebase for preprocessing our data. The preprocessing instructions for SUN RGB-D are [here] and for ScanNet are [here].

You can edit the dataset paths in datasets/sunrgbd.py and datasets/scannet.py, or specify them at runtime.

Testing

Once you have the datasets prepared, you can test pretrained models as follows:

python main.py --dataset_name <dataset_name> --nqueries <number of queries> --test_ckpt <path_to_checkpoint> --test_only [--enc_type masked]

We use 128 queries for the SUN RGB-D dataset and 256 queries for the ScanNet dataset. You will need to add the flag --enc_type masked when testing the 3DETR-m checkpoints. Please note that the testing process is stochastic (due to randomness in point cloud sampling and in sampling the queries), so results can vary within 1% AP25 across runs. This stochasticity in inference is also common for methods such as VoteNet.

If you have not edited the dataset paths for the files in the datasets folder, you can pass the path to the datasets using the --dataset_root_dir flag.
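For example, a full invocation for a 3DETR-m ScanNet checkpoint might look like the following (the checkpoint and dataset paths are placeholders):

python main.py --dataset_name scannet --nqueries 256 --test_ckpt <path_to_scannet_masked_checkpoint> --test_only --enc_type masked --dataset_root_dir <path_to_scannet_data>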

Training

The model can be trained by simply running main.py:

python main.py --dataset_name <dataset_name> --checkpoint_dir <path to store outputs>

To reproduce the results in the paper, we provide the arguments in the scripts folder. A variance of 1% AP25 across different training runs can be expected.

You can quickly verify your installation by training a 3DETR model for 90 epochs on ScanNet following scripts/scannet_quick.sh and comparing the result to the 90-epoch pretrained checkpoint from the Model Zoo; a sketch of such a run follows.
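As a rough sketch (the authoritative arguments live in scripts/scannet_quick.sh; the flags below only echo ones used elsewhere in this README):

python main.py --dataset_name scannet --max_epoch 90 --nqueries 256 --checkpoint_dir outputs/scannet_quick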

License

The majority of 3DETR is licensed under the Apache 2.0 license, as found in the LICENSE file. However, portions of the project are available under separate license terms: licensing information for pointnet2 is available at https://github.com/erikwijmans/Pointnet2_PyTorch/blob/master/UNLICENSE

Contributing

We welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Citation

If you find this repository useful, please consider starring us and citing:

@inproceedings{misra2021-3detr,
    title={{An End-to-End Transformer Model for 3D Object Detection}},
    author={Misra, Ishan and Girdhar, Rohit and Joulin, Armand},
    booktitle={{ICCV}},
    year={2021},
}
Comments
  • The network does not perform well on ScanNet

    Dear authors, your work is very good, but when I run your code it doesn't perform very well. Can you help me see what's wrong?

    Environment: PyTorch 1.9.0, CUDA 10.2, Python 3.6.

    Run:

        python main.py \
            --dataset_name scannet \
            --max_epoch 1080 \
            --nqueries 256 \
            --matcher_giou_cost 2 \
            --matcher_cls_cost 1 \
            --matcher_center_cost 0 \
            --matcher_objectness_cost 0 \
            --loss_giou_weight 1 \
            --loss_no_object_weight 0.25 \
            --save_separate_checkpoint_every_epoch -1 \
            --checkpoint_dir outputs/scannet_ep1080

    Result: training finished with

    Final eval numbers: mAP0.25 / mAP0.50 = 16.68 / 4.53; AR0.25 / AR0.50 = 23.55 / 9.46

    Final eval, per-class results:

    | Class          | AP @0.25 | Recall @0.25 | AP @0.50 | Recall @0.50 |
    |----------------|----------|--------------|----------|--------------|
    | cabinet        | 5.25     | 11.83        | 0.74     | 2.96         |
    | bed            | 46.70    | 53.09        | 13.76    | 22.22        |
    | chair          | 40.31    | 46.49        | 11.42    | 22.73        |
    | sofa           | 16.30    | 29.90        | 5.46     | 14.43        |
    | table          | 15.51    | 29.71        | 3.97     | 13.43        |
    | door           | 9.95     | 16.06        | 1.42     | 5.14         |
    | window         | 2.23     | 6.03         | 0.47     | 2.13         |
    | bookshelf      | 2.74     | 9.09         | 2.01     | 5.19         |
    | picture        | 0.56     | 1.80         | 0.03     | 0.45         |
    | counter        | 10.11    | 17.31        | 0.96     | 1.92         |
    | desk           | 32.43    | 50.39        | 9.44     | 24.41        |
    | curtain        | 6.47     | 8.96         | 0.50     | 1.49         |
    | refrigerator   | 0.02     | 3.51         | 0.02     | 1.75         |
    | showercurtrain | 18.62    | 25.00        | 7.14     | 7.14         |
    | toilet         | 44.26    | 48.28        | 13.95    | 20.69        |
    | sink           | 23.71    | 33.67        | 3.75     | 10.20        |
    | bathtub        | 16.13    | 16.13        | 3.23     | 6.45         |
    | garbagebin     | 8.95     | 16.60        | 3.19     | 7.55         |

    Best eval numbers: mAP0.25 / mAP0.50 = 31.15 / 7.70; AR0.25 / AR0.50 = 58.95 / 20.68

    Best eval, per-class results:

    | Class          | AP @0.25 | Recall @0.25 | AP @0.50 | Recall @0.50 |
    |----------------|----------|--------------|----------|--------------|
    | cabinet        | 12.00    | 47.58        | 0.91     | 10.75        |
    | bed            | 70.82    | 86.42        | 29.13    | 51.85        |
    | chair          | 56.46    | 76.83        | 11.63    | 30.12        |
    | sofa           | 56.72    | 89.69        | 20.43    | 45.36        |
    | table          | 29.51    | 67.14        | 5.33     | 22.29        |
    | door           | 15.84    | 36.62        | 1.59     | 8.14         |
    | window         | 8.84     | 28.01        | 1.30     | 6.38         |
    | bookshelf      | 22.21    | 67.53        | 4.18     | 28.57        |
    | picture        | 0.26     | 7.21         | 0.00     | 0.45         |
    | counter        | 31.04    | 57.69        | 0.25     | 7.69         |
    | desk           | 49.17    | 85.83        | 16.56    | 39.37        |
    | curtain        | 6.81     | 50.75        | 0.13     | 5.97         |
    | refrigerator   | 15.72    | 61.40        | 4.87     | 24.56        |
    | showercurtrain | 29.42    | 60.71        | 1.69     | 10.71        |
    | toilet         | 76.88    | 89.66        | 26.55    | 41.38        |
    | sink           | 32.54    | 50.00        | 1.95     | 9.18         |
    | bathtub        | 37.64    | 61.29        | 10.95    | 19.35        |
    | garbagebin     | 8.90     | 36.79        | 1.18     | 10.19        |

    Looking forward to your reply!

    opened by sunmaosheng755 14
  • RuntimeError: output with shape [32, 2048, 2048] doesn't match the broadcast shape [1, 32, 2048, 2048]

    Hi, thank you for sharing your great work!

    I'm running your pretrained model to reproduce the results, but I see the runtime error below. It seems to be an input shape issue. Do you have any suggestions?

    This is the command I ran:

    python main.py --dataset_name sunrgbd --nqueries 128 --test_ckpt weights/sunrgbd_masked_ep1080.pth --test_only --enc_type masked
    

    This is the error I get:

    Traceback (most recent call last):
      File "main.py", line 427, in <module>
        launch_distributed(args)
      File "main.py", line 415, in launch_distributed
        main(local_rank=0, args=args)
      File "main.py", line 388, in main
        test_model(args, model, model_no_ddp, criterion, dataset_config, dataloaders)
      File "main.py", line 316, in test_model
        curr_iter,
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
        return func(*args, **kwargs)
      File "/home/user/Desktop/3detr/engine.py", line 179, in evaluate
        outputs = model(inputs)
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/Desktop/3detr/models/model_3detr.py", line 306, in forward
        enc_xyz, enc_features, enc_inds = self.run_encoder(point_clouds)
      File "/home/user/Desktop/3detr/models/model_3detr.py", line 200, in run_encoder
        pre_enc_features, xyz=pre_enc_xyz
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/Desktop/3detr/models/transformer.py", line 190, in forward
        output = layer(output, src_mask=mask, src_key_padding_mask=src_key_padding_mask, pos=pos)
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/Desktop/3detr/models/transformer.py", line 288, in forward
        return self.forward_pre(src, src_mask, src_key_padding_mask, pos, return_attn_weights)
      File "/home/user/Desktop/3detr/models/transformer.py", line 272, in forward_pre
        key_padding_mask=src_key_padding_mask)
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/nn/modules/activation.py", line 819, in forward
        attn_mask=attn_mask)
      File "/home/user/anaconda3/envs/3detr/lib/python3.6/site-packages/torch/nn/functional.py", line 3362, in multi_head_attention_forward
        attn_output_weights += attn_mask
    RuntimeError: output with shape [32, 2048, 2048] doesn't match the broadcast shape [1, 32, 2048, 2048]
    
    opened by bhkim94 14
  • Model Training is Behind in Performance

    When training with a batch size of one and scaling all learning-rate parameters (except the weight decay) by sqrt(1/8), I notice my model's performance is well behind the checkpoints (90, 180, 360, 720 epochs, etc.), often reaching only 40% to 60% of their mAP. I also tried gradient accumulation up to an effective batch size of 8 with the included training parameters, but even when training with the scannet_quick script I still only reach an mAP of 30-35. Any tips for getting full performance with a batch size of 1 on setups with limited memory?
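    For reference, a generic gradient-accumulation pattern that emulates an effective batch size of 8 with a per-step batch size of 1 (not taken from the 3DETR codebase; the model, data, and hyperparameters are dummies):

        import torch
        import torch.nn as nn

        # dummy model and data standing in for 3DETR and its dataloader
        model = nn.Linear(16, 4)
        criterion = nn.MSELoss()
        optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
        loader = [(torch.randn(1, 16), torch.randn(1, 4)) for _ in range(32)]

        accum_steps = 8  # effective batch = accum_steps * per-step batch size
        optimizer.zero_grad()
        for i, (x, y) in enumerate(loader):
            loss = criterion(model(x), y) / accum_steps  # scale so gradients average
            loss.backward()
            if (i + 1) % accum_steps == 0:
                optimizer.step()
                optimizer.zero_grad()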

    opened by stanleyshly 10
  • Basic question about dataset

    Hi, I just wonder why you chose RGB-D data for training.

    For example, was it because point-cloud-only data such as KITTI-3D hasn't reached the expected performance, or was there no particular reason?

    Just simple curiosity, thanks.

    opened by tk1star2 6
  • _pickle.PicklingError: Can't pickle <class 'numpy.core._exceptions.UFuncTypeError'>: it's not the same object as numpy.core._exceptions.UFuncTypeError

    Sorry, I accidentally closed my question. Have you tried using Python 3.8?

        Traceback (most recent call last):
          File "/usr/local/python38/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
            obj = _ForkingPickler.dumps(obj)
          File "/usr/local/python38/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
            cls(buf, protocol).dump(obj)
        _pickle.PicklingError: Can't pickle <class 'numpy.core._exceptions.UFuncTypeError'>: it's not the same object as numpy.core._exceptions.UFuncTypeError
        ^CTraceback (most recent call last):
          File "/home/lab30201/sdb/lwb/3detr-main/main.py", line 430, in <module>
            launch_distributed(args)
          File "/home/lab30201/sdb/lwb/3detr-main/main.py", line 418, in launch_distributed
            main(local_rank=0, args=args)
          File "/home/lab30201/sdb/lwb/3detr-main/main.py", line 403, in main
            do_train(
          File "/home/lab30201/sdb/lwb/3detr-main/main.py", line 179, in do_train
            aps = train_one_epoch(
          File "/home/lab30201/sdb/lwb/3detr-main/engine.py", line 74, in train_one_epoch
            for batch_idx, batch_data_label in enumerate(dataset_loader):
          File "/usr/local/python38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
            data = self._next_data()
          File "/usr/local/python38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
            idx, data = self._get_data()
          File "/usr/local/python38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1034, in _get_data
            success, data = self._try_get_data()
          File "/usr/local/python38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 872, in _try_get_data
            data = self._data_queue.get(timeout=timeout)
          File "/usr/local/python38/lib/python3.8/multiprocessing/queues.py", line 107, in get
            if not self._poll(timeout):
          File "/usr/local/python38/lib/python3.8/multiprocessing/connection.py", line 257, in poll
            return self._poll(timeout)
          File "/usr/local/python38/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
            r = wait([self], timeout)
          File "/usr/local/python38/lib/python3.8/multiprocessing/connection.py", line 930, in wait
            ready = selector.select(timeout)
          File "/usr/local/python38/lib/python3.8/selectors.py", line 415, in select
            fd_event_list = self._selector.poll(timeout)

    opened by liudaxia96 5
  • A possible bug about enc_inds

    There are two downsampling operations when using MaskedTransformerEncoder, but the final enc_inds is confusing.

    The range of pre_enc_inds (0 to N) is different from that of the enc_inds (0 to preenc_npoints) returned by self.encoder(). The final output inds should arguably be obtained by indexing pre_enc_inds with enc_inds. I am not sure whether this is a bug.

        def run_encoder(self, point_clouds):
            xyz, features = self._break_up_pc(point_clouds)
            pre_enc_xyz, pre_enc_features, pre_enc_inds = self.pre_encoder(xyz, features)
            # xyz: batch x npoints x 3
            # features: batch x channel x npoints
            # inds: batch x npoints
    
            # nn.MultiHeadAttention in encoder expects npoints x batch x channel features
            pre_enc_features = pre_enc_features.permute(2, 0, 1)

            # xyz points are in batch x npoint x channel order
            enc_xyz, enc_features, enc_inds = self.encoder(
                pre_enc_features, xyz=pre_enc_xyz
            )
            if enc_inds is None:
                # encoder does not perform any downsampling
                enc_inds = pre_enc_inds
            return enc_xyz, enc_features, enc_inds
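
    If it is indeed a bug, a minimal sketch of the remapping described above (assuming enc_inds holds positions into the pre-encoder outputs) could replace the final branch:

        if enc_inds is None:
            # encoder does not perform any downsampling
            enc_inds = pre_enc_inds
        else:
            # hypothetical fix: map encoder-relative indices back to
            # indices into the original point cloud
            enc_inds = torch.gather(pre_enc_inds, 1, enc_inds.long())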
    
    opened by Ghostish 5
  • Fix issues with training on point colors

    Hi there,

    Just noticed this tiny issue: the "use_color" option is not passed through datasets.build_dataset(), although it is provided in the datasets and in main.py. I've fixed the related issues and added a new training script scripts/scannet_masked_ep1080_color.sh for training with point colors.

    I hope this small tweak can be helpful :)

    Best Dave

    CLA Signed 
    opened by daveredrum 4
  • Question regarding the optimizer

    Hi Ishan, thanks a lot for such amazing work. I have a question and would appreciate your insight and opinion on it. I see that in this work and a couple of your other works you used AdamW, and I notice that its parameters are quite different from the ones we typically use for Adam.

    • I was wondering why that is, and whether there is a reason behind it?

    My other question:

    • I notice that there are no established parameters for AdamW yet; across your different works I see different settings for it. I would be really thankful if you could suggest what I should take into account when choosing AdamW hyperparameters, along with any additional advice. Thanks a lot for your help :)
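    For background, Adam implements weight decay as an L2 penalty folded into the gradient, while AdamW applies the decay directly to the weights, decoupled from the adaptive update, so the same weight_decay value means different things in the two optimizers. A minimal illustration (the values are arbitrary placeholders, not the paper's settings):

        import torch

        params = [torch.nn.Parameter(torch.randn(8, 8))]
        # Adam: weight decay is added to the gradient (classic L2 penalty)
        adam = torch.optim.Adam(params, lr=1e-3, weight_decay=1e-4)
        # AdamW: decay is applied directly to the weights each step,
        # decoupled from the adaptive gradient scaling
        adamw = torch.optim.AdamW(params, lr=5e-4, weight_decay=0.1)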
    opened by seyeeet 4
  • Questions about the code in criterion.py

    Hello, I do not understand the line "assign = linear_sum_assignment(final_cost[b, :, : nactual_gt[b]])", especially the nactual_gt part. Why not "assign = linear_sum_assignment(final_cost[b, :, :])"? Thanks for your great work, and I hope to hear from you soon!

    opened by syy-whu 3
  • Hangs with ngpus > 1

    I know this is explicitly not supported, but I was wondering if you (or anyone) ran into hangs when trying to parallelize across multiple GPUs? I don't get any explicit errors; the process seems to just stop after 1000 or so steps (and then hangs).

    I've tried PyTorch 1.8 (LTS) with CUDA 11.1 and PyTorch 1.9 (also with CUDA 11.1). Our cluster uses 3090s, so we are unable to run CUDA 10.2. I'm going to try CUDA 11.3 / PyTorch 1.10 this week after our NVIDIA driver is updated.

    opened by mjlbach 3
  • Doesn't work with PyTorch 1.10

    With PyTorch 1.9 I get no errors. However, with PyTorch 1.10 I get this error:

        RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 1, 256]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

    Even though PyTorch 1.10 isn't supported, I was wondering: did any large behavior change happen between PyTorch 1.9 and 1.10? It seems odd that this error wouldn't have been raised with 1.9.

    opened by stanleyshly 3
  • How to visualize the output bbox

    Thanks for your amazing work! I would like to visualize the predicted bounding-box output, but I don't know how to do it. I would really appreciate it if anyone could offer some guidance!

    opened by rzhevcherkasy 0
  • Possible to avoid PointNet?

    Hi,

    Thanks for the wonderful work and the nice repo! So far, both the code and the paper have been a pleasure to read.

    Is it possible to avoid using PointNet2 for the initial downsampling (the PointnetSAModuleVotes module)? I have problems compiling the code for pointnet, probably because I'm on a newer CUDA version (in case anybody cares, each error pertains to a wrong number of arguments to std::tuple). Even if the performance is not exactly identical, some workaround might be useful.
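    Not a drop-in replacement for PointnetSAModuleVotes (which also groups neighborhoods and applies an MLP), but a dependency-free farthest point sampling in plain PyTorch is one possible starting point for a workaround (a sketch, untested against this repo):

        import torch

        def farthest_point_sample(xyz: torch.Tensor, npoint: int) -> torch.Tensor:
            # xyz: (B, N, 3) point coordinates -> returns (B, npoint) indices
            B, N, _ = xyz.shape
            idx = torch.zeros(B, npoint, dtype=torch.long, device=xyz.device)
            dist = torch.full((B, N), float("inf"), device=xyz.device)
            farthest = torch.randint(N, (B,), device=xyz.device)
            batch = torch.arange(B, device=xyz.device)
            for i in range(npoint):
                idx[:, i] = farthest
                centroid = xyz[batch, farthest].unsqueeze(1)  # (B, 1, 3)
                d = ((xyz - centroid) ** 2).sum(-1)           # squared distances
                dist = torch.minimum(dist, d)                 # nearest chosen point so far
                farthest = dist.argmax(-1)                    # next pick: farthest point
            return idx

        pts = torch.randn(2, 2048, 3)
        inds = farthest_point_sample(pts, 1024)  # (2, 1024)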

    opened by FabianSchuetze 0
  • Predictions with colored point cloud input

    I notice that there is a script called scannet_masked_ep1080_color.sh that uses color + point cloud as model input. However, I did not see any discussion of using color input in the paper or in this code repo. Do you intend to release checkpoints of a model that takes color input for reference? Furthermore, since SUN RGB-D also has color information, is there any plan to support SUN RGB-D with color input as well?

    opened by TonyLianLong 0
  • Question about size loss

    In the function loss_size, the gt_box_sizes are scaled by point_cloud_dims and thus lie in (0, ∞), while the pred_box_sizes are squashed into (0, 1) by a sigmoid. F.l1_loss is then used to match gt_box_sizes and pred_box_sizes. Is this the right thing to do?

    In [1]: pred_box_sizes.max()
    Out[1]: tensor(0.8369, device='cuda:0', grad_fn=<MaxBackward1>)
    
    In [2]: gt_box_sizes.max()
    Out[2]: tensor(2.5446, device='cuda:0')
    
    In [3]:
    
    opened by Sharpiless 1
  • Comment on VoxSeT statement about outdoor PCs?

    In their recent paper on a "Voxel Set Transformer", He et al. claim that 3DETR can only be applied to indoor datasets:

        "3DETR present a promising solution by computing self-attention on a reduced set of seed points, this solution is only applicable to indoor scenes, where the point clouds are relatively dense and concentrated."

    I saw issues #2, #15, and #20, where you seem to suggest 3DETR could work on outdoor data. Do you agree with the authors' claim? Do you know of any successful applications of 3DETR outside of indoor datasets?

    opened by segments-tobias 0