Embracing Single Stride 3D Object Detector with Sparse Transformer

Related tags

Deep Learning SST
Overview

SST: Single-stride Sparse Transformer

This is the official implementation of paper:

Embracing Single Stride 3D Object Detector with Sparse Transformer

Authors: Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang

Paper Link (Check again on Monday)

Introduction and Highlights

  • SST is a single-stride network, which maintains the original feature resolution from the beginning to the end of the network. Thanks to this single-stride design, SST achieves exciting performance on small object detection (Pedestrian, Cyclist).
  • For simplicity, except for the backbone, SST is almost the same as the basic PointPillars in MMDetection3D. Even with such a basic setting, SST achieves state-of-the-art performance on Pedestrian and Cyclist and outperforms PointPillars by more than 10 AP at a cost of only 1.5x latency.
  • SST consists of 6 Sparse Regional Attention (SRA) blocks, which operate on the sparse voxel set. SRA is similar to Submanifold Sparse Convolution (SSC), but much more powerful than SSC. Its locality and sparsity guarantee efficiency in the single-stride setting.
  • SRA can also be used in many other tasks to process sparse point clouds. Our implementation of SRA relies only on pure Python APIs in PyTorch, without the engineering effort taken in the CUDA implementation of sparse convolution (see the sketch after this list).
  • There is large room for further improvement, e.g., a second stage, an anchor-free head, IoU scores, and advanced techniques from ViT.
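
A minimal sketch of the sparse regional attention idea is given below. It is not the official implementation: it only illustrates grouping non-empty voxel features by their window (region) index and running self-attention independently inside each window, so empty space costs nothing. The names SparseRegionalAttention, feats, and win_ids are illustrative; the real code additionally batches windows of similar size (the drop_info buckets seen in the training logs), adds positional encodings, and shifts the regions between blocks, all omitted here.

    # Illustrative sketch only (not the repo's SRA implementation).
    # feats: (N, C) features of non-empty voxels; win_ids: (N,) window index per voxel.
    import torch
    import torch.nn as nn

    class SparseRegionalAttention(nn.Module):
        def __init__(self, channels=128, num_heads=8):
            super().__init__()
            self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

        def forward(self, feats, win_ids):
            out = feats.clone()
            for wid in torch.unique(win_ids):        # loop over non-empty windows only
                mask = win_ids == wid
                tokens = feats[mask].unsqueeze(0)    # (1, n_voxels_in_window, C)
                attn_out, _ = self.attn(tokens, tokens, tokens)
                out[mask] = attn_out.squeeze(0)      # write back; sparsity is unchanged
            return out

    # Toy usage: 6 voxels scattered over 2 windows.
    feats = torch.randn(6, 128)
    win_ids = torch.tensor([0, 0, 1, 1, 1, 0])
    print(SparseRegionalAttention()(feats, win_ids).shape)  # torch.Size([6, 128])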

Usage

PyTorch >= 1.9 is highly recommended for better support of the checkpoint technique.
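
The "checkpoint technique" here presumably refers to gradient checkpointing via torch.utils.checkpoint, which trades compute for memory by recomputing activations during the backward pass. A generic minimal example is shown below; it is not the exact usage inside SST, and the layer sizes are arbitrary.

    # Generic gradient-checkpointing example (torch.utils.checkpoint); not SST-specific.
    import torch
    from torch.utils.checkpoint import checkpoint

    block = torch.nn.Sequential(
        torch.nn.Linear(128, 512), torch.nn.GELU(), torch.nn.Linear(512, 128))

    x = torch.randn(1024, 128, requires_grad=True)
    y = checkpoint(block, x)   # activations inside `block` are recomputed during backward
    y.sum().backward()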

Our implementation is based on MMDetection3D, so just follow their getting_started guide and simply run the script run.sh. You will then get basic results of SST after 5~7 hours (depending on your devices).

We only provide the single-stage model here; for our two-stage models, please follow LiDAR-RCNN. It is also a good choice to apply other powerful second-stage detectors to our single-stage SST.

Main results

Single-stage Model (based on PointPillars) on Waymo validation split

Model    #Sweeps  Veh_L1  Ped_L1  Cyc_L1
SST_1f   1        73.57   80.01   70.72
SST_3f   3        75.16   83.24   75.96

Note that we train the 3 classes together, so the performance above is a little bit lower than that reported in our paper.

TODO

  • Build the SRA block with an API similar to Sparse Convolution for more convenient usage.

Acknowledgement

This project is based on the following codebases: MMDetection3D and LiDAR-RCNN.

Comments
  • KeyError: 'type'

    Hi, thanks for this fantastic work. I have met a problem, shown below.

    Traceback (most recent call last):
      File "tools/train.py", line 230, in <module>
        main()
      File "tools/train.py", line 194, in main
        datasets = [build_dataset(cfg.data.train)]
      File "/home/chen_haoye/WS_SST/SST/mmdet3d/datasets/builder.py", line 32, in build_dataset
        build_dataset(cfg['dataset'], default_args), cfg['times'])
      File "/home/chen_haoye/WS_SST/SST/mmdet3d/datasets/builder.py", line 26, in build_dataset
        elif cfg['type'] == 'ConcatDataset':
      File "/home/chen_haoye/anaconda3/envs/SST/lib/python3.7/site-packages/mmcv/utils/config.py", line 36, in __missing__
        raise KeyError(name)
    KeyError: 'type'
    

    I have printed the keys in cfg, but the cfg did have the type key. What could be wrong?

    opened by chyohoo 14
  • question about fast run

    Hi, the Waymo dataset is too large. Is it possible to run the code without downloading the Waymo dataset, e.g., with a lightweight demo? (I only want to know the implementation details of the code so that I can use it better, e.g., the input and output shape of each function.) Best,

    opened by pansanity666 11
  • Installation

    Hey! First of all, thanks for this amazing work and thanks a lot for providing open-source code for it. But I have a query about installation. I have the latest mmdetection3d running with PyTorch 1.12.1, but if I try to install using the same library versions, the library is not able to set up properly. Also, the link to GETTING STARTED of mmdetection3d has expired. Besides that, I have a custom dataloader for my dataset; one question is, if I want to use it for more than one class, for example with the nuScenes dataset, is it possible to have a config for that?

    Thanks a lot!

    opened by Dhagash4 9
  • Inf in attached conv layers

    Hi, recently I used SST to train on the nuScenes dataset. Everything worked fine in the beginning, but after several epochs I got nan in bbox_loss and dir_loss. I found that the loss is caused by inf outputs from the attached conv layer at the end of sstv1: the output from recover_bev is fp32, while the intermediate feature maps from conv2d in attached_conv are fp16, and as training goes on the output values become inf. I printed the weights of the conv, which are normal. I tried to clamp the inf values in the feature map, but inf values occur in the following layers. What could be wrong?

    opened by chyohoo 9
  • About other datasets

    Hi, have you experimented on some other outdoor datasets such as nuScenes? When I used SST to train on the nuScenes dataset, the results I got were not ideal. I just modified the hyperparameters for the voxel size and replaced the head. I would like to ask whether there is a problem. Thanks!

    opened by Zoeeeing 8
  • question about specifications of evaluation

    I had an exciting time traveling through your code over the last few weeks, and I finally got the results below.

    image

    but I have a question about the Waymo evaluation code.

    How can I check the configuration of these metrics, such as whether it uses BEV or 3D, and whether it uses @11 or @40 for mAP?

    Because in KITTI there is an option for choosing 'bev' or '3D', but I can't find it in Waymo, except for "compute_detection_metrics_main", which comes from a compiled binary.

    Summary: How can I check the configuration of the metrics in the Waymo evaluation?

    opened by seonhoon1002 7
  • Training on kitti dataset

    Hi, I tried to train SST on the KITTI dataset. According to some previous issues in this repo, I modified the config file, but it did not work. I noticed that the error happens when the dynamic voxelize layer receives a point cloud of shape [0, 4]. I printed the "empty" point cloud file name, but it was not actually empty. I am confused about this error. Could you give me some suggestions to fix it? Thank you in advance!

    2022-03-17 21:00:37,919 - mmdet - INFO - Checkpoints will be saved to /data1/cqy/code/SST/work_dirs/sst_kittiD1_1x_3class_8heads/3161933 by HardDiskBackend.
    2022-03-17 21:00:38.164014: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
    drop_info is set to {0: {'max_tokens': 30, 'drop_range': (0, 30)}, 1: {'max_tokens': 60, 'drop_range': (30, 60)}, 2: {'max_tokens': 100, 'drop_range': (60, 100000)}}, in input_layer
    /opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
    To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
      return torch.floor_divide(self, other)
    2022-03-17 21:00:54,870 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
    No voxel belongs to drop_level:1 in shift 0
    No voxel belongs to drop_level:2 in shift 1
    No voxel belongs to drop_level:1 in shift 1
    No voxel belongs to drop_level:2 in shift 0
    2022-03-17 21:01:12,848 - mmdet - INFO - Epoch [1][50/7424]     lr: 1.000e-03, eta: 1 day, 17:31:34, time: 0.671, data_time: 0.293, memory: 3440, loss_cls: 25.1039, loss_bbox: 0.8919, loss_dir: 0.2588, loss: 26.2546, grad_norm: 138.2142
    2022-03-17 21:01:31,472 - mmdet - INFO - Epoch [1][100/7424]    lr: 1.000e-03, eta: 1 day, 8:16:30, time: 0.372, data_time: 0.003, memory: 3440, loss_cls: 0.6193, loss_bbox: 0.5477, loss_dir: 0.1365, loss: 1.3034, grad_norm: 8.0083
    2022-03-17 21:01:49,711 - mmdet - INFO - Epoch [1][150/7424]    lr: 1.000e-03, eta: 1 day, 5:01:46, time: 0.365, data_time: 0.003, memory: 3440, loss_cls: 0.5878, loss_bbox: 0.5475, loss_dir: 0.1352, loss: 1.2706, grad_norm: 7.4713
    2022-03-17 21:02:08,372 - mmdet - INFO - Epoch [1][200/7424]    lr: 1.000e-03, eta: 1 day, 3:32:03, time: 0.373, data_time: 0.003, memory: 3440, loss_cls: 0.5782, loss_bbox: 0.5325, loss_dir: 0.1384, loss: 1.2492, grad_norm: 6.6802
    data/kitti/training/velodyne_reduced/006279.bin
    Traceback (most recent call last):
      File "tools/train.py", line 230, in <module>
        main()
      File "tools/train.py", line 220, in main
        train_model(
      File "/data1/cqy/code/SST/mmdet3d/apis/train.py", line 27, in train_model
        train_detector(
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmdet/apis/train.py", line 170, in train_detector
        runner.run(data_loaders, cfg.workflow)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
        self.run_iter(data_batch, train_mode=True, **kwargs)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
        outputs = self.model.train_step(data_batch, self.optimizer,
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
        output = self.module.train_step(*inputs[0], **kwargs[0])
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
        losses = self(**data)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
        return old_func(*args, **kwargs)
      File "/data1/cqy/code/SST/mmdet3d/models/detectors/base.py", line 58, in forward
        return self.forward_train(**kwargs)
      File "/data1/cqy/code/SST/mmdet3d/models/detectors/voxelnet.py", line 90, in forward_train
        x = self.extract_feat(points, img_metas)
      File "/data1/cqy/code/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 39, in extract_feat
        voxels, coors = self.voxelize(points)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
        return old_func(*args, **kwargs)
      File "/data1/cqy/code/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 62, in voxelize
        res_coors = self.voxel_layer(res)
      File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data1/cqy/code/SST/mmdet3d/ops/voxel/voxelize.py", line 113, in forward
        return voxelization(input, self.voxel_size, self.point_cloud_range,
      File "/data1/cqy/code/SST/mmdet3d/ops/voxel/voxelize.py", line 44, in forward
        dynamic_voxelize(points, coors, voxel_size, coors_range, 3)
    RuntimeError: CUDA error: invalid configuration argument
    
    opened by QYChan 7
  • Questions about the training hyperparameters

    Hi, I have some questions about the training hyperparameters.

    1. What exactly is the training batch size (samples_per_gpu) of the best model in Tables 2 & 3 of your paper? All the training configs you provided in the repo set samples_per_gpu=1; is that the same as for the best model?

    2. The ablation study in the paper uses 20% of the data for training. What about the other training hyperparameters, such as training epochs, batch size, and whether the 3 classes are trained together or separately?

    I'm doing some reproduction experiments, so I need the training hyperparameters mentioned above.

    opened by 1349949 7
  • About configuration question

    Hello, I have two questions about your code.

    1. The comments in the config files of the D5 series say: "D5 in the config name means the whole dataset is divided into 5 folds. We only use one fold for efficient experiments." So does this mean it uses 20% of the Waymo dataset?
    2. The configs use 5-dim points; what is the meaning of the 5 dims in the LiDAR data?

    Summary

    1. What is the meaning of D5 in the config files?
    2. What is the meaning of the 5-dim LiDAR points in the config files?
    opened by seonhoon1002 6
  • pretrained weights about backbone?

    1. Did you release the pre-trained weights of the backbone?
    2. I superimposed two frames of data as input on a 128-beam LiDAR. The results on 13k samples after 20 epochs are as follows. Can you give me some comments?
    opened by gp1234567 6
  • Training config for paper results

    Would you mind sharing the configs used for Tables 2 & 3? Since they are trained for 24 epochs on the full Waymo dataset, which takes a long time to run, I want to get it right when reproducing the results.

    Thank you very much for the help!

    opened by Stephanie-Shen324 6
  • WOD version of 3-frame waymo config

    Hi @Abyssaledge ,

    I noticed some differences between the configs of the 3-frame and single-frame FSD on Waymo. It seems like different versions of the WOD dataset are used (because in_channels differs). I have the following questions:

    • I have re-generated waymo_dbinfo_train.pkl using your code. Should I use the newer version of dbinfo to train the 3f model, or should I just change in_channels back to 5? Does it have a big impact on the final result?
    • What do tanh_dims and voxel_downsampling_size mean in the 3f config? I wonder why they are used in the 3f config but not in the single-frame config.
    • feat_channels changed from 64 to 32. Could you also explain the reason?

    image

    Thanks!

    opened by shawnding 3
  • Dynamic voxel in_channels size (KITTI dataset problem)

    Hi, I want to run with the KITTI dataset. I modified the point cloud range in the config, and now the input channels of DynamicVFE do not match the feature channels of the point cloud.

    Traceback (most recent call last):
      File "tools/train.py", line 230, in <module>
        main()
      File "tools/train.py", line 220, in main
        train_model(
      File "/home/tan/SST/mmdet3d/apis/train.py", line 27, in train_model
        train_detector(
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmdet/apis/train.py", line 244, in train_detector
        runner.run(data_loaders, cfg.workflow)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
        self.run_iter(data_batch, train_mode=True, **kwargs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
        outputs = self.model.train_step(data_batch, self.optimizer,
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
        return self.module.train_step(*inputs[0], **kwargs[0])
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
        losses = self(**data)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
        return old_func(*args, **kwargs)
      File "/home/tan/SST/mmdet3d/models/detectors/base.py", line 58, in forward
        return self.forward_train(**kwargs)
      File "/home/tan/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 122, in forward_train
        x = self.extract_feat(points, img_metas)
      File "/home/tan/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 43, in extract_feat
        voxel_features, feature_coors = self.voxel_encoder(voxels, coors)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
        return old_func(*args, **kwargs)
      File "/home/tan/SST/mmdet3d/models/voxel_encoders/voxel_encoder.py", line 285, in forward
        point_feats = vfe(features)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
        return old_func(*args, **kwargs)
      File "/home/tan/SST/mmdet3d/models/voxel_encoders/utils.py", line 141, in forward
        x = self.linear(inputs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
        return F.linear(input, self.weight, self.bias)
      File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/functional.py", line 1848, in linear
        return torch._C._nn.linear(input, weight, bias)
    RuntimeError: mat1 and mat2 shapes cannot be multiplied (29510x10 and 11x64)

    At line 43 of mmdet3d/models/detectors/dynamic_voxelnet.py, the voxelized point cloud feature matrix has size (M, 4). At line 141 of mmdet3d/models/voxel_encoders/utils.py, the input dimension of the linear layer is 5+3+3=11 from the config, which does not match the 4+3+3=10 above. I am not sure what needs to be modified; do the network layer parameters also need to change when the model is used on KITTI?

    opened by SH-Tan 8
  • installation and run problem

    When I run the command run.sh, it gives the following error: ImportError: /mnt/cache/wangyingjie/SST/mmdet3d/ops/ball_query/ball_query_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIfEEPT_v

    When installing FSD, I referred to this: https://github.com/tusen-ai/SST/issues/6. My environment is as follows:

    sys.platform: linux
    Python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
    CUDA available: True
    GPU 0,1,2,3,4,5,6,7: NVIDIA A100-SXM4-80GB
    CUDA_HOME: /usr/local/cuda
    NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
    GCC: gcc (GCC) 5.4.0
    PyTorch: 1.9.0+cu111
    PyTorch compiling details: PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 11.1
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
    • CuDNN 8.0.5
    • Magma 2.5.2
    • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

    TorchVision: 0.10.0+cu111
    OpenCV: 4.6.0
    MMCV: 1.3.9
    MMCV Compiler: GCC 7.3
    MMCV CUDA Compiler: 11.1
    MMDetection: 2.14.0+2028b0c
    MMSegmentation: 0.14.1
    MMDetection3D: 0.15.0

    opened by JessieW0806 20
  • Detailed Performance of FSD

    Performance of fsd_waymoD1_1x_submission.py (test set): submission
    Performance of fsd_waymoD1_1x.py (validation set):

    OBJECT_TYPE_TYPE_VEHICLE_LEVEL_1: [mAP 0.792293] [mAPH 0.787888]                                                                                              
    OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2: [mAP 0.704975] [mAPH 0.700932]
    OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_1: [mAP 0.825951] [mAPH 0.773279]
    OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_2: [mAP 0.739378] [mAPH 0.690696]
    OBJECT_TYPE_TYPE_SIGN_LEVEL_1: [mAP 0] [mAPH 0]     
    OBJECT_TYPE_TYPE_SIGN_LEVEL_2: [mAP 0] [mAPH 0]                   
    OBJECT_TYPE_TYPE_CYCLIST_LEVEL_1: [mAP 0.77096] [mAPH 0.759925]   
    OBJECT_TYPE_TYPE_CYCLIST_LEVEL_2: [mAP 0.743786] [mAPH 0.73312]    
    RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_1: [mAP 0.928933] [mAPH 0.925183] 
    RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_2: [mAP 0.916398] [mAPH 0.91269]    
    RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_1: [mAP 0.782089] [mAPH 0.777451]  
    RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_2: [mAP 0.714469] [mAPH 0.710126]
    RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_1: [mAP 0.593706] [mAPH 0.586526]
    RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_2: [mAP 0.461929] [mAPH 0.456148]
    RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_1: [mAP 0.865141] [mAPH 0.823203]
    RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_2: [mAP 0.82562] [mAPH 0.784474]
    RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_1: [mAP 0.814913] [mAPH 0.756908]
    RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_2: [mAP 0.738424] [mAPH 0.684682]
    RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_1: [mAP 0.74548] [mAPH 0.662026]
    RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_2: [mAP 0.588333] [mAPH 0.5195]
    RANGE_TYPE_SIGN_[0, 30)_LEVEL_1: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[0, 30)_LEVEL_2: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[30, 50)_LEVEL_1: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[30, 50)_LEVEL_2: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[50, +inf)_LEVEL_1: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[50, +inf)_LEVEL_2: [mAP 0] [mAPH 0]
    RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_1: [mAP 0.854262] [mAPH 0.843677]
    RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_2: [mAP 0.848135] [mAPH 0.837626]
    RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_1: [mAP 0.733669] [mAPH 0.722557]
    RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_2: [mAP 0.695306] [mAPH 0.684749]
    RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_1: [mAP 0.616862] [mAPH 0.602011]
    RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_2: [mAP 0.576674] [mAPH 0.562745]
    
    
    opened by Abyssaledge 4
  • Detailed Performance of CenterHead SST

    Validation split. Single frame; lidar-only; No ensemble; Just for reference.

    OBJECT_TYPE_TYPE_VEHICLE_LEVEL_1: [mAP 0.751252] [mAPH 0.746433]
    OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2: [mAP 0.666054] [mAPH 0.661673]
    OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_1: [mAP 0.800718] [mAPH 0.72117]
    OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_2: [mAP 0.723756] [mAPH 0.650058]
    OBJECT_TYPE_TYPE_SIGN_LEVEL_1: [mAP 0] [mAPH 0]
    OBJECT_TYPE_TYPE_SIGN_LEVEL_2: [mAP 0] [mAPH 0]
    OBJECT_TYPE_TYPE_CYCLIST_LEVEL_1: [mAP 0.714879] [mAPH 0.702029]
    OBJECT_TYPE_TYPE_CYCLIST_LEVEL_2: [mAP 0.688492] [mAPH 0.67611]
    RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_1: [mAP 0.919894] [mAPH 0.915598]
    RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_2: [mAP 0.9077] [mAPH 0.903451]
    RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_1: [mAP 0.734855] [mAPH 0.729633]
    RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_2: [mAP 0.669229] [mAPH 0.664388]
    RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_1: [mAP 0.506878] [mAPH 0.500256]
    RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_2: [mAP 0.389791] [mAPH 0.384518]
    RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_1: [mAP 0.8519] [mAPH 0.779192]
    RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_2: [mAP 0.818607] [mAPH 0.747716]
    RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_1: [mAP 0.790649] [mAPH 0.709307]
    RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_2: [mAP 0.722803] [mAPH 0.64698]
    RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_1: [mAP 0.692475] [mAPH 0.587309]
    RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_2: [mAP 0.556325] [mAPH 0.469353]
    RANGE_TYPE_SIGN_[0, 30)_LEVEL_1: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[0, 30)_LEVEL_2: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[30, 50)_LEVEL_1: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[30, 50)_LEVEL_2: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[50, +inf)_LEVEL_1: [mAP 0] [mAPH 0]
    RANGE_TYPE_SIGN_[50, +inf)_LEVEL_2: [mAP 0] [mAPH 0]
    RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_1: [mAP 0.799435] [mAPH 0.787482]
    RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_2: [mAP 0.793702] [mAPH 0.781835]
    RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_1: [mAP 0.677939] [mAPH 0.66574]
    RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_2: [mAP 0.640499] [mAPH 0.628967]
    RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_1: [mAP 0.558871] [mAPH 0.541335]
    RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_2: [mAP 0.520811] [mAPH 0.504459]
    
    opened by Abyssaledge 0
Owner
TuSimple
The Future of Trucking
TuSimple
A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

A CNN implementation using only numpy. Supports multidimensional images, stride, etc. Speed up due to heavy use of slicing and mathematical simplification..

null 2 Nov 30, 2021
Lane follower: Lane-detector (OpenCV) + Object-detector (YOLO5) + CAN-bus

Lane Follower This code is for the lane follower, including perception and control, as shown below. Environment Hardware Industrial Camera Intel-NUC(1

Siqi Fan 3 Jul 7, 2022
A whale detector design for the Kaggle whale-detector challenge!

CNN (InceptionV1) + STFT based Whale Detection Algorithm So, this repository is my PyTorch solution for the Kaggle whale-detection challenge. The obje

Tarin Ziyaee 92 Sep 28, 2021
HeartRate detector with ArduinoandPython - Use Arduino and Python create a heartrate detector.

Syllabus of Contents Syllabus of Contents Introduction Of Project Features Develop With Python code introduction Installation License Developer Contac

null 1 Jan 5, 2022
Video lie detector using xgboost - A video lie detector using OpenFace and xgboost

video_lie_detector_using_xgboost a video lie detector using OpenFace and xgboost

null 2 Jan 11, 2022
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

Joshua Ji 3 Aug 20, 2022
Differentiable Neural Computers, Sparse Access Memory and Sparse Differentiable Neural Computers, for Pytorch

Differentiable Neural Computers and family, for Pytorch Includes: Differentiable Neural Computers (DNC) Sparse Access Memory (SAM) Sparse Differentiab

ixaxaar 302 Dec 14, 2022
ViDT: An Efficient and Effective Fully Transformer-based Object Detector

ViDT: An Efficient and Effective Fully Transformer-based Object Detector by Hwanjun Song1, Deqing Sun2, Sanghyuk Chun1, Varun Jampani2, Dongyoon Han1,

NAVER AI 262 Dec 27, 2022
SSD: Single Shot MultiBox Detector pytorch implementation focusing on simplicity

SSD: Single Shot MultiBox Detector Introduction Here is my pytorch implementation of 2 models: SSD-Resnet50 and SSDLite-MobilenetV2.

Viet Nguyen 149 Jan 7, 2023
A PyTorch Implementation of Single Shot Scale-invariant Face Detector.

S³FD: Single Shot Scale-invariant Face Detector A PyTorch Implementation of Single Shot Scale-invariant Face Detector. Eval python wider_eval_pytorch.

carwin 235 Jan 7, 2023
A PyTorch Implementation of Single Shot MultiBox Detector

SSD: Single Shot MultiBox Object Detector, in PyTorch A PyTorch implementation of Single Shot MultiBox Detector from the 2016 paper by Wei Liu, Dragom

Max deGroot 4.8k Jan 7, 2023
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Phil Wang 12.6k Jan 9, 2023
Official code for paper "Demystifying Local Vision Transformer: Sparse Connectivity, Weight Sharing, and Dynamic Weight"

Demysitifing Local Vision Transformer, arxiv This is the official PyTorch implementation of our paper. We simply replace local self attention by (dyna

null 138 Dec 28, 2022
VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

VSR-Transformer By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool This paper proposes a new Transformer for video super-resolution (called VSR-Transf

Jiezhang Cao 225 Nov 13, 2022
Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

ReDet: A Rotation-equivariant Detector for Aerial Object Detection ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR2021), Jiam

csuhan 334 Dec 23, 2022
Deformable DETR is an efficient and fast-converging end-to-end object detector.

Deformable DETR: Deformable Transformers for End-to-End Object Detection.

null 2k Jan 5, 2023
Official Implementation of DDOD (Disentangle your Dense Object Detector), ACM MM2021

Disentangle Your Dense Object Detector This repo contains the supported code and configuration files to reproduce object detection results of Disentan

loveSnowBest 51 Jan 7, 2023
LiDAR R-CNN: An Efficient and Universal 3D Object Detector

LiDAR R-CNN: An Efficient and Universal 3D Object Detector Introduction This is the official code of LiDAR R-CNN: An Efficient and Universal 3D Object

TuSimple 295 Jan 5, 2023
Morphable Detector for Object Detection on Demand

Morphable Detector for Object Detection on Demand (ICCV 2021) PyTorch implementation of the paper Morphable Detector for Object Detection on Demand. I

null 9 Feb 23, 2022