Embracing Single Stride 3D Object Detector with Sparse Transformer

TuSimple

Last update: Dec 28, 2022

Overview

SST: Single-stride Sparse Transformer

This is the official implementation of the paper:

Embracing Single Stride 3D Object Detector with Sparse Transformer

Authors: Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang

Paper Link (Check again on Monday)

Introduction and Highlights

SST is a single-stride network that maintains the original feature resolution from the beginning to the end of the network. Thanks to this single-stride design, SST performs particularly well on small objects (Pedestrian, Cyclist).
For simplicity, SST is almost identical to the basic PointPillars in MMDetection3D except for the backbone. Even with such a basic setting, SST achieves state-of-the-art performance on Pedestrian and Cyclist and outperforms PointPillars by more than 10 AP at a cost of only 1.5x latency.
SST consists of 6 Sparse Regional Attention (SRA) blocks, which operate on the sparse voxel set. SRA is similar to Submanifold Sparse Convolution (SSC) but much more powerful. Its locality and sparsity guarantee efficiency in the single-stride setting.
SRA can also be used in many other tasks that process sparse point clouds. Our implementation of SRA relies only on the pure Python APIs in PyTorch, without the engineering effort required by the CUDA implementation of sparse convolution.
There is large room for further improvement, for example a second stage, an anchor-free head, IoU scores, and advanced techniques from ViT.
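
The regional grouping behind SRA can be illustrated with a small sketch. This is not the repository's implementation: the function name, the 4x4 window size, and the half-window shift (borrowed from shifted-window attention) are assumptions made for illustration. It only shows how non-empty voxels could be bucketed into regions so that attention is computed among the voxels of one region, while empty voxels cost nothing.

```python
from collections import defaultdict


def group_voxels_by_region(coords, window=(4, 4), shift=False):
    """Map each non-empty voxel (x, y) to the region (window) that contains it.

    coords: list of integer (x, y) indices of non-empty voxels.
    window: region size in voxels (hypothetical value, not the paper's setting).
    shift:  offset the grid by half a window (assumed shifted-window scheme).
    """
    off_x = window[0] // 2 if shift else 0
    off_y = window[1] // 2 if shift else 0
    regions = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        key = ((x + off_x) // window[0], (y + off_y) // window[1])
        regions[key].append(idx)
    return dict(regions)


# Five non-empty voxels on a BEV grid; empty voxels are simply absent.
voxels = [(0, 0), (1, 3), (5, 1), (6, 2), (9, 9)]
plain = group_voxels_by_region(voxels)
shifted = group_voxels_by_region(voxels, shift=True)
```

Each value in the returned dict is a list of voxel indices that would attend to one another. Because the spatial resolution never changes, the same grouping can be reused by every block.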

Usage

PyTorch >= 1.9 is highly recommended for better support of the checkpoint technique.

Our implementation is based on MMDetection3D, so just follow their getting_started guide and run the script run.sh. You will then get basic SST results after 5~7 hours (depending on your devices).

We only provide the single-stage model here. For our two-stage models, please follow LiDAR-RCNN. Applying other powerful second-stage detectors to our single-stage SST is also a good choice.

Main results

Single-stage Model (based on PointPillars) on Waymo validation split

Model	#Sweeps	Veh_L1	Ped_L1	Cyc_L1
SST_1f	1	73.57	80.01	70.72
SST_3f	3	75.16	83.24	75.96

Note that we train the 3 classes together, so the performance above is a little bit lower than that reported in our paper.

TODO

Build an SRA block with an API similar to Sparse Convolution for more convenient usage.

Acknowledgement

This project is based on the following codebases.

Comments

KeyError: 'type'

Hi, thanks for this fantastic work. I ran into the problem shown below.

Traceback (most recent call last):
  File "tools/train.py", line 230, in <module>
    main()
  File "tools/train.py", line 194, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/home/chen_haoye/WS_SST/SST/mmdet3d/datasets/builder.py", line 32, in build_dataset
    build_dataset(cfg['dataset'], default_args), cfg['times'])
  File "/home/chen_haoye/WS_SST/SST/mmdet3d/datasets/builder.py", line 26, in build_dataset
    elif cfg['type'] == 'ConcatDataset':
  File "/home/chen_haoye/anaconda3/envs/SST/lib/python3.7/site-packages/mmcv/utils/config.py", line 36, in __missing__
    raise KeyError(name)
KeyError: 'type'

I printed the keys in cfg, and cfg does have the type key. What could be wrong?

opened by chyohoo 14
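
One reading of the traceback above, offered as an assumption rather than a confirmed diagnosis: the failing call is the recursive build_dataset(cfg['dataset'], ...), so the top-level dict may well have a type key while the wrapped dataset dict does not. The helper below is hypothetical and works on plain dicts; it simply walks the nested dataset entries and reports the ones without a type.

```python
def find_missing_type(cfg, path="cfg.data.train"):
    """Return the paths of nested dataset dicts that have no 'type' key."""
    missing = []
    if "type" not in cfg:
        missing.append(path)
    # Wrapper datasets keep the wrapped dataset under the 'dataset' key.
    inner = cfg.get("dataset")
    if isinstance(inner, dict):
        missing += find_missing_type(inner, path + ".dataset")
    return missing


# Hypothetical config: the wrapper has a type, the wrapped dataset does not.
train_cfg = {
    "type": "RepeatDataset",
    "times": 2,
    "dataset": {"data_root": "data/waymo/", "ann_file": "train.pkl"},
}
problems = find_missing_type(train_cfg)
```

Printing only the top-level keys would not reveal this case, which may explain why the key appears to be present.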

question about fast run

Hi, the Waymo dataset is too large. Is it possible to run the code without downloading it, e.g., with a lightweight demo? (I only want to understand the implementation details of the code so that I can use it better, e.g., the input and output shapes of each function.) Best,

opened by pansanity666 11

Installation

Hey! First of all, thanks for this amazing work and for open-sourcing the code. I have a question about installation. I have the latest MMDetection3D running with PyTorch 1.12.1, but when I try to install this repo with the same library versions, it does not set up properly. Also, the link to GETTING STARTED of MMDetection3D has expired. Besides that, I have a custom dataloader for my dataset. If I want to use it for more than one class, for example with the nuScenes dataset, is it possible to have a config for that?

Thanks a lot!

opened by Dhagash4 9

Inf in attached conv layers

Hi, recently I used SST to train on the nuScenes dataset. Everything worked fine in the beginning, but after several epochs I got NaN in bbox_loss and dir_loss. I found that the NaN is caused by inf outputs from the attached conv layer at the end of sstv1: the output of recover_bev is fp32, while the intermediate feature maps from conv2d in attached_conv are fp16, and as training goes on their values become inf. I printed the conv weights, and they look normal. I tried clamping the inf values in the feature map, but inf values then occur in the following layers. What could be wrong?

opened by chyohoo 9
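
For context on the fp16 symptom described above: half precision cannot represent finite values larger than 65504, so an activation that is harmless in fp32 can become inf once it is cast to fp16. The sketch below uses only the standard library (the struct format code "e" is IEEE 754 half precision) to check which values would overflow. The helper name and the sample activations are made up for illustration.

```python
import math
import struct


def fits_in_fp16(value):
    """Return True if a finite float can be stored as fp16 without overflowing."""
    if not math.isfinite(value):
        return False
    try:
        struct.pack("<e", value)  # raises OverflowError above the fp16 range
        return True
    except OverflowError:
        return False


# Hypothetical activation magnitudes from a conv layer.
activations = [12.5, 65504.0, 70000.0, 1.0e6]
overflowing = [a for a in activations if not fits_in_fp16(a)]
```

If growing activations are indeed the cause, clamping one feature map only moves the overflow to the next layer, which is consistent with what the issue reports.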
About other datasets

Hi, have you experimented on other outdoor datasets such as nuScenes? When I used SST to train on the nuScenes dataset, the results I got were not ideal. I only modified the voxel-size hyperparameters and replaced the head. I would like to ask whether there is a problem with this. Thanks!

opened by Zoeeeing 8

question about specifications of evaluation

I had an exciting time going through your code over the last few weeks, and I finally got the results below.

However, I have a question about the Waymo evaluation code.

How can I check the configuration of the metrics, for example whether it uses BEV or 3D, and whether it uses @11 or @40 for mAP?

In KITTI there is an option for choosing 'bev' or '3D', but I can't find one for Waymo, apart from the "compute_detection_metrics_main" binary file.

Summary: How can I check the configuration of the metrics in the Waymo evaluation?

opened by seonhoon1002 7

Training on kitti dataset

Hi, I tried to train SST on the KITTI dataset. Following some previous issues in this repo, I modified the config file, but it did not work. I noticed that the error happens when the dynamic voxelize layer receives a point cloud of shape [0, 4]. I printed the file name of the "empty" point cloud, but the file was not actually empty. I am confused by this error. Could you give me some suggestions for fixing it? Thank you in advance!

2022-03-17 21:00:37,919 - mmdet - INFO - Checkpoints will be saved to /data1/cqy/code/SST/work_dirs/sst_kittiD1_1x_3class_8heads/3161933 by HardDiskBackend.
2022-03-17 21:00:38.164014: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
drop_info is set to {0: {'max_tokens': 30, 'drop_range': (0, 30)}, 1: {'max_tokens': 60, 'drop_range': (30, 60)}, 2: {'max_tokens': 100, 'drop_range': (60, 100000)}}, in input_layer
/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
2022-03-17 21:00:54,870 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
No voxel belongs to drop_level:1 in shift 0
No voxel belongs to drop_level:2 in shift 1
No voxel belongs to drop_level:1 in shift 1
No voxel belongs to drop_level:2 in shift 0
2022-03-17 21:01:12,848 - mmdet - INFO - Epoch [1][50/7424]     lr: 1.000e-03, eta: 1 day, 17:31:34, time: 0.671, data_time: 0.293, memory: 3440, loss_cls: 25.1039, loss_bbox: 0.8919, loss_dir: 0.2588, loss: 26.2546, grad_norm: 138.2142
2022-03-17 21:01:31,472 - mmdet - INFO - Epoch [1][100/7424]    lr: 1.000e-03, eta: 1 day, 8:16:30, time: 0.372, data_time: 0.003, memory: 3440, loss_cls: 0.6193, loss_bbox: 0.5477, loss_dir: 0.1365, loss: 1.3034, grad_norm: 8.0083
2022-03-17 21:01:49,711 - mmdet - INFO - Epoch [1][150/7424]    lr: 1.000e-03, eta: 1 day, 5:01:46, time: 0.365, data_time: 0.003, memory: 3440, loss_cls: 0.5878, loss_bbox: 0.5475, loss_dir: 0.1352, loss: 1.2706, grad_norm: 7.4713
2022-03-17 21:02:08,372 - mmdet - INFO - Epoch [1][200/7424]    lr: 1.000e-03, eta: 1 day, 3:32:03, time: 0.373, data_time: 0.003, memory: 3440, loss_cls: 0.5782, loss_bbox: 0.5325, loss_dir: 0.1384, loss: 1.2492, grad_norm: 6.6802
data/kitti/training/velodyne_reduced/006279.bin
Traceback (most recent call last):
  File "tools/train.py", line 230, in <module>
    main()
  File "tools/train.py", line 220, in main
    train_model(
  File "/data1/cqy/code/SST/mmdet3d/apis/train.py", line 27, in train_model
    train_detector(
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/data1/cqy/code/SST/mmdet3d/models/detectors/base.py", line 58, in forward
    return self.forward_train(**kwargs)
  File "/data1/cqy/code/SST/mmdet3d/models/detectors/voxelnet.py", line 90, in forward_train
    x = self.extract_feat(points, img_metas)
  File "/data1/cqy/code/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 39, in extract_feat
    voxels, coors = self.voxelize(points)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/data1/cqy/code/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 62, in voxelize
    res_coors = self.voxel_layer(res)
  File "/opt/anaconda3/envs/SST/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/cqy/code/SST/mmdet3d/ops/voxel/voxelize.py", line 113, in forward
    return voxelization(input, self.voxel_size, self.point_cloud_range,
  File "/data1/cqy/code/SST/mmdet3d/ops/voxel/voxelize.py", line 44, in forward
    dynamic_voxelize(points, coors, voxel_size, coors_range, 3)
RuntimeError: CUDA error: invalid configuration argument

opened by QYChan 7
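
One possible explanation for a [0, 4] input from a non-empty file, stated here as an assumption and not confirmed in the thread, is that every point is discarded by range filtering (or pushed out of range by augmentation) before voxelization. The sketch below uses a hypothetical helper and made-up values to show how such a frame could be detected and skipped before it reaches the voxel layer.

```python
def filter_by_range(points, pc_range):
    """Keep points whose (x, y, z) lie inside
    [x_min, y_min, z_min, x_max, y_max, z_max]."""
    x0, y0, z0, x1, y1, z1 = pc_range
    return [p for p in points
            if x0 <= p[0] < x1 and y0 <= p[1] < y1 and z0 <= p[2] < z1]


# Hypothetical frame of (x, y, z, intensity) points, all beyond x_max.
frame = [(75.0, 1.0, -1.2, 0.3), (80.5, -2.0, -1.0, 0.1)]
# Hypothetical point cloud range taken from a config.
pc_range = [0.0, -40.0, -3.0, 70.4, 40.0, 1.0]

kept = filter_by_range(frame, pc_range)
is_empty = len(kept) == 0  # such a frame would reach the voxel layer as [0, 4]
```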

Questions about the training hyperparameters

Hi, I have some questions about the training hyperparameters.

What exactly is the training batch size (samples_per_gpu) of the best model in Tables 2 & 3 of your paper? All the training configs provided in the repo set samples_per_gpu=1. Is that the same as for the best model?

The ablation study in the paper uses 20% of the data for training. What about the other training hyperparameters, such as the number of training epochs, the batch size, and whether the 3 classes are trained together or separately?

I'm running some reproduction experiments, so I need the training hyperparameters mentioned above.
opened by 1349949 7

About configuration question

Hello, I have two questions about your code.

The comments in the D5-series config files say: "D5 in the config name means the whole dataset is divided into 5 folds, We only use one fold for efficient experiments." Does this mean that it uses 20% of the Waymo dataset?

The configs also mention 5dim. What does 5dim mean for LiDAR?

Summary

What does D5 mean in the config files?

What does 5dim mean in the config files for LiDAR?
opened by seonhoon1002 6
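
The arithmetic behind the quoted config comment can be sketched as follows: keeping one fold out of five leaves 20% of the frames. Whether the folds are strided (every fifth frame, as below) or contiguous is an assumption, as is the interpretation of the five point dimensions. The thread itself does not state either.

```python
def subsample(frames, interval):
    """Keep every `interval`-th frame, i.e. one fold out of `interval`."""
    return frames[::interval]


frames = list(range(1000))            # stand-in for 1000 frame indices
kept = subsample(frames, interval=5)
fraction = len(kept) / len(frames)    # one fold out of five

# Assumed meaning of a 5-dim Waymo LiDAR point (not stated in the thread).
POINT_DIMS = ("x", "y", "z", "intensity", "elongation")
```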
pretrained weights about backbone?

Did you release the pretrained weights of the backbone?

I superimposed two frames of data as input from a 128-beam LiDAR. The results on 1.3W (about 13,000) samples after 20 epochs are as follows. Can you give me some comments?

opened by gp1234567 6

Training config for paper results

Would you mind sharing the configs used for table 2 & 3? Since they are trained for 24 epochs on the full waymo dataset, which takes a long time to run, I want to get it right for reproducing the results.

Thank you very much for the help!

opened by Stephanie-Shen324 6

WOD version of 3-frame waymo config

Hi @Abyssaledge,

I noticed some differences between the configs of 3-frame and single-frame FSD on Waymo. It seems that different versions of the WOD dataset are used (because in_channels is different). I have the following questions.

I have re-generated waymo_dbinfo_train.pkl using your code. Should I use the newer version of dbinfo to train the 3f model, or should I just change in_channels back to 5? Does it have a big impact on the final result?

What do tanh_dims and voxel_downsampling_size mean in the 3f config? I wonder why they are used in the 3f config but not in the single-frame config.

feat_channels changed from 64 to 32. Could you also explain the reason?

Thanks!
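opened by shawnding 3

Dynamic voxel in_channels size (kitti dataset problem)

Hello, I would like to run on the KITTI dataset. I modified the point cloud range in the config, but now the input channel of DynamicVFE does not match the feature channels of the point cloud.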

Traceback (most recent call last):
  File "tools/train.py", line 230, in <module>
    main()
  File "tools/train.py", line 220, in main
    train_model(
  File "/home/tan/SST/mmdet3d/apis/train.py", line 27, in train_model
    train_detector(
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmdet/apis/train.py", line 244, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/home/tan/SST/mmdet3d/models/detectors/base.py", line 58, in forward
    return self.forward_train(**kwargs)
  File "/home/tan/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 122, in forward_train
    x = self.extract_feat(points, img_metas)
  File "/home/tan/SST/mmdet3d/models/detectors/dynamic_voxelnet.py", line 43, in extract_feat
    voxel_features, feature_coors = self.voxel_encoder(voxels, coors)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/home/tan/SST/mmdet3d/models/voxel_encoders/voxel_encoder.py", line 285, in forward
    point_feats = vfe(features)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/home/tan/SST/mmdet3d/models/voxel_encoders/utils.py", line 141, in forward
    x = self.linear(inputs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/tan/anaconda3/envs/mmdet3d/lib/python3.8/site-packages/torch/nn/functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (29510x10 and 11x64)

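At line 43 of mmdet3d/models/detectors/dynamic_voxelnet.py, the feature matrix of the voxelized point cloud has size (M*4). At line 141 of mmdet3d/models/voxel_encoders/utils.py, the input dimension of the linear layer is 5+3+3=11 from the config, which does not match the 4+3+3=10 above. I am not sure what needs to be changed. Do the network layer parameters also need to change when the model is used on KITTI?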

opened by SH-Tan 8
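
The channel arithmetic quoted in this issue (5+3+3=11 versus 4+3+3=10) can be written out as a small sketch. The helper is hypothetical, and the reading of the two "+3" terms as cluster-center and voxel-center offsets is an assumption about how the encoder decorates each point. The totals themselves match the shapes in the error message.

```python
def vfe_input_channels(point_dims, with_cluster_center=True, with_voxel_center=True):
    """Per-point feature width fed to the first linear layer of the encoder."""
    channels = point_dims
    if with_cluster_center:
        channels += 3  # offset to the mean of the points in the voxel (assumed)
    if with_voxel_center:
        channels += 3  # offset to the voxel's geometric center (assumed)
    return channels


waymo = vfe_input_channels(5)  # 5 + 3 + 3, the width the config was written for
kitti = vfe_input_channels(4)  # 4 + 3 + 3, the width KITTI points actually produce
```

Under this reading, the linear layer was built for 11 inputs while KITTI points yield 10, so the encoder's in_channels in the config would need to be 4 for KITTI. That conclusion is an inference from the error message, not a confirmed answer from the maintainers.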
installation and run problem

After I run run.sh, it gives the following error: ImportError: /mnt/cache/wangyingjie/SST/mmdet3d/ops/ball_query/ball_query_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIfEEPT_v

When installing FSD, I referred to this: https://github.com/tusen-ai/SST/issues/6. My environment is as follows:

sys.platform: linux
Python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA A100-SXM4-80GB
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
GCC: gcc (GCC) 5.4.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:

GCC 7.3

C++ Version: 201402

Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications

Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)

OpenMP 201511 (a.k.a. OpenMP 4.5)

NNPACK is enabled

CPU capability usage: AVX2

CUDA Runtime 11.1

NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86

CuDNN 8.0.5

Magma 2.5.2

Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0+cu111
OpenCV: 4.6.0
MMCV: 1.3.9
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.14.0+2028b0c
MMSegmentation: 0.14.1
MMDetection3D: 0.15.0

opened by JessieW0806 20

Detailed Performance of FSD

Performance of fsd_waymoD1_1x_submission.py (test set): submission

Performance of fsd_waymoD1_1x.py (validation set):

OBJECT_TYPE_TYPE_VEHICLE_LEVEL_1: [mAP 0.792293] [mAPH 0.787888]                                                                                              
OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2: [mAP 0.704975] [mAPH 0.700932]
OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_1: [mAP 0.825951] [mAPH 0.773279]
OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_2: [mAP 0.739378] [mAPH 0.690696]
OBJECT_TYPE_TYPE_SIGN_LEVEL_1: [mAP 0] [mAPH 0]     
OBJECT_TYPE_TYPE_SIGN_LEVEL_2: [mAP 0] [mAPH 0]                   
OBJECT_TYPE_TYPE_CYCLIST_LEVEL_1: [mAP 0.77096] [mAPH 0.759925]   
OBJECT_TYPE_TYPE_CYCLIST_LEVEL_2: [mAP 0.743786] [mAPH 0.73312]    
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_1: [mAP 0.928933] [mAPH 0.925183] 
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_2: [mAP 0.916398] [mAPH 0.91269]    
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_1: [mAP 0.782089] [mAPH 0.777451]  
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_2: [mAP 0.714469] [mAPH 0.710126]
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_1: [mAP 0.593706] [mAPH 0.586526]
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_2: [mAP 0.461929] [mAPH 0.456148]
RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_1: [mAP 0.865141] [mAPH 0.823203]
RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_2: [mAP 0.82562] [mAPH 0.784474]
RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_1: [mAP 0.814913] [mAPH 0.756908]
RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_2: [mAP 0.738424] [mAPH 0.684682]
RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_1: [mAP 0.74548] [mAPH 0.662026]
RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_2: [mAP 0.588333] [mAPH 0.5195]
RANGE_TYPE_SIGN_[0, 30)_LEVEL_1: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[0, 30)_LEVEL_2: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[30, 50)_LEVEL_1: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[30, 50)_LEVEL_2: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[50, +inf)_LEVEL_1: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[50, +inf)_LEVEL_2: [mAP 0] [mAPH 0]
RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_1: [mAP 0.854262] [mAPH 0.843677]
RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_2: [mAP 0.848135] [mAPH 0.837626]
RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_1: [mAP 0.733669] [mAPH 0.722557]
RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_2: [mAP 0.695306] [mAPH 0.684749]
RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_1: [mAP 0.616862] [mAPH 0.602011]
RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_2: [mAP 0.576674] [mAPH 0.562745]

opened by Abyssaledge 4
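
For readers who want to aggregate reports in this format, here is a minimal parsing sketch. The regular expression and function name are illustrative choices, not part of the Waymo tooling. The three sample lines are copied from the FSD validation results above, and the mean over the three classes is computed only as an example.

```python
import re

LINE_RE = re.compile(r"^(\S.*?): \[mAP ([\d.]+)\] \[mAPH ([\d.]+)\]")


def parse_metrics(text):
    """Parse '<name>: [mAP a] [mAPH b]' lines into {name: (mAP, mAPH)}."""
    metrics = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            metrics[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return metrics


# Three lines copied from the FSD validation results above.
report = """
OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2: [mAP 0.704975] [mAPH 0.700932]
OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_2: [mAP 0.739378] [mAPH 0.690696]
OBJECT_TYPE_TYPE_CYCLIST_LEVEL_2: [mAP 0.743786] [mAPH 0.73312]
"""
metrics = parse_metrics(report)
mean_l2_maph = sum(maph for _, maph in metrics.values()) / len(metrics)
```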

Detailed Performance of CenterHead SST

Validation split. Single frame; lidar-only; No ensemble; Just for reference.

OBJECT_TYPE_TYPE_VEHICLE_LEVEL_1: [mAP 0.751252] [mAPH 0.746433]
OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2: [mAP 0.666054] [mAPH 0.661673]
OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_1: [mAP 0.800718] [mAPH 0.72117]
OBJECT_TYPE_TYPE_PEDESTRIAN_LEVEL_2: [mAP 0.723756] [mAPH 0.650058]
OBJECT_TYPE_TYPE_SIGN_LEVEL_1: [mAP 0] [mAPH 0]
OBJECT_TYPE_TYPE_SIGN_LEVEL_2: [mAP 0] [mAPH 0]
OBJECT_TYPE_TYPE_CYCLIST_LEVEL_1: [mAP 0.714879] [mAPH 0.702029]
OBJECT_TYPE_TYPE_CYCLIST_LEVEL_2: [mAP 0.688492] [mAPH 0.67611]
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_1: [mAP 0.919894] [mAPH 0.915598]
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_2: [mAP 0.9077] [mAPH 0.903451]
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_1: [mAP 0.734855] [mAPH 0.729633]
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_2: [mAP 0.669229] [mAPH 0.664388]
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_1: [mAP 0.506878] [mAPH 0.500256]
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_2: [mAP 0.389791] [mAPH 0.384518]
RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_1: [mAP 0.8519] [mAPH 0.779192]
RANGE_TYPE_PEDESTRIAN_[0, 30)_LEVEL_2: [mAP 0.818607] [mAPH 0.747716]
RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_1: [mAP 0.790649] [mAPH 0.709307]
RANGE_TYPE_PEDESTRIAN_[30, 50)_LEVEL_2: [mAP 0.722803] [mAPH 0.64698]
RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_1: [mAP 0.692475] [mAPH 0.587309]
RANGE_TYPE_PEDESTRIAN_[50, +inf)_LEVEL_2: [mAP 0.556325] [mAPH 0.469353]
RANGE_TYPE_SIGN_[0, 30)_LEVEL_1: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[0, 30)_LEVEL_2: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[30, 50)_LEVEL_1: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[30, 50)_LEVEL_2: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[50, +inf)_LEVEL_1: [mAP 0] [mAPH 0]
RANGE_TYPE_SIGN_[50, +inf)_LEVEL_2: [mAP 0] [mAPH 0]
RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_1: [mAP 0.799435] [mAPH 0.787482]
RANGE_TYPE_CYCLIST_[0, 30)_LEVEL_2: [mAP 0.793702] [mAPH 0.781835]
RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_1: [mAP 0.677939] [mAPH 0.66574]
RANGE_TYPE_CYCLIST_[30, 50)_LEVEL_2: [mAP 0.640499] [mAPH 0.628967]
RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_1: [mAP 0.558871] [mAPH 0.541335]
RANGE_TYPE_CYCLIST_[50, +inf)_LEVEL_2: [mAP 0.520811] [mAPH 0.504459]

opened by Abyssaledge 0
