PointPillars inference with TensorRT

Overview

This repository contains the sources and model for PointPillars inference using TensorRT. The model is created with OpenPCDet and modified with onnx_graphsurgeon.

Inference has four parts:

  • generateVoxels: convert the point cloud into voxels with 4 channels
  • generateFeatures: convert the voxels into feature maps with 10 channels
  • Inference: convert the feature maps to raw data of bounding boxes, class scores and directions
  • Postprocessing: parse bounding boxes, class scores and directions
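A rough sketch of how these four stages chain together on a CUDA stream (the function names below are illustrative stand-ins, not the demo's exact API):

    // Illustrative outline only; the real signatures live in the demo sources.
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // 1. point cloud -> 4-channel voxels plus per-voxel coordinates
    generateVoxels(points, num_points, voxels, coords, &num_pillars, stream);
    // 2. voxels -> 10-channel pillar feature maps
    generateFeatures(voxels, coords, num_pillars, features, stream);
    // 3. TensorRT engine: feature maps -> raw box / class-score / direction maps
    runEngine(features, coords, num_pillars, cls_preds, box_preds, dir_preds, stream);
    // 4. decode the raw outputs and run NMS
    postprocess(cls_preds, box_preds, dir_preds, nms_boxes, &num_objects, stream);

    cudaStreamSynchronize(stream);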

Data

The demo uses data from the KITTI Dataset; more data can be downloaded by following the GETTING_STARTED link.

Model

The ONNX file can be converted from a model trained with OpenPCDet using the tool in the demo.
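A typical invocation of the tool (this exact command appears in the issue logs further down this page):

    $ python exporter.py --ckpt ../model/pointpillar_7729.pth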

Build

Prerequisites

To build the PointPillars inference demo, TensorRT with the PillarScatter layer and CUDA are needed. The PillarScatter layer is already implemented as a TensorRT plugin in the demo.

  • Jetpack 4.5
  • TensorRT v7.1.3
  • CUDA 10.2 + cuDNN 8.0.0
  • PCL is optional, to store .pcd point-cloud files

Compile


$ cd test
$ mkdir build
$ cd build
$ cmake ..
$ make -j$(nproc)

Run

$ ./demo

Environment

  • Jetpack 4.5
  • CUDA 10.2 + cuDNN 8.0.0 + TensorRT 7.1.3
  • NVIDIA Jetson AGX Xavier

Performance

  • FP16
| Stage             | GPU/ms |
| ----------------- | ------ |
| generateVoxels    | 0.22   |
| generateFeatures  | 0.21   |
| Inference         | 30.75  |
| Postprocessing    | 3.19   |

Note

  1. The GPU processes all points in parallel and selects the points that fill a voxel from the point cloud in a nondeterministic order, so the output of generateVoxels varies from run to run. The CPU implementation always selects the first 32 points, so the CPU's generateVoxels output is deterministic. (A simplified sketch of this race appears after these notes.)

  2. The demo caches the engine built from the ONNX file to improve startup performance. If a new ONNX file is to be used, remove the cache file in "./model".

  3. MAX_VOXELS in params.h sizes the buffers allocated during inference. Decrease the value to save memory.
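The nondeterminism in note 1 comes from many GPU threads racing to claim the slots of a voxel. A simplified sketch of that pattern (the kernel and voxelIndexOf below are hypothetical stand-ins; the real code is generateVoxels_random_kernel in preprocess_kernels.cu):

    // Simplified: which 32 points land in a voxel, and in what order, depends on
    // the order in which threads win the atomicAdd, which varies from run to run.
    __global__ void generateVoxelsSketch(const float4 *points, int num_points,
                                         unsigned int *points_in_voxel, float4 *voxels)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= num_points) return;

        int v = voxelIndexOf(points[i]);                        // hypothetical grid lookup
        unsigned int slot = atomicAdd(&points_in_voxel[v], 1);  // racy slot assignment
        if (slot < 32)                                          // keep at most 32 points
            voxels[v * 32 + slot] = points[i];
    }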

Comments
  • when I generate onnx by exporter.py, i got the error

    ['Car', 'Pedestrian', 'Cyclist']
    3
    0 -39.68 -3 69.12 39.68 1
    [0.16, 0.16, 4]
    32
    40000
    4
    64
    0.78539
    0.0
    2
    [3.9, 1.6, 1.56, 0.0, 3.9, 1.6, 1.56, 1.57, 0.8, 0.6, 1.73, 0.0, 0.8, 0.6, 1.73, 1.57, 1.76, 0.6, 1.73, 0.0, 1.76, 0.6, 1.73, 1.57]
    [-1.78, -0.6, -0.6]
    0.1
    0.01
    anchors:      const float anchors[num_anchors * len_per_anchor] = {
          3.9,1.6,1.56,0.0,
          3.9,1.6,1.56,1.57,
          0.8,0.6,1.73,0.0,
          0.8,0.6,1.73,1.57,
          1.76,0.6,1.73,0.0,
          1.76,0.6,1.73,1.57,
          };
    
    anchors:      const float anchor_bottom_heights[num_classes] = {-1.78,-0.6,-0.6,};
    
    ########
    2022-03-15 11:08:59,269   INFO  ------ Convert OpenPCDet model for TensorRT ------
    2022-03-15 11:09:05,030   INFO  ==> Loading parameters from checkpoint ../../checkpoint_epoch_1.pth to CPU
    2022-03-15 11:09:05,171   INFO  ==> Checkpoint trained from version: pcdet+0.3.0+0642cf0
    2022-03-15 11:09:05,462   INFO  ==> Done (loaded 127/127)
    /home/nvidia/project/pointpillar/CUDA-PointPillars-main/tool/pcdet/models/backbones_3d/vfe/pillar_vfe.py:45: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if inputs.shape[0] > self.part:
    /home/nvidia/project/pointpillar/CUDA-PointPillars-main/tool/pcdet/models/backbones_2d/map_to_bev/pointpillar_scatter.py:31: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      batch_size = coords[:, 0].max().int().item() + 1
    Traceback (most recent call last):
      File "exporter.py", line 150, in <module>
        main()
      File "exporter.py", line 135, in main
        output_names = ['cls_preds', 'box_preds', 'dir_cls_preds'], # the model's output names
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 208, in export
        custom_opsets, enable_onnx_checker, use_external_data_format)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 92, in export
        use_external_data_format=use_external_data_format)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 530, in _export
        fixed_batch_size=fixed_batch_size)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 366, in _model_to_graph
        graph, torch_out = _trace_and_get_graph_from_model(model, args)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 319, in _trace_and_get_graph_from_model
        torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 338, in _get_trace_graph
        outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 426, in forward
        self._force_outplace,
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/jit/__init__.py", line 412, in wrapper
        outs.append(self.inner(*trace_inputs))
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 720, in _call_impl
        result = self._slow_forward(*input, **kwargs)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 704, in _slow_forward
        result = self.forward(*input, **kwargs)
      File "/home/nvidia/project/pointpillar/CUDA-PointPillars-main/tool/pcdet/models/detectors/pointpillar.py", line 31, in forward
        spatial_features_2d = self.module_list[2](spatial_features) #"BaseBEVBackbone"
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 720, in _call_impl
        result = self._slow_forward(*input, **kwargs)
      File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 704, in _slow_forward
        result = self.forward(*input, **kwargs)
      File "/home/nvidia/project/pointpillar/CUDA-PointPillars-main/tool/pcdet/models/backbones_2d/base_bev_backbone.py", line 103, in forward
        stride = int(spatial_features.shape[2] / x.shape[2])
    RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
    
    
    
    When I generate the ONNX with exporter.py, I get the error above. How can I fix it?
    
    opened by liveforday 4
  • Could not export onnx.

    I tried exporter.py but could not export because it didn't get the right input.

    2022-10-28 04:02:25,943   INFO  ==> Loading parameters from checkpoint ./pointpillar_7728.pth to CPU

    Traceback (most recent call last):
      File "exporter.py", line 150, in <module>
        main()
      File "exporter.py", line 134, in main
        output_names = ['cls_preds', 'box_preds', 'dir_cls_preds'], # the model's output names
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/onnx/__init__.py", line 365, in export
        export_modules_as_functions,
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/onnx/utils.py", line 178, in export
        export_modules_as_functions=export_modules_as_functions,
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/onnx/utils.py", line 1097, in _export
        dynamic_axes=dynamic_axes,
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/onnx/utils.py", line 737, in _model_to_graph
        graph, params, torch_out, module = _create_jit_graph(model, args)
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/onnx/utils.py", line 611, in _create_jit_graph
        graph, torch_out = _trace_and_get_graph_from_model(model, args)
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/onnx/utils.py", line 527, in _trace_and_get_graph_from_model
        model, args, strict=False, _force_outplace=False, _return_inputs_states=True
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/jit/_trace.py", line 1179, in _get_trace_graph
        outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/jit/_trace.py", line 134, in forward
        self._force_outplace,
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/jit/_trace.py", line 120, in wrapper
        outs.append(self.inner(*trace_inputs))
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/rjh1016/anaconda3/envs/openpcdet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
        result = self.forward(*input, **kwargs)
    **TypeError: forward() missing 1 required positional argument: 'batch_dict'**
    

    I guess the model architecture is still the same as OpenPCDet's, but I don't know what to do first to export ONNX from this file. My torch version is 1.12.1+cu102, my CUDA version is 11.3 and my onnx version is 1.12.0.

    opened by Domwis-IR 3
  • parse onnx model wrong

    trt_infer: 1: [stdArchiveReader.cpp::nvinfer1::rt::StdArchiveReader::StdArchiveReader::35] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed. Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 97)
    trt_infer: 4: [runtime.cpp::nvinfer1::Runtime::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)

    opened by zhouliuer 2
  • Got error when run demo

    Hi, I use a Jetson Nano 2 GB with JetPack 4.6, CUDA 10.2, TensorRT 8.0 and ONNX 1.8.

    After compiling, I run the demo and get an error:

    GPU has cuda devices: 1
    ----device id: 0 info----
      GPU : NVIDIA Tegra X1
      Capbility: 5.3
      Global memory: 1979MB
      Const memory: 64KB
      SM in a block: 48KB
      warp size: 32
      threads in a block: 1024
      block dim: (1024,1024,64)
      grid dim: (2147483647,65535,65535)

    Building TRT engine.
    [libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:9: Message type "onnx2trt_onnx.ModelProto" has no field named "version".
    trt_infer: ModelImporter.cpp:682: Failed to parse ONNX model from file: ../../model/pointpillar.onnx
    : failed to parse onnx model file, please check the onnx version and trt support op!

    So can you tell me how to solve this error or which version of onnx do you use?

    opened by wuwenxuan0226 2
  • Can't find a PyTorch version matched with CUDA 11.4

    I'm just trying to convert my own .pth to ONNX, but exporter.py reports a PyTorch issue. I tried to set up the environment as the README in tools describes, but I can't find PyTorch 1.11.0 built for CUDA 11.4 (pytorch.org only has cu113, cu115 and cu116).

    Thank you very much.

    opened by Aracher 1
  • export onnx error

    When I follow the export guide to export ONNX from pointpillar_7729.pth, I get this error:

    root@pc-MS-7B89:/workspace/ssh-docker/workspace/CUDA-PointPillars/tool# python exporter.py --ckpt ../model/pointpillar_7729.pth
    2022-06-08 10:17:39,104   INFO  ------ Convert OpenPCDet model for TensorRT ------
    2022-06-08 10:17:40,746   INFO  ==> Loading parameters from checkpoint ../model/pointpillar_7729.pth to CPU
    2022-06-08 10:17:40,760   INFO  ==> Done (loaded 127/127)
    Traceback (most recent call last):
      File "exporter.py", line 150, in <module>
        main()
      File "exporter.py", line 126, in main
        torch.onnx.export(model,  # model being run
      File "/usr/local/lib/python3.8/dist-packages/torch/onnx/__init__.py", line 225, in export
        return utils.export(model, args, f, export_params, verbose, training,
      File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 85, in export
        _export(model, args, f, export_params, verbose, training, input_names, output_names,
      File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 632, in _export
        _model_to_graph(model, args, verbose, input_names,
      File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 409, in _model_to_graph
        graph, params, torch_out = _create_jit_graph(model, args,
      File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 379, in _create_jit_graph
        graph, torch_out = _trace_and_get_graph_from_model(model, args)
      File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 342, in _trace_and_get_graph_from_model
        torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
      File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 1148, in _get_trace_graph
        outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 93, in forward
        in_vars, in_desc = _flatten(args)
    RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type int

    opened by PeterJaq 1
  • Two problems were found

    1. Running the demo twice gives different nms_pred. The first run's result:

    0 0 Pedestrian 0 0 0 0 0 0 0 8.801592 -22.979778 -0.887550 0.699091 1.681269 0.955211 6.057534 0.886240
    0 1 Car 0 0 0 0 0 0 0 12.719068 -28.110558 -0.953992 1.454717 1.440981 3.581757 1.694893 0.868306
    0 2 Car 0 0 0 0 0 0 0 47.362293 -28.365889 -0.973110 1.458381 1.436911 3.486596 1.667450 0.857365
    0 3 Cyclist 0 0 0 0 0 0 0 6.083002 -20.606743 -0.793952 0.521009 1.870310 1.742529 6.440432 0.723123
    0 4 Pedestrian 0 0 0 0 0 0 0 38.801476 -24.652239 -0.866866 0.633589 1.681634 0.863217 7.058193 0.640031
    1 0 Car 0 0 0 0 0 0 0 12.139722 -28.273060 -0.901822 1.527452 1.430591 3.621138 1.697318 0.906442
    1 1 Car 0 0 0 0 0 0 0 46.858337 -28.242884 -0.900239 1.531207 1.431682 3.568392 1.667983 0.901812
    1 2 Pedestrian 0 0 0 0 0 0 0 8.465889 -23.032154 -0.846444 0.645208 1.671196 0.881344 6.123196 0.870731
    1 3 Pedestrian 0 0 0 0 0 0 0 38.177109 -24.687294 -0.824349 0.597974 1.776301 0.773330 6.553991 0.749789
    1 4 Cyclist 0 0 0 0 0 0 0 6.105674 -20.675518 -0.786540 0.537707 1.857258 1.734642 6.339981 0.717096
    2 0 Car 0 0 0 0 0 0 0 11.585449 -28.402311 -0.755955 1.502408 1.468927 3.476464 1.708760 0.914412
    2 1 Car 0 0 0 0 0 0 0 46.331028 -28.403627 -0.756222 1.510054 1.459696 3.494760 1.701342 0.892783
    2 2 Pedestrian 0 0 0 0 0 0 0 2.864436 -24.456514 -0.795387 0.683705 1.716824 0.740641 6.564042 0.670138

    and the second run's result:

    0 0 Pedestrian 0 0 0 0 0 0 0 8.803576 -22.977654 -0.887780 0.696375 1.681500 0.958435 6.070025 0.884506
    0 1 Car 0 0 0 0 0 0 0 12.719075 -28.110571 -0.954006 1.454742 1.441006 3.581787 1.694892 0.868305
    0 2 Car 0 0 0 0 0 0 0 47.362247 -28.365870 -0.973042 1.458364 1.436905 3.486571 1.667452 0.857371
    0 3 Cyclist 0 0 0 0 0 0 0 6.087093 -20.607618 -0.795005 0.520003 1.870543 1.743158 6.437171 0.718292
    0 4 Pedestrian 0 0 0 0 0 0 0 38.801697 -24.651608 -0.866716 0.631521 1.683618 0.862992 3.981019 0.638015
    1 0 Car 0 0 0 0 0 0 0 12.137152 -28.273653 -0.905428 1.524646 1.433306 3.596805 1.698718 0.907847
    1 1 Car 0 0 0 0 0 0 0 46.859871 -28.246393 -0.904772 1.531922 1.435942 3.555500 1.668782 0.903067
    2 0 Car 0 0 0 0 0 0 0 11.584668 -28.394022 -0.754803 1.500816 1.474386 3.450029 1.709120 0.922249
    2 1 Car 0 0 0 0 0 0 0 46.331684 -28.388617 -0.756081 1.507665 1.464798 3.450339 1.705996 0.903383
    2 2 Pedestrian 0 0 0 0 0 0 0 2.865523 -24.451092 -0.794181 0.643265 1.719354 0.729399 7.015512 0.763062
    2 3 Pedestrian 0 0 0 0 0 0 0 8.110272 -23.076586 -0.753614 0.546372 1.720677 0.841813 2.003850 0.684011
    2 4 Pedestrian 0 0 0 0 0 0 0 37.609577 -24.749792 -0.808255 0.581791 1.683830 0.778918 4.070107 0.650024

    Is this normal??

    2. After setting score_thresh=0.5, there are far too many boxes going into NMS (num_obj:241072, numbers of Bndbox need to be nms:241072). Why?

    3. With the same model, the OpenPCDet output looks normal, but after ONNX deployment the detection results look wrong.

    thanks and waiting for your reply.

    opened by OPPOA113 1
  • [8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

    I use TensorRT 8.4, and when I run ./demo, building the TRT engine gives these errors:

    Building TRT engine.
    trt_infer: Could not register plugin creator -  ::PillarScatterPlugin version 1
    trt_infer: parsers/onnx/ModelImporter.cpp:780: While parsing node number 4 [ScatterBEV -> "479"]:
    trt_infer: parsers/onnx/ModelImporter.cpp:781: --- Begin node ---
    trt_infer: parsers/onnx/ModelImporter.cpp:782: input: "403" input: "coords" input: "params" output: "479" name: "onnx_graphsurgeon_node_0" op_type: "ScatterBEV"
    trt_infer: ModelImporter.cpp:751: --- End node ---
    trt_infer: ModelImporter.cpp:754: ERROR: builtin_op_importers.cpp:4951 In function importFallbackPluginImporter:
    [8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
    : failed to parse onnx model file, please check the onnx version and trt support op!

    How can I fix the problem?
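    For context: the ONNX parser can only resolve a custom op such as ScatterBEV if its plugin creator is registered with TensorRT's plugin registry before the model is parsed. A minimal sketch of the usual pattern (assuming a creator class named PillarScatterPluginCreator, as in this demo's plugin):

    #include <NvInfer.h>
    #include <NvInferPlugin.h>

    // Static registration: runs when the binary (or plugin library) is loaded.
    REGISTER_TENSORRT_PLUGIN(PillarScatterPluginCreator);

    // Alternatively, register explicitly before invoking the ONNX parser:
    //   initLibNvInferPlugins(&gLogger, "");                 // TensorRT's builtin plugins
    //   getPluginRegistry()->registerCreator(creator, "");   // the custom creator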

    opened by iloveai8086 1
  • calculate offset in the file which named preprocess_kernels.cu

    At line 323 of preprocess_kernels.cu, the code calculates the offset:

    //calculate offset
    float x_offset = voxel_x / 2 + cordsSM[pillar_idx_inBlock].w * voxel_x + range_min_x;
    float y_offset = voxel_y / 2 + cordsSM[pillar_idx_inBlock].z * voxel_y + range_min_y;
    float z_offset = voxel_z / 2 + cordsSM[pillar_idx_inBlock].y * voxel_z + range_min_z;
    

    I think the w means intensity. When calculating x_offset, why is voxel_x multiplied by cordsSM[pillar_idx_inBlock].w rather than cordsSM[pillar_idx_inBlock].x? When calculating y_offset, why is voxel_y multiplied by cordsSM[pillar_idx_inBlock].z rather than cordsSM[pillar_idx_inBlock].y? And when calculating z_offset, why is voxel_z multiplied by cordsSM[pillar_idx_inBlock].y rather than cordsSM[pillar_idx_inBlock].z?
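    A likely explanation (an assumption based on OpenPCDet's convention, not something this repository states): cordsSM holds the voxel's grid coordinates rather than a point, and the coordinates are stored in (batch, z, y, x) order, so the int4 fields map as:

    // Assumed int4 layout of each cordsSM entry, following OpenPCDet's (batch, z, y, x) order:
    //   .x -> batch index
    //   .y -> z grid index   (used for z_offset)
    //   .z -> y grid index   (used for y_offset)
    //   .w -> x grid index   (used for x_offset)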

    opened by liveforday 0
  • Found two crashes caused by two defects in preprocess_kernels.cu

    Hello, we sometimes saw crashes in our application with CUDA-PointPillars integrated; after investigating, we found the causes in preprocess_kernels.cu.

    1. in generateVoxels_random_kernel():

    int voxel_idx = floorf((point.x - min_x_range)/pillar_x_size);
    int voxel_idy = floorf((point.y - min_y_range)/pillar_y_size);

    Our grid_y_size is 320. When (point.y - min_y_range)/pillar_y_size should be 319.999998, the single-precision quotient rounds to 320.0, so voxel_idy = floorf(...) is 320, not 319! Then voxel_idy == grid_y_size, which is an invalid out-of-bounds access to mask and voxels; the memory gets corrupted, so a crash happens accordingly.

    2. in generateBaseFeatures_kernel():

    current_pillarId = atomicAdd(pillar_num, 1);

    Hereafter no check is done on current_pillarId. If points are scattered over a very large range and current_pillarId becomes greater than MAX_VOXELS (40000), memory will be corrupted by "((float4*)voxel_features)[outIndex] = ((float4*)voxels)[inIndex];" because of an invalid out-of-bounds access to voxel_features.
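    A minimal sketch of the two guards described above (variable names follow the snippets in this issue; the commit linked below is the authoritative fix):

    // Guard 1: in generateVoxels_random_kernel(), reject indices on or past the grid edge.
    int voxel_idx = floorf((point.x - min_x_range) / pillar_x_size);
    int voxel_idy = floorf((point.y - min_y_range) / pillar_y_size);
    if (voxel_idx < 0 || voxel_idx >= grid_x_size ||
        voxel_idy < 0 || voxel_idy >= grid_y_size)
        return;

    // Guard 2: in generateBaseFeatures_kernel(), drop pillars beyond MAX_VOXELS capacity.
    int current_pillarId = atomicAdd(pillar_num, 1);
    if (current_pillarId >= MAX_VOXELS)
        return;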

    We made a sample fix here: https://github.com/arnoldfychen/CUDA-PointPillars/commit/2c9c4a15b432e7abee2de13b0b91a4a3a52fb1ce; after that, no crash was seen.

    opened by arnoldfychen 0
  • How can I know the time of scatter processing?

    Hello,

    Thanks for your excellent work.

    I want to know the time taken by each processing module. So, how can I measure the time of the scatter procedure?

    Thank you for reading.
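    A generic way to time an individual stage such as the scatter kernel is CUDA events; a sketch, not code from this repo:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Time a single stage with CUDA events.
    cudaStream_t stream = 0;   // or the stream the demo uses
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, stream);
    // ... launch the scatter kernel on `stream` here ...
    cudaEventRecord(stop, stream);

    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("scatter: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);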

    opened by rkdckddnjs9 0
  • Change ROS's .bag file to KITTI format, but the performance is terrible

    When I play back the .bag file and convert it into KITTI format as the input, the detection results are very poor. Does anyone have the same problem?

    opened by beyoursel 5
  • onnxruntime evaluation error

    Hello, author! I wanted to use onnxruntime to compare the ONNX model against the PyTorch model, but the call produced the following error.

    sess = onnxruntime.InferenceSession("pointpillar.onnx", None)
      File "E:\Anaconda3\envs\innovativePractice\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in __init__
        self._create_inference_session(providers, provider_options, disabled_optimizers)
      File "E:\Anaconda3\envs\innovativePractice\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 384, in _create_inference_session
        sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
    onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from pointpillar.onnx failed: This is an invalid model. In Node, ("ReduceMax_166", ReduceMax, "", -1) : ("onnx::ReduceMax_332": tensor(float),) -> ("onnx::Squeeze_333": tensor(float),) , Error Mismatched attribute type in 'ReduceMax_166 : keepdims'

    What is the reason for this? I am a beginner; if it is convenient, could you give me some suggestions for evaluating pointpillar.onnx with onnxruntime?

    opened by DZ2584793260 0
  • how to convert .pt into onnx for pointpillars of mmdetection3d?

    @NVIDIA-AI-IOT thank you for your work, this is very helpful for me. But there are some questions about the mmdet3d conversion: I can't directly convert an mmdet3d model, and I have no specific idea how to revise it, because PointPillars is defined and used differently in mmdet3d and pcdet.

    --------------mmdet3d-pointpillars--------------------
    0 backbone.blocks.0.0.weight 1 backbone.blocks.0.1.weight 2 backbone.blocks.0.1.bias 3 backbone.blocks.0.3.weight 4 backbone.blocks.0.4.weight 5 backbone.blocks.0.4.bias 6 backbone.blocks.0.6.weight 7 backbone.blocks.0.7.weight 8 backbone.blocks.0.7.bias 9 backbone.blocks.0.9.weight 10 backbone.blocks.0.10.weight 11 backbone.blocks.0.10.bias 12 backbone.blocks.1.0.weight 13 backbone.blocks.1.1.weight 14 backbone.blocks.1.1.bias 15 backbone.blocks.1.3.weight 16 backbone.blocks.1.4.weight 17 backbone.blocks.1.4.bias 18 backbone.blocks.1.6.weight 19 backbone.blocks.1.7.weight 20 backbone.blocks.1.7.bias 21 backbone.blocks.1.9.weight 22 backbone.blocks.1.10.weight 23 backbone.blocks.1.10.bias 24 backbone.blocks.1.12.weight 25 backbone.blocks.1.13.weight 26 backbone.blocks.1.13.bias 27 backbone.blocks.1.15.weight 28 backbone.blocks.1.16.weight 29 backbone.blocks.1.16.bias 30 backbone.blocks.2.0.weight 31 backbone.blocks.2.1.weight 32 backbone.blocks.2.1.bias 33 backbone.blocks.2.3.weight 34 backbone.blocks.2.4.weight 35 backbone.blocks.2.4.bias 36 backbone.blocks.2.6.weight 37 backbone.blocks.2.7.weight 38 backbone.blocks.2.7.bias 39 backbone.blocks.2.9.weight 40 backbone.blocks.2.10.weight 41 backbone.blocks.2.10.bias 42 backbone.blocks.2.12.weight 43 backbone.blocks.2.13.weight 44 backbone.blocks.2.13.bias 45 backbone.blocks.2.15.weight 46 backbone.blocks.2.16.weight 47 backbone.blocks.2.16.bias 48 neck.deblocks.0.0.weight 49 neck.deblocks.0.1.weight 50 neck.deblocks.0.1.bias 51 neck.deblocks.1.0.weight 52 neck.deblocks.1.1.weight 53 neck.deblocks.1.1.bias 54 neck.deblocks.2.0.weight 55 neck.deblocks.2.1.weight 56 neck.deblocks.2.1.bias 57 bbox_head.conv_cls.weight 58 bbox_head.conv_cls.bias 59 bbox_head.conv_reg.weight 60 bbox_head.conv_reg.bias 61 bbox_head.conv_dir_cls.weight 62 bbox_head.conv_dir_cls.bias 63 voxel_encoder.pfn_layers.0.norm.weight 64 voxel_encoder.pfn_layers.0.norm.bias 65 voxel_encoder.pfn_layers.0.linear.weight

    --------------pcdet-pointpillars-----------------------------------
    0 vfe.pfn_layers.0.linear.weight 1 vfe.pfn_layers.0.norm.weight 2 vfe.pfn_layers.0.norm.bias 3 backbone_2d.blocks.0.1.weight 4 backbone_2d.blocks.0.2.weight 5 backbone_2d.blocks.0.2.bias 6 backbone_2d.blocks.0.4.weight 7 backbone_2d.blocks.0.5.weight 8 backbone_2d.blocks.0.5.bias 9 backbone_2d.blocks.0.7.weight 10 backbone_2d.blocks.0.8.weight 11 backbone_2d.blocks.0.8.bias 12 backbone_2d.blocks.0.10.weight 13 backbone_2d.blocks.0.11.weight 14 backbone_2d.blocks.0.11.bias 15 backbone_2d.blocks.1.1.weight 16 backbone_2d.blocks.1.2.weight 17 backbone_2d.blocks.1.2.bias 18 backbone_2d.blocks.1.4.weight 19 backbone_2d.blocks.1.5.weight 20 backbone_2d.blocks.1.5.bias 21 backbone_2d.blocks.1.7.weight 22 backbone_2d.blocks.1.8.weight 23 backbone_2d.blocks.1.8.bias 24 backbone_2d.blocks.1.10.weight 25 backbone_2d.blocks.1.11.weight 26 backbone_2d.blocks.1.11.bias 27 backbone_2d.blocks.1.13.weight 28 backbone_2d.blocks.1.14.weight 29 backbone_2d.blocks.1.14.bias 30 backbone_2d.blocks.1.16.weight 31 backbone_2d.blocks.1.17.weight 32 backbone_2d.blocks.1.17.bias 33 backbone_2d.blocks.2.1.weight 34 backbone_2d.blocks.2.2.weight 35 backbone_2d.blocks.2.2.bias 36 backbone_2d.blocks.2.4.weight 37 backbone_2d.blocks.2.5.weight 38 backbone_2d.blocks.2.5.bias 39 backbone_2d.blocks.2.7.weight 40 backbone_2d.blocks.2.8.weight 41 backbone_2d.blocks.2.8.bias 42 backbone_2d.blocks.2.10.weight 43 backbone_2d.blocks.2.11.weight 44 backbone_2d.blocks.2.11.bias 45 backbone_2d.blocks.2.13.weight 46 backbone_2d.blocks.2.14.weight 47 backbone_2d.blocks.2.14.bias 48 backbone_2d.blocks.2.16.weight 49 backbone_2d.blocks.2.17.weight 50 backbone_2d.blocks.2.17.bias 51 backbone_2d.deblocks.0.0.weight 52 backbone_2d.deblocks.0.1.weight 53 backbone_2d.deblocks.0.1.bias 54 backbone_2d.deblocks.1.0.weight 55 backbone_2d.deblocks.1.1.weight 56 backbone_2d.deblocks.1.1.bias 57 backbone_2d.deblocks.2.0.weight 58 backbone_2d.deblocks.2.1.weight 59 backbone_2d.deblocks.2.1.bias 60 dense_head.conv_cls.weight 61 dense_head.conv_cls.bias 62 dense_head.conv_box.weight 63 dense_head.conv_box.bias 64 dense_head.conv_dir_cls.weight 65 dense_head.conv_dir_cls.bias

    opened by sylivahf 0