Following issue https://github.com/XiaoMi/mace/issues/595, and specifically @lu229's suggestion in https://github.com/XiaoMi/mace/issues/595#issuecomment-599338955, I've implemented a basic data layout transformation so that the Reshape op produces correct results on GPU when the model uses the NCHW data format.
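To illustrate the idea outside of MACE, here is a minimal pure-Python sketch of why a Reshape needs a layout transformation when the model is defined in NCHW but the buffer is stored in NHWC. It is not the code in this PR: the helper names (`nhwc_to_nchw`, `nchw_to_nhwc`, `reshape_with_nchw_semantics`) and the tiny shapes are made up for the example, and it works on flat lists rather than OpenCL images.

```python
def nhwc_to_nchw(flat, n, h, w, c):
    """Reorder a flat NHWC buffer into a flat NCHW buffer."""
    out = [None] * len(flat)
    for ni in range(n):
        for hi in range(h):
            for wi in range(w):
                for ci in range(c):
                    src = ((ni * h + hi) * w + wi) * c + ci
                    dst = ((ni * c + ci) * h + hi) * w + wi
                    out[dst] = flat[src]
    return out


def nchw_to_nhwc(flat, n, c, h, w):
    """Reorder a flat NCHW buffer into a flat NHWC buffer."""
    out = [None] * len(flat)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    dst = ((ni * h + hi) * w + wi) * c + ci
                    out[dst] = flat[src]
    return out


def reshape_with_nchw_semantics(nhwc_flat, in_shape_nchw, out_shape_nchw):
    """Reshape an NHWC-stored tensor as if it were laid out in NCHW."""
    n, c, h, w = in_shape_nchw
    n2, c2, h2, w2 = out_shape_nchw
    assert n * c * h * w == n2 * c2 * h2 * w2, "element count must match"
    # 1. Bring the data into the order the model (NCHW) expects.
    nchw_flat = nhwc_to_nchw(nhwc_flat, n, h, w, c)
    # 2. Reshaping a contiguous buffer does not move any data.
    # 3. Go back to NHWC storage using the *new* shape.
    return nchw_to_nhwc(nchw_flat, n2, c2, h2, w2)


# Hypothetical example: values 0..15 in NCHW order, shape (1, 4, 2, 2),
# reshaped by the model to (1, 2, 4, 2).
in_shape, out_shape = (1, 4, 2, 2), (1, 2, 4, 2)
model_values = list(range(16))
stored = nchw_to_nhwc(model_values, *in_shape)     # what the GPU buffer holds
expected = nchw_to_nhwc(model_values, *out_shape)  # what the next op should see
naive = stored                                     # reinterpret the buffer as-is
fixed = reshape_with_nchw_semantics(stored, in_shape, out_shape)
```

In this sketch `fixed` matches `expected`, while `naive` does not, which is the kind of mismatch that showed up as failed validation in the issue.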
Using the model and yml file from https://github.com/XiaoMi/mace/issues/595#issuecomment-593762744, validation now succeeds on both test devices (MI9 and POCOF1):
root@ds017:/mace# python tools/converter.py run --config /models/model/model_net3.yml --validate
CMD> bazel build //mace/proto:mace_py
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Loading:
Loading: 0 packages loaded
Analyzing: target //mace/proto:mace_py (6 packages loaded)
INFO: Analysed target //mace/proto:mace_py (17 packages loaded).
INFO: Found 1 target...
[0 / 7] [-----] BazelWorkspaceStatusAction stable-status.txt
Target //mace/proto:mace_py up-to-date:
bazel-genfiles/mace/proto/mace_pb2.py
INFO: Elapsed time: 2.364s, Critical Path: 0.06s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
CMD> cp -f bazel-genfiles/mace/proto/mace_pb2.py tools/python/py_proto
CMD> bazel build //third_party/caffe:caffe_py
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Loading:
Loading: 0 packages loaded
Analyzing: target //third_party/caffe:caffe_py (6 packages loaded)
INFO: Analysed target //third_party/caffe:caffe_py (17 packages loaded).
INFO: Found 1 target...
[0 / 1] [-----] BazelWorkspaceStatusAction stable-status.txt
Target //third_party/caffe:caffe_py up-to-date:
bazel-genfiles/third_party/caffe/caffe_pb2.py
INFO: Elapsed time: 2.337s, Critical Path: 0.05s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
CMD> cp -f bazel-genfiles/third_party/caffe/caffe_pb2.py tools/python/py_proto
* Build //mace/tools:mace_run_static with ABI arm64-v8a
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
WARNING: The major revision of the Android NDK referenced by android_ndk_repository rule 'androidndk' is 19. The major revisions supported by Bazel are [10, 11, 12, 13, 14, 15, 16]. Bazel will attempt to treat the NDK as if it was r16. This may cause compilation and linkage problems. Please download a supported NDK version.
INFO: Analysed target //mace/tools:mace_run_static (32 packages loaded).
INFO: Found 1 target...
Target //mace/tools:mace_run_static up-to-date:
bazel-bin/mace/tools/mace_run_static
INFO: Elapsed time: 11.826s, Critical Path: 0.50s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
('build', '//mace/tools:mace_run_static', '--config', 'android', '--cpu=arm64-v8a', '--define', 'neon=true', '--define', 'openmp=false', '--define', 'opencl=true', '--define', 'quantize=false', '--define', 'hexagon=false', '--define', 'hta=false', '--define', 'apu=false', '--config', 'optimization', '--config', 'symbol_hidden')
Build done!
***********************************************
Run model model on MI9
***********************************************
Generate input file: build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a/model_input_data
Generate input file done.
* Run 'model' with round=1, restart_round=1, tuning=False, out_of_range_check=False, omp_num_threads=(-1,), cpu_affinity_policy=(1,), gpu_perf_hint=(3,), gpu_priority_hint=(3,)
Push build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a/model_input_data to /data/local/tmp/mace_run
Push build/model/model/model.data to /data/local/tmp/mace_run
Push build/model/model/model.pb to /data/local/tmp/mace_run/model.pb
Push build/model/_tmp/arm64-v8a/mace_run_static to /data/local/tmp/mace_run
Push /tmp/cmd_file-model-1584697819.6052978 to /data/local/tmp/mace_run/cmd_file-model-1584697819.6052978
I mace/tools/mace_run.cc:527] model name: model
I mace/tools/mace_run.cc:528] mace version: v0.12.0-0-ga610d50
I mace/tools/mace_run.cc:529] input node: data
I mace/tools/mace_run.cc:530] input shape: 1,3,160,160
I mace/tools/mace_run.cc:531] output node: face_rpn_cls_prob_reshape_stride32,face_rpn_bbox_pred_stride32,face_rpn_landmark_pred_stride32,face_rpn_cls_prob_reshape_stride16,face_rpn_bbox_pred_stride16,face_rpn_landmark_pred_stride16,face_rpn_cls_prob_reshape_stride8,face_rpn_bbox_pred_stride8,face_rpn_landmark_pred_stride8
I mace/tools/mace_run.cc:532] output shape: 1,4,5,5:1,8,5,5:1,20,5,5:1,4,10,10:1,8,10,10:1,20,10,10:1,4,20,20:1,8,20,20:1,20,20,20
I mace/tools/mace_run.cc:533] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/mace_run.cc:534] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/mace_run.cc:535] input dir:
I mace/tools/mace_run.cc:536] output dir:
I mace/tools/mace_run.cc:537] model_data_file: /data/local/tmp/mace_run/model.data
I mace/tools/mace_run.cc:538] model_file: /data/local/tmp/mace_run/model.pb
I mace/tools/mace_run.cc:539] device: GPU
I mace/tools/mace_run.cc:540] round: 1
I mace/tools/mace_run.cc:541] restart_round: 1
I mace/tools/mace_run.cc:542] gpu_perf_hint: 3
I mace/tools/mace_run.cc:543] gpu_priority_hint: 3
I mace/tools/mace_run.cc:544] omp_num_threads: -1
I mace/tools/mace_run.cc:545] cpu_affinity_policy: 1
I mace/tools/mace_run.cc:548] limit_opencl_kernel_time: 0
I mace/tools/mace_run.cc:553] opencl_queue_window_size: 0
I mace/libmace/mace.cc:464] Creating MaceEngine, MACE version: v0.12.0-0-ga610d50
I mace/libmace/mace.cc:503] Initializing MaceEngine
I mace/libmace/mace.cc:636] Destroying MaceEngine
I mace/tools/mace_run.cc:596] restart round 0
W ./mace/utils/tuner.h:201] Failed to read tuned param file: /data/local/tmp/mace_run/model_tuned_opencl_parameter.MI9.msmnile.bin
I mace/libmace/mace.cc:911] Create MaceEngine from model graph proto and weights data
I mace/libmace/mace.cc:464] Creating MaceEngine, MACE version: v0.12.0-0-ga610d50
W mace/core/kv_storage.cc:109] Failed to read kv store file: /data/local/tmp/mace_run/interior//mace_cl_compiled_program.bin
W mace/core/runtime/opencl/opencl_runtime.cc:382] Load OpenCL cached compiled kernel file failed. Please make sure the storage directory exist and you have Write&Read permission
I mace/libmace/mace.cc:503] Initializing MaceEngine
I mace/tools/mace_run.cc:269] Create Mace Engine latency: 884.83 ms
I mace/tools/mace_run.cc:276] Total init latency: 884.993 ms
I mace/tools/mace_run.cc:370] Warm up run
I mace/tools/mace_run.cc:406] 1st warm up run latency: 1360.07 ms
I mace/tools/mace_run.cc:414] Run model
I mace/tools/mace_run.cc:476] Average latency: 11.777 ms
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride32 with size 400 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride32 with size 800 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride32 with size 2000 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride16 with size 1600 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride16 with size 3200 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride16 with size 8000 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride8 with size 6400 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride8 with size 12800 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride8 with size 32000 done.
========================================================
capability(CPU) init warmup run_avg
========================================================
time 19.788 884.993 1360.074 11.777
I mace/libmace/mace.cc:636] Destroying MaceEngine
Running finished!
* Validate with caffe
Pull /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride32 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride32 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride32 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride16 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride16 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride16 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride8 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride8 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride8 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/MI9_msmnile/arm64-v8a
face_rpn_cls_prob_reshape_stride32 MACE VS CAFFE similarity: 0.9999999920146434 , sqnr: 12416689.129010014 , pixel_accuracy: 0.8
******************************************
Similarity Test Passed
******************************************
face_rpn_bbox_pred_stride32 MACE VS CAFFE similarity: 0.9999543587571056 , sqnr: 10949.429334335218 , pixel_accuracy: 1.0
******************************************
Similarity Test Passed
******************************************
face_rpn_landmark_pred_stride32 MACE VS CAFFE similarity: 0.9999657541037306 , sqnr: 13611.449222524909 , pixel_accuracy: 1.0
******************************************
Similarity Test Passed
******************************************
face_rpn_cls_prob_reshape_stride16 MACE VS CAFFE similarity: 0.9999999900582555 , sqnr: 13540919.254218182 , pixel_accuracy: 0.775
******************************************
Similarity Test Passed
******************************************
face_rpn_bbox_pred_stride16 MACE VS CAFFE similarity: 0.9999490543979106 , sqnr: 9740.480377276486 , pixel_accuracy: 0.9625
******************************************
Similarity Test Passed
******************************************
face_rpn_landmark_pred_stride16 MACE VS CAFFE similarity: 0.9999721485332628 , sqnr: 17715.55990542613 , pixel_accuracy: 0.975
******************************************
Similarity Test Passed
******************************************
face_rpn_cls_prob_reshape_stride8 MACE VS CAFFE similarity: 0.9999999887438001 , sqnr: 13026576.736919338 , pixel_accuracy: 0.75
******************************************
Similarity Test Passed
******************************************
face_rpn_bbox_pred_stride8 MACE VS CAFFE similarity: 0.9998798908366177 , sqnr: 3858.796852940529 , pixel_accuracy: 0.95625
******************************************
Similarity Test Passed
******************************************
face_rpn_landmark_pred_stride8 MACE VS CAFFE similarity: 0.9999325048718496 , sqnr: 7404.160247931779 , pixel_accuracy: 0.97
******************************************
Similarity Test Passed
******************************************
Validation done!
Dana service is not available.
Elapse time: 0.439396 minutes.
* Build //mace/tools:mace_run_static with ABI arm64-v8a
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
WARNING: The major revision of the Android NDK referenced by android_ndk_repository rule 'androidndk' is 19. The major revisions supported by Bazel are [10, 11, 12, 13, 14, 15, 16]. Bazel will attempt to treat the NDK as if it was r16. This may cause compilation and linkage problems. Please download a supported NDK version.
INFO: Analysed target //mace/tools:mace_run_static (32 packages loaded).
INFO: Found 1 target...
Target //mace/tools:mace_run_static up-to-date:
bazel-bin/mace/tools/mace_run_static
INFO: Elapsed time: 12.009s, Critical Path: 0.50s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
('build', '//mace/tools:mace_run_static', '--config', 'android', '--cpu=arm64-v8a', '--define', 'neon=true', '--define', 'openmp=false', '--define', 'opencl=true', '--define', 'quantize=false', '--define', 'hexagon=false', '--define', 'hta=false', '--define', 'apu=false', '--config', 'optimization', '--config', 'symbol_hidden')
Build done!
**************************************************
Run model model on POCOF1
**************************************************
Generate input file: build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a/model_input_data
Generate input file done.
* Run 'model' with round=1, restart_round=1, tuning=False, out_of_range_check=False, omp_num_threads=(-1,), cpu_affinity_policy=(1,), gpu_perf_hint=(3,), gpu_priority_hint=(3,)
Push build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a/model_input_data to /data/local/tmp/mace_run
Push build/model/model/model.data to /data/local/tmp/mace_run
Push build/model/model/model.pb to /data/local/tmp/mace_run/model.pb
Push build/model/_tmp/arm64-v8a/mace_run_static to /data/local/tmp/mace_run
Push /tmp/cmd_file-model-1584697859.1764498 to /data/local/tmp/mace_run/cmd_file-model-1584697859.1764498
I mace/tools/mace_run.cc:527] model name: model
I mace/tools/mace_run.cc:528] mace version: v0.12.0-0-ga610d50
I mace/tools/mace_run.cc:529] input node: data
I mace/tools/mace_run.cc:530] input shape: 1,3,160,160
I mace/tools/mace_run.cc:531] output node: face_rpn_cls_prob_reshape_stride32,face_rpn_bbox_pred_stride32,face_rpn_landmark_pred_stride32,face_rpn_cls_prob_reshape_stride16,face_rpn_bbox_pred_stride16,face_rpn_landmark_pred_stride16,face_rpn_cls_prob_reshape_stride8,face_rpn_bbox_pred_stride8,face_rpn_landmark_pred_stride8
I mace/tools/mace_run.cc:532] output shape: 1,4,5,5:1,8,5,5:1,20,5,5:1,4,10,10:1,8,10,10:1,20,10,10:1,4,20,20:1,8,20,20:1,20,20,20
I mace/tools/mace_run.cc:533] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/mace_run.cc:534] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/mace_run.cc:535] input dir:
I mace/tools/mace_run.cc:536] output dir:
I mace/tools/mace_run.cc:537] model_data_file: /data/local/tmp/mace_run/model.data
I mace/tools/mace_run.cc:538] model_file: /data/local/tmp/mace_run/model.pb
I mace/tools/mace_run.cc:539] device: GPU
I mace/tools/mace_run.cc:540] round: 1
I mace/tools/mace_run.cc:541] restart_round: 1
I mace/tools/mace_run.cc:542] gpu_perf_hint: 3
I mace/tools/mace_run.cc:543] gpu_priority_hint: 3
I mace/tools/mace_run.cc:544] omp_num_threads: -1
I mace/tools/mace_run.cc:545] cpu_affinity_policy: 1
I mace/tools/mace_run.cc:548] limit_opencl_kernel_time: 0
I mace/tools/mace_run.cc:553] opencl_queue_window_size: 0
I mace/libmace/mace.cc:464] Creating MaceEngine, MACE version: v0.12.0-0-ga610d50
I mace/libmace/mace.cc:503] Initializing MaceEngine
I mace/libmace/mace.cc:636] Destroying MaceEngine
I mace/tools/mace_run.cc:596] restart round 0
W ./mace/utils/tuner.h:201] Failed to read tuned param file: /data/local/tmp/mace_run/model_tuned_opencl_parameter.POCOF1.sdm845.bin
I mace/libmace/mace.cc:911] Create MaceEngine from model graph proto and weights data
I mace/libmace/mace.cc:464] Creating MaceEngine, MACE version: v0.12.0-0-ga610d50
W mace/core/kv_storage.cc:109] Failed to read kv store file: /data/local/tmp/mace_run/interior//mace_cl_compiled_program.bin
W mace/core/runtime/opencl/opencl_runtime.cc:382] Load OpenCL cached compiled kernel file failed. Please make sure the storage directory exist and you have Write&Read permission
I mace/libmace/mace.cc:503] Initializing MaceEngine
I mace/tools/mace_run.cc:269] Create Mace Engine latency: 1204.28 ms
I mace/tools/mace_run.cc:276] Total init latency: 1204.47 ms
I mace/tools/mace_run.cc:370] Warm up run
I mace/tools/mace_run.cc:406] 1st warm up run latency: 1874.9 ms
I mace/tools/mace_run.cc:414] Run model
I mace/tools/mace_run.cc:476] Average latency: 12.42 ms
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride32 with size 400 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride32 with size 800 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride32 with size 2000 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride16 with size 1600 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride16 with size 3200 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride16 with size 8000 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride8 with size 6400 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride8 with size 12800 done.
I mace/tools/mace_run.cc:491] Write output file /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride8 with size 32000 done.
========================================================
capability(CPU) init warmup run_avg
========================================================
time 21.267 1204.469 1874.897 12.420
I mace/libmace/mace.cc:636] Destroying MaceEngine
Running finished!
* Validate with caffe
Pull /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride32 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride32 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride32 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride16 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride16 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride16 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_cls_prob_reshape_stride8 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_bbox_pred_stride8 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
Pull /data/local/tmp/mace_run/model_out_face_rpn_landmark_pred_stride8 to build/model/_tmp/model/ff8ca0d5edb943c867518c604a0c575d/POCOF1_sdm845/arm64-v8a
face_rpn_cls_prob_reshape_stride32 MACE VS CAFFE similarity: 0.9999999923376746 , sqnr: 12761169.735631926 , pixel_accuracy: 0.75
******************************************
Similarity Test Passed
******************************************
face_rpn_bbox_pred_stride32 MACE VS CAFFE similarity: 0.9999559880321396 , sqnr: 11251.16938898572 , pixel_accuracy: 1.0
******************************************
Similarity Test Passed
******************************************
face_rpn_landmark_pred_stride32 MACE VS CAFFE similarity: 0.9999370739215658 , sqnr: 7828.408858784128 , pixel_accuracy: 0.98
******************************************
Similarity Test Passed
******************************************
face_rpn_cls_prob_reshape_stride16 MACE VS CAFFE similarity: 0.9999999903230272 , sqnr: 13382509.571867507 , pixel_accuracy: 0.75
******************************************
Similarity Test Passed
******************************************
face_rpn_bbox_pred_stride16 MACE VS CAFFE similarity: 0.9999496464497996 , sqnr: 9698.99300525273 , pixel_accuracy: 0.9625
******************************************
Similarity Test Passed
******************************************
face_rpn_landmark_pred_stride16 MACE VS CAFFE similarity: 0.9999721620823648 , sqnr: 17802.002994858874 , pixel_accuracy: 0.99
******************************************
Similarity Test Passed
******************************************
face_rpn_cls_prob_reshape_stride8 MACE VS CAFFE similarity: 0.9999999887525579 , sqnr: 12572009.393454924 , pixel_accuracy: 0.6875
******************************************
Similarity Test Passed
******************************************
face_rpn_bbox_pred_stride8 MACE VS CAFFE similarity: 0.999877265839963 , sqnr: 3695.563582568223 , pixel_accuracy: 0.91875
******************************************
Similarity Test Passed
******************************************
face_rpn_landmark_pred_stride8 MACE VS CAFFE similarity: 0.9999272805682087 , sqnr: 6862.242260144659 , pixel_accuracy: 0.95
******************************************
Similarity Test Passed
******************************************
Validation done!
Dana service is not available.
Elapse time: 0.444397 minutes.
* Package libs for model
Start packaging 'model' libs into build/model/libmace_model.tar.gz
build/model/model/
build/model/model/gpu/
build/model/model/model.data
build/model/model/model.pb
Packaging Done!
--------------------------------------------------------------
Library
--------------------------------------------------------------
| key | value |
==============================================================
| MACE Model package Path| build/model/libmace_model.tar.gz|
--------------------------------------------------------------
The code isn't complete yet. I'm not very familiar with the MACE project classes, and I don't know how to determine whether a model requires this data layout transformation. Namely, how can I get the model data format (NCHW, NHWC) from within mace/ops/opencl/image/reshape.cc, so that the transformation is applied only when needed?
One additional point concerns performance. As expected, the data layout transformation adds some time to inference. Is there a better-performing way to implement it?
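One cheap optimization that might be worth considering, sketched here in plain Python rather than in MACE code: when a tensor has a single channel or a single spatial position, the NHWC and NCHW flat offsets are identical, so the transformation would move nothing. The function names below are hypothetical, and the condition is only a sufficient one for skipping the transformation; it says nothing about how to obtain the shapes or the data format inside reshape.cc.

```python
def layouts_coincide(c, h, w):
    """True when NHWC and NCHW give every element the same flat offset."""
    return c == 1 or h * w == 1


def layouts_coincide_bruteforce(c, h, w):
    """Check the same property by comparing the offsets one by one."""
    return all(
        (hi * w + wi) * c + ci == (ci * h + hi) * w + wi
        for ci in range(c)
        for hi in range(h)
        for wi in range(w)
    )


def can_skip_transform(in_shape_nchw, out_shape_nchw):
    """Skipping is safe if neither side of the reshape depends on layout."""
    _, c1, h1, w1 = in_shape_nchw
    _, c2, h2, w2 = out_shape_nchw
    return layouts_coincide(c1, h1, w1) and layouts_coincide(c2, h2, w2)
```

For example, `can_skip_transform((1, 8, 1, 1), (1, 1, 4, 2))` is true, whereas a reshape such as `(1, 4, 5, 5)` to `(1, 2, 10, 5)` still needs the full transformation.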
@lu229 I hope this helps. Please tag anyone else who might be interested in contributing.